Introduction
This is the central place for hyphenation patterns in TeX. They are all bundled in a single package called hyph-utf8.
For pattern authors
If you are a pattern author and wish to update your patterns, please contact the hyph-utf8 package maintainers through the tex-hyphen mailing list.
Documentation
Algorithm
Papers
- Documentation (needs improvement)
- Documentation for the Lua(La)TeX part of the package
- TUG 2008 paper
- TUG 2016 paper
- The latest public hyphenation exception log from TUGboat, volume 39 (2018), no. 2
Slides
- TeX hyphenation applied to HTML (Mathias Nater, BachoTeX 2010)
Related packages
- Babel – for pdfTeX and other 8-bit TeX engines
- Polyglossia – for XeTeX and LuaTeX
Links
Collaboration
- Mozilla
- FOP XML Hyphenation patterns (Simon Pepping)
- TeX-Hyphen-Pattern (Perl implementation on CPAN) (Roland von Ipenburg)
- Hyphenator.js (Client-side implementation of hyphenation in HTML documents) (Mathias Nater)
OpenOffice.org
- Test TeX/OpenOffice hyphenation algorithm online (based on hunspell)
- Using TeX hyphenation patterns in OpenOffice.org (explains how to properly convert TeX patterns into OpenOffice-friendly form)
Other external links
- Hunspell (library)
- Open Office language extensions
- text-hyphen (rubyforge); (source code repository)
- TeX Hyphenator in Java
- Knuth-Liang Hyphenation for Haskell
- Knuth-Liang’s original, and László’s extended hyphenation for Rust
- WordPress wp-Typography plugin, with hyphenation as a central feature
- Indic languages:
- An article about the soft hyphen
- TeX line breaking algorithm in JavaScript
Languages
The list of supported languages is in the table below. Note that German and Spanish have additional documentation in a separate file.
(If patterns for any other language exist but are missing from the table, please let us know.)
| language | name, synonyms | BCP 47 tag (link to file) | (left, right) hyphenmin | 8-bit encoding | licence | authors |
|---|---|---|---|---|---|---|
| Afrikaans | afrikaans | af | (1, 2) | EC | LPPL | Tilla Fick, Chris Swanepoel |
| Ancient Greek | ancientgreek | grc | (1, 1) | | LPPL | Dimitrios Filippou |
| | ibycus | grc-x-ibycus | (2, 2) | | custom | Peter Heslin |
| Arabic | arabic | ar | (, ) | | MIT | |
| Armenian | armenian | hy | (1, 2) | | LGPL | Sahak Petrosyan |
| Assamese | assamese | as | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Basque | basque | eu | (2, 2) | EC | custom | Juan M. Aguirregabiria |
| Belarusian | belarusian | be | (2, 2) | T2A | MIT | Maksim Salau |
| Bengali | bengali | bn | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Bulgarian | bulgarian | bg | (2, 2) | T2A | custom | Anton Zinoviev |
| Catalan | catalan | ca | (2, 2) | EC | LPPL | Gonçal Badenes, Francina Turon |
| Chinese | pinyin | zh-latn-pinyin | (1, 1) | EC | GPL | Werner Lemberg |
| Church Slavic | churchslavonic | cu | (1, 2) | | MIT | Aleksandr Andreev, Mike Kroutikov |
| Coptic | coptic | cop | (1, 1) | | MIT | Claudio Beccari |
| Croatian | croatian | hr | (2, 2) | EC | LPPL, custom | Igor Marinović |
| Czech | czech | cs | (2, 3) | EC | GPL | Pavel Ševeček |
| Danish | danish | da | (2, 2) | EC | LPPL, MIT | Frank Jensen |
| Dutch | dutch | nl | (2, 2) | EC | LPPL | Piet Tutelaers |
| English | ukenglish, british, UKenglish | en-gb | (2, 3) | ASCII | MIT | Dominik Wujastyk, Graham Toal |
| | usenglishmax | en-us | (2, 3) | ASCII | custom | Gerard D.C. Kuiken |
| Esperanto | esperanto | eo | (2, 2) | IL3 | LPPL | Sergei B. Pokrovsky |
| Estonian | estonian | et | (2, 3) | EC | MIT, LPPL | Enn Saar |
| Ethiopic | ethiopic, amharic, geez | mul-ethi | (1, 1) | | MIT, custom, custom | Mojca Miklavec |
| Finnish | finnish | fi | (2, 2) | EC | custom | Kauko Saarinen |
| French | french, patois, francais | fr | (2, 2) | EC | MIT | Daniel Flipo, Bernard Gaulle, Arthur Reutenauer |
| Friulian | friulan | fur | (2, 2) | EC | MIT, LPPL | Claudio Beccari |
| Galician | galician | gl | (2, 2) | EC | LPPL | Javier A. Múgica |
| Georgian | georgian | ka | (1, 2) | T8M | LPPL | Levan Shoshiashvili |
| German | german | de-1901 | (2, 2) | EC | MIT | Deutschsprachige Trennmustermannschaft |
| | ngerman | de-1996 | (2, 2) | EC | MIT | Deutschsprachige Trennmustermannschaft |
| | swissgerman | de-ch-1901 | (2, 2) | EC | MIT | Deutschsprachige Trennmustermannschaft |
| Gujarati | gujarati | gu | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Hindi | hindi | hi | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Hungarian | hungarian | hu | (2, 2) | EC | MPL, GPL, LGPL | Bence Nagy |
| Icelandic | icelandic | is | (2, 2) | EC | LPPL | Jörgen Pind |
| Interlingua | interlingua | ia | (2, 2) | ASCII | LPPL | Peter Kleiweg |
| Irish | irish | ga | (2, 3) | EC | GPL, MIT | Kevin P. Scannell |
| Italian | italian | it | (2, 2) | ASCII | LPPL, MIT | Claudio Beccari |
| Kannada | kannada | kn | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Kurdish | kurmanji | kmr | (2, 2) | EC | LPPL | Jörg Knappen, Medeni Shemdê |
| Latin | latin | la | (2, 2) | EC | MIT, LPPL | Claudio Beccari |
| | classiclatin | la-x-classic | (2, 2) | ASCII | MIT, LPPL | Claudio Beccari |
| | liturgicallatin | la-x-liturgic | (2, 2) | EC | MIT | Claudio Beccari, Monastery of Solesmes, Élie Roux |
| Latvian | latvian | lv | (2, 2) | L7X | LGPL, GPL | Janis Vilims |
| Lithuanian | lithuanian | lt | (2, 2) | L7X | MIT | Vytas Statulevičius, Sigitas Tolušis, Yannis Haralambous |
| Malay | indonesian | id | (2, 2) | ASCII | GPL | Jörg Knappen, Terry Mart |
| Malayalam | malayalam | ml | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Marathi | marathi | mr | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Modern Greek | monogreek | el-monoton | (1, 1) | | LPPL | Dimitrios Filippou |
| | greek, polygreek | el-polyton | (1, 1) | | LPPL | Dimitrios Filippou |
| Mongolian | mongolian | mn-cyrl | (2, 2) | T2A | LPPL, MIT | Dorjgotov Batmunkh |
| | mongolianlmc | mn-cyrl-x-lmc | (2, 2) | LMC | custom | Oliver Corff, Dorjpalam Dorj |
| Norwegian | bokmal, norwegian, norsk | nb | (2, 2) | EC | custom | Rune Kleveland, Ole Michael Selberg, Karl Ove Hufthammer |
| | nynorsk | nn | (2, 2) | EC | custom | Karl Ove Hufthammer, Rune Kleveland, Ole Michael Selberg |
| | | no | (2, 2) | | custom | Rune Kleveland, Ole Michael Selberg |
| Occitan | occitan | oc | (2, 2) | EC | MIT, LPPL | Claudio Beccari |
| Oriya | oriya | or | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Pali | pali | pi | (1, 2) | | MIT | Wie-Ming Cittānurakkho Bhikkhu |
| Panjabi | panjabi | pa | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Persian | farsi, persian | fa | (, ) | | MIT | |
| Piemontese | piedmontese | pms | (2, 2) | ASCII | MIT, LPPL | Claudio Beccari |
| Polish | polish | pl | (2, 2) | QX | MIT, custom | Hanna Kołodziejska, Bogusław Jackowski, Marek Ryćko |
| Portuguese | portuguese, portuges | pt | (2, 3) | EC | BSD 3-clause licence | Pedro J. de Rezende, J. Joao Dias Almeida |
| Romanian | romanian | ro | (2, 2) | EC | custom | Adrian Rezus |
| Romansh | romansh | rm | (2, 2) | ASCII | MIT, LPPL | Claudio Beccari |
| Russian | russian | ru | (2, 2) | T2A | LPPL | Alexander I. Lebedev |
| Sanskrit | sanskrit | sa | (1, 3) | | custom | Yves Codet |
| Serbian | serbianc | sh-cyrl | (2, 2) | T2A | LPPL | Dejan Muhamedagić |
| | serbian | sh-latn | (2, 2) | EC | LPPL | Dejan Muhamedagić |
| Slovak | slovak | sk | (2, 3) | EC | GPL | Jana Chlebíková |
| Slovenian | slovenian, slovene | sl | (2, 2) | EC | LPPL, MIT | Matjaž Vrečko |
| Spanish | spanish, espanol | es | (2, 2) | EC | MIT/X11 | Javier Bezos |
| Swedish | swedish | sv | (2, 2) | EC | LPPL | Jan Michael Rynning |
| Tamil | tamil | ta | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Telugu | telugu | te | (1, 1) | | MIT, LGPL, GPL | Santhosh Thottingal |
| Thai | thai | th | (2, 3) | LTH | LPPL | Theppitak Karoonboonyanan |
| Turkish | turkish | tr | (2, 2) | EC | LPPL | Pierre A. MacKay, H. Turgut Uyar, S. Ekin Kocabas, Mojca Miklavec |
| Turkmen | turkmen | tk | (2, 2) | EC | MIT | Nazar Annagurban |
| Ukrainian | ukrainian | uk | (2, 2) | T2A | LPPL | Maksym Polyakov |
| Upper Sorbian | uppersorbian | hsb | (2, 2) | EC | LPPL | Eduard Werner |
| Welsh | welsh | cy | (2, 3) | EC | LPPL, MIT | Yannis Haralambous |