Difference between revisions of "Indic Scripts"

 
(19 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
= Fonts =
 
= Fonts =
  
Fonts are complicated. Fonts for indic languages have to provide for rules for the formation of several complicated conjuncts. Naturally, there are many  
+
Fonts are complicated. Moreover, those for indic languages have to provide for rules for the formation of several complicated conjuncts. Each of these conjuncts can consist of several forms occurring simultaneously, which have to be positioned correctly relative to the base glyph. OTF documentation notwithstanding, font designers have their own interpretations of the specifications leading to a variety of implementations of the font features. Most of the available fonts are tested against Harfbuzz and/or ICU (sometimes only the former). Since ConTeXt uses its own OTF loading system, many indic fonts do not just work right away.
  
The following fonts have been tested for use with ConTeXt:
+
In early 2022, Hans made some nice improvements to the indic font system in ConTeXt. This was accompanied by some testing with various available fonts for some indic languages. As a result, there is an improved support for Indic fonts in ConTeXt. Various typescripts were then bundled into the ConTeXt distribution for easy use in documents.
  
{|class="wikitable"
+
The following table lists fonts that have been tested for use with ConTeXt. The list is by no means extensive: most of the fonts are relatively new and some (ubiquitous) old fonts are absent. If you find a font missing in this list and that works well in ConTeXt, please add it to the list. To use indic fonts place
! Sans  
+
<texcode>
! Serif
+
\usetypescriptfile[indic]
! Notes
+
</texcode>
|-
+
in the document head and use {{cmd|definetypeface}} and/or {{cmd|setupbodyfont}} with the following typescripts:
! colspan="3" | Devanagari
+
 
 +
{|cellpadding="5"
 +
|-style="background-color:#e1effa;"
 +
! style="width:17.5%;" | Sans  
 +
! style="width:17.5%;" | Serif
 +
! style="width:35%;" | Notes
 +
! Typescript(s)
 +
|-style="background-color:#fef6e7;"
 +
! colspan="4" | Devanagari
 
|-
 
|-
 
|  
 
|  
 
| [https://adishila.com/fonts/ Adishila]
 
| [https://adishila.com/fonts/ Adishila]
| 4 different designs; many weights; good conjunct coverage; IAST support
+
| 4 different designs; many weights and styles; good conjunct coverage; IAST support
 +
| <code>adishila</code></br>
 +
<code>adishila-semibold</code></br>
 +
<code>adishila-heavy</code></br>
 +
<code>adishila-dev</code></br>
 +
<code>adishila-dev-guru</code></br>
 +
<code>adishila-san</code></br>
 +
<code>adishila-san-letterpress</code></br>
 
|-
 
|-
 
|  
 
|  
 
| [https://github.com/Sandhi-IITBombay/Shobhika Shobhika]
 
| [https://github.com/Sandhi-IITBombay/Shobhika Shobhika]
 
| two weights; good conjunct coverage; IAST support; some maths support
 
| two weights; good conjunct coverage; IAST support; some maths support
 +
| <code>shobhika</code>
 
|-
 
|-
 
| [https://github.com/EkType/Baloo2 Baloo]
 
| [https://github.com/EkType/Baloo2 Baloo]
 
|  
 
|  
 
| five weights
 
| five weights
 +
| <code>baloo</code></br>
 +
<code>baloo-semibold</code></br>
 +
<code>baloo-extrabold</code></br>
 
|-
 
|-
 
|
 
|
 
| [https://github.com/etunni/Amita Amita]
 
| [https://github.com/etunni/Amita Amita]
 
| calligraphic style  
 
| calligraphic style  
 +
| <code>amita</code>
 
|-
 
|-
 
|
 
|
 
| [https://github.com/EkType/Jaini Jaini, Jaini Purva]
 
| [https://github.com/EkType/Jaini Jaini, Jaini Purva]
 
| fonts with calligraphic style commonly found in Jaina kalpasūtra manuscripts
 
| fonts with calligraphic style commonly found in Jaina kalpasūtra manuscripts
|-
+
| <code>jaini</code></br>
! colspan="3" | Malayalam
+
<code>jaini-purva</code></br>
 +
|-style="background-color:#fef6e7;"
 +
! colspan="4" | Malayalam
 
|-
 
|-
 
|  
 
|  
 
| [https://rachana.org.in/ RIT Rachana]  
 
| [https://rachana.org.in/ RIT Rachana]  
| good conjunct coverage; four weights
+
| an elegant font good conjunct coverage; two weights; italic style; one of the most complete fonts available
 +
| <code>rit-rachana</code>
 
|-
 
|-
 
|
 
|
 
| [https://rachana.org.in/ Panmana]
 
| [https://rachana.org.in/ Panmana]
 
| good conjunct coverage; single weight; body-text font
 
| good conjunct coverage; single weight; body-text font
 +
| <code>panmana</code>
 
|-
 
|-
 
| [https://rachana.org.in/ Ezhuthu]
 
| [https://rachana.org.in/ Ezhuthu]
 
|  
 
|  
 
| handwriting font; single weight
 
| handwriting font; single weight
 +
| <code>ezhuthu</code>
 
|-
 
|-
 
|
 
|
 
| [https://rachana.org.in/ RIT Sundar]
 
| [https://rachana.org.in/ RIT Sundar]
 
| Single weight
 
| Single weight
 +
| <code>rit-sundar</code>
 
|-
 
|-
 
| [https://rachana.org.in/ TN Joy]
 
| [https://rachana.org.in/ TN Joy]
 
|  
 
|  
 
| three weights
 
| three weights
 +
| <code>tn-joy</code>
 
|-
 
|-
 
| [https://smc.org.in/fonts/manjari/ Manjari]
 
| [https://smc.org.in/fonts/manjari/ Manjari]
 
|  
 
|  
| Elegant handwriting font; curves designed using Raph Levien's spiral library for Inkscape; suitable for body and titles; three weights
+
| curvy handwriting font; suitable for body and titles; three weights
 +
| <code>manjari</code>
 
|-
 
|-
 
| [https://smc.org.in/fonts/gayathri/ Gayathri]
 
| [https://smc.org.in/fonts/gayathri/ Gayathri]
 
|  
 
|  
 
| three weights
 
| three weights
 +
| <code>gayathri</code>
 
|-
 
|-
 
| [https://smc.org.in/fonts/anjalioldlipi Anjali Old Lipi]
 
| [https://smc.org.in/fonts/anjalioldlipi Anjali Old Lipi]
 
|
 
|
| legible font intended for body text; comprehensive font with glyphs for common Malayalam ligatures and Latin character set
+
| legible font intended for body text; glyphs for common Malayalam ligatures & Latin charset
 +
| <code>anjali-old-lipi</code>
 
|-
 
|-
 
| [https://smc.org.in/fonts/chilanka Chilanka]
 
| [https://smc.org.in/fonts/chilanka Chilanka]
 
|  
 
|  
| handwriting style font; contains most of the unique Malayalam conjuncts; glyph strokes are of uniform width with round ends avoiding sharp corners
+
| handwriting style font with most of the unique Malayalam conjuncts; uniform width glyph strikes with round ends
 +
| <code>chilanka</code>
 
|-
 
|-
 
| [https://smc.org.in/fonts/dyuthi Dyuthi]
 
| [https://smc.org.in/fonts/dyuthi Dyuthi]
 
|  
 
|  
| an ornamental typeface that supports Latin and Malayalam; Malayalam glyphs are based on popular 'bulged ended' type designs; single size – thicker than usual Malayalam fonts; suited for titling and text-h5s
+
| an ornamental typeface; Latin and Malayalam; Malayalam glyphs are based on popular 'bulged ended' type designs; single size – thicker than usual Malayalam fonts; suited for titles
 +
| <code>dyuthi</code>
 
|-
 
|-
 
| [https://smc.org.in/fonts/karumbi Karumbi]
 
| [https://smc.org.in/fonts/karumbi Karumbi]
 
|
 
|
| handwriting traditional script font; casual style; individually designed glyphs for complex conjuncts
+
| handwriting traditional script font; casual style
 +
| <code>karumbi</code>
 
|-
 
|-
 
| [https://github.com/EkType/Baloo2 Baloo Chettan 2]
 
| [https://github.com/EkType/Baloo2 Baloo Chettan 2]
 
|
 
|
 
| five weights
 
| five weights
|-
+
| <code>baloo-chettan</code>
! colspan="3" | Telugu
+
<code>baloo-chettan-semibold</code></br>
 +
<code>baloo-chettan-extrabold</code>
 +
|-style="background-color:#fef6e7;"
 +
! colspan="4" | Telugu
 
|-
 
|-
 
| [https://github.com/EkType/Baloo2 Baloo Tammudu 2]
 
| [https://github.com/EkType/Baloo2 Baloo Tammudu 2]
 
|  
 
|  
 
| five weights
 
| five weights
 +
| <code>baloo-tammudu</code>
 +
<code>baloo-tammudu-semibold</code></br>
 +
<code>baloo-tammudu-extrabold</code>
 
|-
 
|-
 
|  
 
|  
 
| [https://www.murtylibrary.com/mcli-fonts.php Murty Telugu]
 
| [https://www.murtylibrary.com/mcli-fonts.php Murty Telugu]
 
| good conjunct coverage; single-weight; consult license for terms of use  
 
| good conjunct coverage; single-weight; consult license for terms of use  
|-
+
| <code>murty-telugu</code>
! colspan="3" | Kannada
+
|-style="background-color:#fef6e7;"
|-
+
! colspan="4" | Kannada
 
|-
 
|-
 
|  
 
|  
 
| [https://www.murtylibrary.com/mcli-fonts.php Murty Kannada]
 
| [https://www.murtylibrary.com/mcli-fonts.php Murty Kannada]
 
| good conjunct coverage; single-weight; consult license for terms of use  
 
| good conjunct coverage; single-weight; consult license for terms of use  
|-
+
| <code>murty-kannada</code>
! colspan="3" | Tamil
+
|-style="background-color:#fef6e7;"
 +
! colspan="4" | Tamil
 
|-
 
|-
 
| [https://github.com/EkType/Baloo2 Baloo Thambi 2]
 
| [https://github.com/EkType/Baloo2 Baloo Thambi 2]
 
|
 
|
 
| five weights
 
| five weights
|-
+
| <code>baloo-thambi</code>
! colspan="3" | Bengali
+
|-style="background-color:#fef6e7;"
 +
! colspan="4" | Bengali
 
|-
 
|-
 
| [https://github.com/EkType/Baloo2 Baloo Da 2]
 
| [https://github.com/EkType/Baloo2 Baloo Da 2]
 
|
 
|
 
| five weights
 
| five weights
 +
| <code>baloo-da</code>
 +
<code>baloo-da-semibold</code></br>
 +
<code>baloo-da-extrabold</code>
 
|-
 
|-
 
|  
 
|  
 
| [https://www.murtylibrary.com/mcli-fonts.php Murty Bangla]
 
| [https://www.murtylibrary.com/mcli-fonts.php Murty Bangla]
 
| good conjunct coverage; single-weight; consult license for terms of use  
 
| good conjunct coverage; single-weight; consult license for terms of use  
|-
+
| <code>murty-bangla</code>
! colspan="3" | Gujarati
+
|-style="background-color:#fef6e7;"
 +
! colspan="4" | Gujarati
 
|-
 
|-
 
| [https://github.com/EkType/Baloo2 Baloo Bhai]
 
| [https://github.com/EkType/Baloo2 Baloo Bhai]
 
|
 
|
 
| five weights
 
| five weights
 +
| <code>baloo-bhai</code></br>
 +
<code>baloo-bhai-semibold</code></br>
 +
<code>baloo-bhai-extrabold</code>
 
|-
 
|-
 
|
 
|
 
| [https://fonts.google.com/noto Noto Serif Gujarati]
 
| [https://fonts.google.com/noto Noto Serif Gujarati]
|  
+
| font from Google
 +
| <code>noto-serif-gujarati</code>
 
|}
 
|}
  
= Basic Sample =
+
= Supported Scripts and Font Features =
  
A very basic sample with Indic scripts is the following:
+
One can of course use fonts not listed above. This section provides some details to this end.
  
<texcode>
+
== Font features ==
%\definefontfamily [kannada] [rm] [Kedage] [features=kannada-one]
 
\definefontfamily [kannada] [ss] [Tunga] [features=kannada-one]
 
  
\definetypeface [kannada] [mm] [math] [modern]
+
The OTF specification has two shaping implementations for the indic scripts: the 'older' v1 and the 'newer' v2. [https://docs.microsoft.com/en-us/typography/script-development/devanagari See here] for further details.
  
\setupbodyfont [kannada]
+
[https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags Script tags from the OpenType specification] contains second versions for some Indic scripts.
  
\starttext
+
Why are those second versions available? From their own explanation:
ಇದು ಹೇಗಿದೆ? ನಾನು ಹೀಗೆ ತುಂಬ ಬರೆಯಬೇಕೆಂದು ಯೋಚಿಸುತ್ತಿದ್ದೇನೆ.
 
\stoptext
 
</texcode>
 
  
+
<blockquote>
= Supported Scripts =
+
The OpenType script tags can also correlate with a particular OpenType Layout implementation, with the result that more than one script tag may be registered for a given Unicode script (e.g. 'deva' and 'dev2').
 +
</blockquote>
  
The list of Indic scripts supported by ConTeXt MkIV and LMTX are:
+
Features ending in <code>-one</code> use the older OpenType implementation, while the ones ending in <code>-two</code> deploy the newer implementation.
  
* Devanagari
+
These are specified in ConTeXt by the following font features:
* Bengali
 
* Gujarati
 
* Gurmukhi
 
* Kannada
 
* Malayalam
 
* Oriya
 
* Tamil
 
* Telugu
 
  
In order to get the proper OpenType features, you need to select the proper feature from the following list:
+
{| cellpadding="5"
 +
! style="text-align:left;" | Script
 +
! style="text-align:left;" | OTF v1 script tag
 +
! style="text-align:left;" | OTF v2 script tag
 +
|-
 +
| Devanagari
 +
| <code>devanagari-one</code>
 +
| <code>devanagari-two</code>
 +
|-
 +
| Malayalam
 +
| <code>malayalam-one</code>
 +
| <code>malayalam-two</code>
 +
|-
 +
| Telugu
 +
| <code>telugu-one</code>
 +
| <code>telugu-two</code>
 +
|-
 +
| Kannada
 +
| <code>kannada-one</code>
 +
| <code>kannada-two</code>
 +
|-
 +
| Bengali
 +
| <code>bengali-one</code>
 +
| <code>bengali-two</code>
 +
|-
 +
|}
  
* <code>devanagari-one</code>
+
Please note that these font features also activate other font features as mandated in the OTF specification. These features can then be used to write typescripts for a font or to use the font directly in documents {{cmd|definefontfamily}} and/or {{cmd|definedfont}}.
* <code>bengali-one</code>
 
* <code>gujarati-one</code>
 
* <code>gurmukhi-one</code>
 
* <code>kannada-one</code>
 
* <code>malayalam-one</code>
 
* <code>oriya-one</code>
 
* <code>tamil-one</code>
 
* <code>telugu-one</code>
 
  
Depending on your font, you might need instead:
+
One of the common problems one might encounter with indic fonts is that of incorrect rendering of conjuncts involving the rakaar. In case any problems are encountered, one can try setting the <code>indic</code> feature (in addition to relevant <code>-one</code> or <code>-two</code> features above) appropriately as follows:
 +
<texcode>
 +
\definefontfeature
 +
    […]
 +
    […]
 +
    [indic={matra=auto,conjuncts=quit}]
 +
</texcode>
  
* <code>devanagari-two</code>
+
== Sanitizer ==
* <code>bengali-two</code>
+
Sometimes, some fonts might still have issues with certain conjuncts. To overcome this a <code>sanitizer</code> option may be used in defining font features. A goodies file accompanies this option. An example is illustrated below:
* <code>gujarati-two</code>
 
* <code>gurmukhi-two</code>
 
* <code>kannada-two</code>
 
* <code>malayalam-two</code>
 
* <code>oriya-two</code>
 
* <code>tamil-two</code>
 
* <code>telugu-two</code>
 
  
== Script Versions ==
+
<texcode>
 +
return {
 +
    name = "myfont",
 +
    version = "1.00",
 +
    comment = "Goodies that complement myfont.",
 +
    sanitizers = {
 +
        dev2rkrf  = {
 +
            mapping = {
 +
                ["के्र"] = "क्रे",
 +
                ["कै्र"] = "क्रै",
 +
                ["खे्र"] = "ख्रे",
 +
                ["खै्र"] = "ख्रै",
 +
                ["गे्र"] = "ग्रे",
 +
                ["गै्र"] = "ग्रै",
 +
                ["घे्र"] = "घ्रे",
 +
                ["घै्र"] = "घ्रै",
 +
                ["चे्र"] = "च्रे",
 +
                ["चै्र"] = "च्रै",
 +
                ["छे्र"] = "छ्रे",
 +
                ["छै्र"] = "छ्रै",
 +
                ["जे्र"] = "ज्रे",
 +
                ["जै्र"] = "ज्रै",
 +
                ["झे्र"] = "झ्रे",
 +
                ["झै्र"] = "झ्रै",
 +
                ["ञे्र"] = "ञ्रे",
 +
                ["ञै्र"] = "ञ्रै",
 +
                ["णे्र"] = "ण्रे",
 +
                ["णै्र"] = "ण्रै",
 +
                ["ते्र"] = "त्रे",
 +
                ["तै्र"] = "त्रै",
 +
                ["थे्र"] = "थ्रे",
 +
                ["थै्र"] = "थ्रै",
 +
                ["दे्र"] = "द्रे",
 +
                ["दै्र"] = "द्रै",
 +
                ["धे्र"] = "ध्रे",
 +
                ["धै्र"] = "ध्रै",
 +
                ["ने्र"] = "न्रे",
 +
                ["नै्र"] = "न्रै",
 +
                ["पे्र"] = "प्रे",
 +
                ["पै्र"] = "प्रै",
 +
                ["फे्र"] = "फ्रे",
 +
                ["फै्र"] = "फ्रै",
 +
                ["बे्र"] = "ब्रे",
 +
                ["बै्र"] = "ब्रै",
 +
                ["भे्र"] = "भ्रे",
 +
                ["भै्र"] = "भ्रै",
 +
                ["मे्र"] = "म्रे",
 +
                ["मै्र"] = "म्रै",
 +
                ["ये्र"] = "य्रे",
 +
                ["यै्र"] = "य्रै",
 +
                ["वे्र"] = "व्रे",
 +
                ["वै्र"] = "व्रै",
 +
                ["से्र"] = "स्रे",
 +
                ["सै्र"] = "स्रै",
 +
                ["शे्र"] = "श्रे",
 +
                ["शै्र"] = "श्रै",
 +
                ["षे्र"] = "ष्रे",
 +
                ["षै्र"] = "ष्रै",
 +
                ["हे्र"] = "ह्रे",
 +
                ["है्र"] = "ह्रै",
 +
            }
 +
        }
 +
    }
 +
}
 +
</texcode>
  
[https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags Script tags from the OpenType specification] contains second versions for what might be some (or all [I’m afraid I don’t know]) Indic scripts.
+
Suppose that for a certain font (say <code>myfont</code>), using <code>devanagari-two</code> features,  the above listed ra + consonant + vowel forms are not rendered properly. The above goodies file is then saved as <code>myfont.lfg</code> and used while defining the features thus:
 +
<texcode>
 +
\definefontfeature
 +
    [myfontfeatures]
 +
    [devanagari-two]
 +
    [goodies=myfont.lfg,
 +
    sanitizer=dev2rkrf,
 +
    indic={movematra=auto,conjuncts=quit}]
 +
</texcode>
 +
Now, <code>myfontfeatures</code> can be used with {{cmd|definedfont}} and/or while writing typescripts for the font.
  
Why are those second versions available? From their own explanation:
+
= Script and language features =
 
 
<blockquote>
 
The OpenType script tags can also correlate with a particular OpenType Layout implementation, with the result that more than one script tag may be registered for a given Unicode script (e.g. 'deva' and 'dev2').
 
</blockquote>
 
 
 
Features ending in <code>-one</code> use the older OpenType implementation, while the ones ending in <code>-two</code> deploy the newer implementation.
 
 
 
= Hyphenation =
 
  
 
The hyphenation patterns for the following languages are included in ConTeXt:
 
The hyphenation patterns for the following languages are included in ConTeXt:
* Sanskrit <code>sa</code>
+
{|cellpadding="5"
* Hindi <code>hi</code>
+
! style="text-align:left;" | Script/Language
* Kannada <code>kn</code>
+
! style="text-align:left;" | Conversion set
* Telugu <code>te</code>
+
! style="text-align:left;" | Hyphenation
* Tamil <code>ta</code>
+
|-
* Malayalam <code>ml</code>
+
| Devanagari
* Bengali <code>bn</code>
+
| <code>devanagarinumerals</code>
* Gujarati <code>gr</code>
+
|
 
+
|-
A pattern is activated with {{cmd|language}}. The Sanskrit hyphenation patterns support hyphenation of Sanskrit written using the  Malayalam, Telugu, Kannada, Bengali and Latin with IAST.
+
| Malayalam
 +
| <code>malayalamnumerals</code>
 +
| <code>\language[ml]</code>
 +
|-
 +
| Kannada
 +
| <code>kannadanumerals</code>
 +
| <code>\language[kn]</code>
 +
|-
 +
| Telugu
 +
| <code>telugunumerals</code>
 +
| <code>\language[te]</code>
 +
|-
 +
| Bengali
 +
| <code>bengalinumerals</code>
 +
| <code>\language[bn]</code>
 +
|-
 +
| Tamil
 +
| <code>tamilnumerals</code>
 +
| <code>\language[ta]</code>
 +
|-
 +
| Gujarati
 +
| <code>gujaratinumerals</code>
 +
| <code>\language[gu]</code>
 +
|-
 +
| Gurmukhi
 +
| <code>gurmukhinumerals</code>
 +
|
 +
|-
 +
| Hindi
 +
| <code>devanagarinumerals</code>
 +
| <code>\language[hi]</code>
 +
|-
 +
| Sanskrit
 +
| <code>devanagarinumerals</code>
 +
| <code>\language[sa]</code>
 +
|}
  
= Numbers and Conversion sets =
+
A pattern is activated with {{cmd|language}}. The Sanskrit hyphenation patterns <code>sa</code> support hyphenation of Sanskrit written using the  Malayalam, Telugu, Kannada, Bengali and Latin with IAST. Conversion sets are used as values of the keys  <code>numberconversion</code>, <code>conversion</code> (wherever applicable) and with {{cmd|convertnumber}}.
The following number conversion sets are available:
 
* Devanagari <code>devanagarinumerals</code>
 
* Malayalam <code>malayalamnumerals</code>
 
* Tamil <code>tamilnumerals</code>
 
* Kannada <code>kannadanumerals</code>
 
* Telugu <code>telugunumerals</code>
 
* Bengali <code>bengalinumerals</code>
 
for use as values of the keys  <code>numberconversion</code>, <code>conversion</code> and with {{cmd|convertnumber}}
 
  
 
= Sanskrit Transliteration =
 
= Sanskrit Transliteration =
 
Transliteration of '''Sanskrit''' from IAST to Devanagari and vice-versa as well as from and to other Indic languages is available in ConTeXt. The following transliteration schemes are supported with more planned:
 
Transliteration of '''Sanskrit''' from IAST to Devanagari and vice-versa as well as from and to other Indic languages is available in ConTeXt. The following transliteration schemes are supported with more planned:
 
{|cellpadding="4"
 
{|cellpadding="4"
! style="width: 65%;" | Transliteration Scheme
+
! style="width: 65%; text-align: left;" | Transliteration Scheme
!Vector
+
! style="text-align: left;" | Vector
 
|-
 
|-
 
|Devanagari to IAST
 
|Devanagari to IAST
Line 231: Line 376:
 
|IAST to Devanagari
 
|IAST to Devanagari
 
|<code>iast to  deva</code>
 
|<code>iast to  deva</code>
 +
|-
 +
|ITrans to Devanagari
 +
|<code>itrans to deva</code>
 
|-
 
|-
 
|Devanagari to Malayalam
 
|Devanagari to Malayalam
Line 240: Line 388:
 
|Devanagari to Telugu
 
|Devanagari to Telugu
 
|<code>deva to tlgu</code>
 
|<code>deva to tlgu</code>
 +
|-
 +
|Devanagari to Gujarati
 +
|<code>deva to gujr</code>
 +
|-
 +
|Devanagari to Bengali
 +
|<code>deva to bngl</code>
 
|}
 
|}
  
Line 289: Line 443:
 
\stoptext
 
\stoptext
 
</texcode>
 
</texcode>
 +
 +
Please note that there is also {{cmd|resettransliteration}} which can be used in stream to (temporarily) prevent any transliteration.
  
 
== Exceptions ==
 
== Exceptions ==

Latest revision as of 03:47, 8 February 2022


TODO: this page is under construction (See: To-Do List)


Fonts

Fonts are complicated. Fonts for Indic languages must additionally provide rules for forming many complicated conjuncts. Each conjunct can consist of several forms occurring simultaneously, which have to be positioned correctly relative to the base glyph. OTF documentation notwithstanding, font designers interpret the specifications in their own ways, which leads to a variety of implementations of the font features. Most of the available fonts are tested against HarfBuzz and/or ICU (sometimes only the former). Since ConTeXt uses its own OTF loading system, many Indic fonts do not work right away.

In early 2022, Hans made some nice improvements to the Indic font system in ConTeXt, accompanied by testing with various available fonts for some Indic languages. As a result, support for Indic fonts in ConTeXt has improved. Various typescripts were then bundled into the ConTeXt distribution for easy use in documents.

The following table lists fonts that have been tested for use with ConTeXt. The list is by no means exhaustive: most of the fonts are relatively new and some (ubiquitous) old fonts are absent. If a font that works well in ConTeXt is missing from this list, please add it. To use Indic fonts, place

\usetypescriptfile[indic]

in the document head and use \definetypeface and/or \setupbodyfont with the following typescripts:

{|cellpadding="5"
! Sans !! Serif !! Notes !! Typescript(s)
|-
! colspan="4" | Devanagari
|-
| || Adishila || 4 different designs; many weights and styles; good conjunct coverage; IAST support || <code>adishila</code>, <code>adishila-semibold</code>, <code>adishila-heavy</code>, <code>adishila-dev</code>, <code>adishila-dev-guru</code>, <code>adishila-san</code>, <code>adishila-san-letterpress</code>
|-
| || Shobhika || two weights; good conjunct coverage; IAST support; some maths support || <code>shobhika</code>
|-
| Baloo || || five weights || <code>baloo</code>, <code>baloo-semibold</code>, <code>baloo-extrabold</code>
|-
| || Amita || calligraphic style || <code>amita</code>
|-
| || Jaini, Jaini Purva || fonts with calligraphic style commonly found in Jaina kalpasūtra manuscripts || <code>jaini</code>, <code>jaini-purva</code>
|-
! colspan="4" | Malayalam
|-
| || RIT Rachana || an elegant font; good conjunct coverage; two weights; italic style; one of the most complete fonts available || <code>rit-rachana</code>
|-
| || Panmana || good conjunct coverage; single weight; body-text font || <code>panmana</code>
|-
| Ezhuthu || || handwriting font; single weight || <code>ezhuthu</code>
|-
| || RIT Sundar || single weight || <code>rit-sundar</code>
|-
| TN Joy || || three weights || <code>tn-joy</code>
|-
| Manjari || || curvy handwriting font; suitable for body and titles; three weights || <code>manjari</code>
|-
| Gayathri || || three weights || <code>gayathri</code>
|-
| Anjali Old Lipi || || legible font intended for body text; glyphs for common Malayalam ligatures and the Latin character set || <code>anjali-old-lipi</code>
|-
| Chilanka || || handwriting style font with most of the unique Malayalam conjuncts; glyph strokes of uniform width with round ends || <code>chilanka</code>
|-
| Dyuthi || || an ornamental typeface; Latin and Malayalam; Malayalam glyphs are based on popular 'bulged ended' type designs; single size – thicker than usual Malayalam fonts; suited for titles || <code>dyuthi</code>
|-
| Karumbi || || handwriting traditional script font; casual style || <code>karumbi</code>
|-
| Baloo Chettan 2 || || five weights || <code>baloo-chettan</code>, <code>baloo-chettan-semibold</code>, <code>baloo-chettan-extrabold</code>
|-
! colspan="4" | Telugu
|-
| Baloo Tammudu 2 || || five weights || <code>baloo-tammudu</code>, <code>baloo-tammudu-semibold</code>, <code>baloo-tammudu-extrabold</code>
|-
| || Murty Telugu || good conjunct coverage; single weight; consult license for terms of use || <code>murty-telugu</code>
|-
! colspan="4" | Kannada
|-
| || Murty Kannada || good conjunct coverage; single weight; consult license for terms of use || <code>murty-kannada</code>
|-
! colspan="4" | Tamil
|-
| Baloo Thambi 2 || || five weights || <code>baloo-thambi</code>
|-
! colspan="4" | Bengali
|-
| Baloo Da 2 || || five weights || <code>baloo-da</code>, <code>baloo-da-semibold</code>, <code>baloo-da-extrabold</code>
|-
| || Murty Bangla || good conjunct coverage; single weight; consult license for terms of use || <code>murty-bangla</code>
|-
! colspan="4" | Gujarati
|-
| Baloo Bhai || || five weights || <code>baloo-bhai</code>, <code>baloo-bhai-semibold</code>, <code>baloo-bhai-extrabold</code>
|-
| || Noto Serif Gujarati || font from Google || <code>noto-serif-gujarati</code>
|}
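
As a rough illustration of the setup described above, a complete document using one of the listed typescripts might look like the following. This is only a sketch: it assumes that the Shobhika font files are installed where ConTeXt can find them, and that the typescript name can be passed directly to \setupbodyfont as the text above suggests. The body size of 12pt is an arbitrary choice.

```tex
% Minimal sketch; assumes the Shobhika fonts are installed and known to ConTeXt.
\usetypescriptfile[indic]      % load the Indic typescripts bundled with ConTeXt
\setupbodyfont[shobhika,12pt]  % typescript name taken from the table above

\starttext
महाजनस्य संसर्गः कस्य नोन्नतिकारकः।
\stoptext
```

Any other typescript name from the table can be substituted, provided the corresponding font is installed.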

Supported Scripts and Font Features

One can of course use fonts not listed above. This section provides some details to this end.

Font features

The OTF specification has two shaping implementations for the Indic scripts: the 'older' v1 and the 'newer' v2. See https://docs.microsoft.com/en-us/typography/script-development/devanagari for further details.

The script tags in the OpenType specification (https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags) include second versions for some Indic scripts.

Why are those second versions available? The specification explains:

The OpenType script tags can also correlate with a particular OpenType Layout implementation, with the result that more than one script tag may be registered for a given Unicode script (e.g. 'deva' and 'dev2').

Features ending in -one use the older OpenType implementation, while the ones ending in -two deploy the newer implementation.

These are specified in ConTeXt by the following font features:

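{|cellpadding="5"
! Script !! Feature set (OTF v1) !! Feature set (OTF v2)
|-
| Devanagari || <code>devanagari-one</code> || <code>devanagari-two</code>
|-
| Malayalam || <code>malayalam-one</code> || <code>malayalam-two</code>
|-
| Telugu || <code>telugu-one</code> || <code>telugu-two</code>
|-
| Kannada || <code>kannada-one</code> || <code>kannada-two</code>
|-
| Bengali || <code>bengali-one</code> || <code>bengali-two</code>
|}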

Please note that these font features also activate other font features as mandated in the OTF specification. They can be used to write typescripts for a font, or to use the font directly in documents with \definefontfamily and/or \definedfont.
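
For direct use without a typescript, a sketch along the lines of the basic sample from the earlier revision of this page is shown below. The font name Tunga is only an example and must be replaced by a Kannada font available on your system; whether kannada-one or kannada-two gives the better result depends on the font.

```tex
% Sketch only: "Tunga" is an example font name, not a requirement.
\definefontfamily [kannada] [ss] [Tunga] [features=kannada-one]

\setupbodyfont [kannada,ss]   % select the sans style defined above

\starttext
ಇದು ಹೇಗಿದೆ? ನಾನು ಹೀಗೆ ತುಂಬ ಬರೆಯಬೇಕೆಂದು ಯೋಚಿಸುತ್ತಿದ್ದೇನೆ.
\stoptext
```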

A common problem with Indic fonts is the incorrect rendering of conjuncts involving the rakaar. If such problems are encountered, one can try setting the indic feature (in addition to the relevant -one or -two features above) as follows:

\definefontfeature
    […]
    […]
    [indic={movematra=auto,conjuncts=quit}]

Sanitizer

Some fonts might still have issues with certain conjuncts. To overcome this, a sanitizer option may be used when defining font features. This option is accompanied by a goodies file. An example is shown below:

return {
    name = "myfont",
    version = "1.00",
    comment = "Goodies that complement myfont.",
    sanitizers = {
        dev2rkrf  = { 
            mapping = {
                ["के्र"] = "क्रे",
                ["कै्र"] = "क्रै",
                ["खे्र"] = "ख्रे",
                ["खै्र"] = "ख्रै",
                ["गे्र"] = "ग्रे",
                ["गै्र"] = "ग्रै",
                ["घे्र"] = "घ्रे",
                ["घै्र"] = "घ्रै",
                ["चे्र"] = "च्रे",
                ["चै्र"] = "च्रै",
                ["छे्र"] = "छ्रे",
                ["छै्र"] = "छ्रै",
                ["जे्र"] = "ज्रे",
                ["जै्र"] = "ज्रै",
                ["झे्र"] = "झ्रे",
                ["झै्र"] = "झ्रै",
                ["ञे्र"] = "ञ्रे",
                ["ञै्र"] = "ञ्रै",
                ["णे्र"] = "ण्रे",
                ["णै्र"] = "ण्रै",
                ["ते्र"] = "त्रे",
                ["तै्र"] = "त्रै",
                ["थे्र"] = "थ्रे",
                ["थै्र"] = "थ्रै",
                ["दे्र"] = "द्रे",
                ["दै्र"] = "द्रै",
                ["धे्र"] = "ध्रे",
                ["धै्र"] = "ध्रै",
                ["ने्र"] = "न्रे",
                ["नै्र"] = "न्रै",
                ["पे्र"] = "प्रे",
                ["पै्र"] = "प्रै",
                ["फे्र"] = "फ्रे",
                ["फै्र"] = "फ्रै",
                ["बे्र"] = "ब्रे",
                ["बै्र"] = "ब्रै",
                ["भे्र"] = "भ्रे",
                ["भै्र"] = "भ्रै",
                ["मे्र"] = "म्रे",
                ["मै्र"] = "म्रै",
                ["ये्र"] = "य्रे",
                ["यै्र"] = "य्रै",
                ["वे्र"] = "व्रे",
                ["वै्र"] = "व्रै",
                ["से्र"] = "स्रे",
                ["सै्र"] = "स्रै",
                ["शे्र"] = "श्रे",
                ["शै्र"] = "श्रै",
                ["षे्र"] = "ष्रे",
                ["षै्र"] = "ष्रै",
                ["हे्र"] = "ह्रे",
                ["है्र"] = "ह्रै",
            }
        }
    }
}

Suppose that for a certain font (say myfont), using devanagari-two features, the conjuncts of a consonant with ra and a vowel sign listed above are not rendered properly. The above goodies file is then saved as myfont.lfg and used while defining the features thus:

\definefontfeature
    [myfontfeatures]
    [devanagari-two]
    [goodies=myfont.lfg,
     sanitizer=dev2rkrf,
     indic={movematra=auto,conjuncts=quit}]

Now, myfontfeatures can be used with \definedfont and/or while writing typescripts for the font.
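
A possible way of trying out the feature set with \definedfont is sketched below. The font name myfont is the same placeholder as above, and the name: lookup and the size of 12pt are assumptions made for illustration; use a file: lookup instead if you prefer to refer to the font by its file name.

```tex
% Sketch only: "myfont" is a placeholder for the actual font name.
\definefontfeature
    [myfontfeatures]
    [devanagari-two]
    [goodies=myfont.lfg,
     sanitizer=dev2rkrf,
     indic={movematra=auto,conjuncts=quit}]

\starttext
% a few of the conjuncts covered by the sanitizer mapping
{\definedfont[name:myfont*myfontfeatures at 12pt] क्रे ग्रै प्रे}
\stoptext
```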

Script and language features

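The following number conversion sets and hyphenation patterns are included in ConTeXt:

{|cellpadding="5"
! Script/Language !! Conversion set !! Hyphenation
|-
| Devanagari || <code>devanagarinumerals</code> ||
|-
| Malayalam || <code>malayalamnumerals</code> || <code>\language[ml]</code>
|-
| Kannada || <code>kannadanumerals</code> || <code>\language[kn]</code>
|-
| Telugu || <code>telugunumerals</code> || <code>\language[te]</code>
|-
| Bengali || <code>bengalinumerals</code> || <code>\language[bn]</code>
|-
| Tamil || <code>tamilnumerals</code> || <code>\language[ta]</code>
|-
| Gujarati || <code>gujaratinumerals</code> || <code>\language[gu]</code>
|-
| Gurmukhi || <code>gurmukhinumerals</code> ||
|-
| Hindi || <code>devanagarinumerals</code> || <code>\language[hi]</code>
|-
| Sanskrit || <code>devanagarinumerals</code> || <code>\language[sa]</code>
|}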

A pattern is activated with \language. The Sanskrit hyphenation patterns sa support hyphenation of Sanskrit written in the Malayalam, Telugu, Kannada and Bengali scripts, and in Latin script with IAST. Conversion sets are used as values of the keys numberconversion and conversion (wherever applicable) and with \convertnumber.
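
The following sketch combines a hyphenation pattern with a conversion set. It assumes that the shobhika typescript from the font table is available; the use of \setupuserpagenumber for the page numbers is one possible place to apply numberconversion, chosen here only for illustration.

```tex
% Sketch only; assumes the Shobhika fonts are installed.
\usetypescriptfile[indic]
\setupbodyfont[shobhika]

% page numbers in Devanagari numerals
\setupuserpagenumber[numberconversion=devanagarinumerals]

\starttext
\language[sa]  % activate the Sanskrit hyphenation patterns

दानं प्रियवाक्सहितं ज्ञानमगर्वं क्षमान्वितं शौर्यम्।

% convert a single number explicitly
\convertnumber{devanagarinumerals}{2022}
\stoptext
```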

Sanskrit Transliteration

Transliteration of Sanskrit from IAST to Devanagari and vice-versa as well as from and to other Indic languages is available in ConTeXt. The following transliteration schemes are supported with more planned:

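{|cellpadding="4"
! Transliteration Scheme !! Vector
|-
| Devanagari to IAST || <code>deva to iast</code>
|-
| IAST to Devanagari || <code>iast to deva</code>
|-
| ITrans to Devanagari || <code>itrans to deva</code>
|-
| Devanagari to Malayalam || <code>deva to mlym</code>
|-
| Devanagari to Kannada || <code>deva to knda</code>
|-
| Devanagari to Telugu || <code>deva to tlgu</code>
|-
| Devanagari to Gujarati || <code>deva to gujr</code>
|-
| Devanagari to Bengali || <code>deva to bngl</code>
|}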

The main macro to set up a transliteration is \definetransliteration.

A simple example

A transliteration instance is defined using \definetransliteration:

\definetransliteration
    [TrDevaToIAST]
    [vector={deva to iast}]

This defines macros for both inline and block transliteration:

\starttext

% display mode
\starttransliteration[TrDevaToIAST]
महाजनस्य संसर्गः कस्य नोन्नतिकारकः।
पद्मपत्रस्थितं तोयं धत्ते मुक्ताफलश्रियम्॥
\stoptransliteration

% or inline
The company of great people {\transliteration[TrDevaToIAST] महाजनस्य संसर्गः} is there one for whom it is not beneficial {\transliteration[TrDevaToIAST] कस्य नोन्नतिकारकः}? (No!) Situated on the leaf of the lotus, (the mere) droplet of water {\transliteration[TrDevaToIAST] पद्मपत्रस्थितं तोयं} shines forth like a pearl {\transliteration[TrDevaToIAST] धत्ते मुक्ताफलश्रियम्}.

\stoptext

Or, more conveniently:

\starttext

% display mode
\startTrDevaToIAST
दानं प्रियवाक्सहितं 
ज्ञानमगर्वं क्षमान्वितं शौर्यम्। 
रूपं शीलसुयुक्तं
दुर्लभमेतच्चतुर्भद्रम्॥
\stopTrDevaToIAST

% or inline
Charity accompanied by sweet words {\TrDevaToIAST दानं प्रियवाक्सहितं}, knowledge devoid of arrogance, valour accompanied by forgiveness (pity) {\TrDevaToIAST ज्ञानमगर्वं क्षमान्वितं शौर्यम्}, beauty accompanied by virtue (grace) {\TrDevaToIAST रूपं शीलसुयुक्तं} – these four are scarce {\TrDevaToIAST दुर्लभमेतच्चतुर्भद्रम्}.

\stoptext

Please note that there is also \resettransliteration which can be used in stream to (temporarily) prevent any transliteration.
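
The same pattern should carry over to the other vectors listed in the table of transliteration schemes. As a sketch, the reverse direction (IAST to Devanagari) could be set up as follows. The instance name TrIASTToDeva is made up for this example, the shobhika typescript is assumed to be available so that the Devanagari output can be rendered, and \usetransliteration[indic] is loaded as in the last example on this page.

```tex
% Sketch only; TrIASTToDeva is a hypothetical instance name.
\usetypescriptfile[indic]
\setupbodyfont[shobhika]      % a font with Devanagari glyphs for the output

\usetransliteration[indic]
\definetransliteration
    [TrIASTToDeva]
    [vector={iast to deva}]

\starttext
\startTrIASTToDeva
mahājanasya saṃsargaḥ kasya nonnatikārakaḥ
\stopTrIASTToDeva
\stoptext
```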

Exceptions

Sometimes, one might want to retain words or phrases in the original script and avoid transliteration. To this end, \transliterationexception may be used to define such exceptions

  • at the level of a transliteration scheme, i.e., per vector, as in:
\transliterationexception[deva to iast]{शरीरं}{देहं}
  • at the level of a transliteration instance, as in:
\transliterationexception[TrDevaToIAST]{शरीरं}{देहं}

When both are defined, the latter overrides the former. Moreover, any derived transliteration instances also inherit the exceptions defined for the parent. So, if any exceptions are to be avoided/changed, they must be redefined for the derived instances.

Source with transliterated version

Very often, one wants to typeset paragraphs in the original script accompanied by a transliterated version. This can be achieved using the before key of \definetransliteration. A simple example is as follows:

\usetransliteration[indic]
\setuplines[indenting={yes,small,even}]
\definebuffer
    [padya]
\definetransliteration
    [padyaPair]
    [color=blue,
     vector={deva to iast},
     before={\startlines\getbuffer[padya]\par},
     after=\stoplines]

\starttext
\startbuffer[padya]
कोऽतिभारः समर्थानां किं दूरं व्यवसायिनाम्।
को विदेशस्तु विदुषां कः परः प्रियवादिनाम्॥
\stopbuffer

\startpadyaPair
\getbuffer[padya]
\stoppadyaPair

What is a burden for the able, and what is faraway (beyond reach) to the
persevering? What is a foreign land to the learned and who are strangers to
the sweet-spoken? (None!)

\stoptext

In summary: one defines a buffer which contains the paragraph (a verse in this example) in the original script. Then, using the before and after keys of \definetransliteration, a pair of verses may be easily typeset into lines.