 
{{todo|overview page for the use of middle-eastern scripts}}
== Arabic ==

This is an example environment for typesetting Arabic documents in Mark IV (ConTeXt with LuaTeX). It won't work at all in Mark II (with either pdfTeX or XeTeX).

Save it in the directory where you run context as "ara-sty.tex", and load it with "\environment ara-sty" in your document.
<texcode>
\startenvironment ara-sty

\mainlanguage[arabic]

% Font setup: OpenType features needed for Arabic
\definefontfeature
  [arabic]
  [mode=node,language=dflt,script=arab,
   init=yes,medi=yes,fina=yes,isol=yes,
   liga=yes,dlig=yes,rlig=yes,clig=yes,
   mark=yes,mkmk=yes,kern=yes,curs=yes]

\starttypescript [serif] [arabic]
  \definefontsynonym [Arabic-Light]       [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Bold]        [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Italic]      [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Bold-Italic] [name:arabtype] [features=arabic]
\stoptypescript

\starttypescript [serif] [arabic] [name]
  \usetypescript[serif][fallback]
  \definefontsynonym [Serif]           [Arabic-Light]       [features=arabic]
  \definefontsynonym [SerifItalic]     [Arabic-Italic]      [features=arabic]
  \definefontsynonym [SerifBold]       [Arabic-Bold]        [features=arabic]
  \definefontsynonym [SerifBoldItalic] [Arabic-Bold-Italic] [features=arabic]
\stoptypescript

\starttypescript [Arabic]
  \definetypeface [Arabic] [rm] [serif] [arabic] [default]
\stoptypescript

% Direction switches
\def\ArabicGlobalDir {\pagedir TRT\bodydir TRT\pardir TRT\textdir TRT}
\def\ArabicParDir    {\textdir TRT\pardir TRT}
\def\ArabicTextDir   {\textdir TRT}
\def\LatinParDir     {\textdir TLT\pardir TLT}
\def\LatinTextDir    {\textdir TLT}
\def\LatinGlobalDir  {\pagedir TLT\bodydir TLT\pardir TLT\textdir TLT}

\define\setarabic {\ArabicGlobalDir%
  \usetypescript[Arabic]%
  \setupbodyfont[Arabic,20pt]}

\definestartstop
  [arabicpar]
  [commands=\Arabic\ArabicParDir]

\define[1]\RT {{\Arabic\ArabicTextDir#1}}

\define\setlatin {\LatinGlobalDir%
  \usetypescript[lm]%
  \setupbodyfont[lm,20pt]}

\definestartstop
  [latinpar]
  [commands=\Arabic\LatinParDir]

\define[1]\LT {{\LatinTextDir#1}}

\setcharactermirroring[1]

\stopenvironment
</texcode>

=== Description ===

The parts of the environment are described below:

<texcode>
\mainlanguage[arabic]
</texcode>

Sets the main language to Arabic, so that translatable titles are translated to Arabic.

<texcode>
\definefontfeature
  [arabic]
  [mode=node,language=dflt,script=arab,
   init=yes,medi=yes,fina=yes,isol=yes,
   liga=yes,dlig=yes,rlig=yes,clig=yes,
   mark=yes,mkmk=yes,kern=yes,curs=yes]
</texcode>

Here we define the OpenType font features needed to render Arabic properly.

<texcode>
\starttypescript [serif] [arabic]
  \definefontsynonym [Arabic-Light]       [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Bold]        [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Italic]      [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Bold-Italic] [name:arabtype] [features=arabic]
\stoptypescript

\starttypescript [serif] [arabic] [name]
  \usetypescript[serif][fallback]
  \definefontsynonym [Serif]           [Arabic-Light]       [features=arabic]
  \definefontsynonym [SerifItalic]     [Arabic-Italic]      [features=arabic]
  \definefontsynonym [SerifBold]       [Arabic-Bold]        [features=arabic]
  \definefontsynonym [SerifBoldItalic] [Arabic-Bold-Italic] [features=arabic]
\stoptypescript

\starttypescript [Arabic]
  \definetypeface [Arabic] [rm] [serif] [arabic] [default]
\stoptypescript
</texcode>

Then we define the "Arabic" typescript; here we use a font named "arabtype". Since this font has only a regular weight, bold and italic are set to use the same font.

<texcode>
\def\ArabicGlobalDir {\pagedir TRT\bodydir TRT\pardir TRT\textdir TRT}
\def\ArabicParDir    {\textdir TRT\pardir TRT}
\def\ArabicTextDir   {\textdir TRT}
\def\LatinParDir     {\textdir TLT\pardir TLT}
\def\LatinTextDir    {\textdir TLT}
\def\LatinGlobalDir  {\pagedir TLT\bodydir TLT\pardir TLT\textdir TLT}
</texcode>

Here we define some directional commands for use in the following parts.

<texcode>
\define\setarabic {\ArabicGlobalDir%
  \usetypescript[Arabic]%
  \setupbodyfont[Arabic,20pt]}

\definestartstop
  [arabicpar]
  [commands=\Arabic\ArabicParDir]

\define[1]\RT {{\Arabic\ArabicTextDir#1}}
</texcode>

Here we define the "arabicpar" environment for Arabic paragraphs in Latin context, "\RT" for short Arabic sentences, and the "\setarabic" command to set the main document direction and font to Arabic.

<texcode>
\define\setlatin {\LatinGlobalDir%
  \usetypescript[lm]%
  \setupbodyfont[lm,20pt]}

\definestartstop
  [latinpar]
  [commands=\Arabic\LatinParDir]

\define[1]\LT {{\LatinTextDir#1}}
</texcode>

These are the Latin counterparts: "latinpar", "\LT" and "\setlatin".
 
<texcode>
\setcharactermirroring[1]
</texcode>
 
This enables mirroring of BiDi mirrored characters, like () and []. It also enables "implicit bidi", so that you don't need to explicitly specify the direction of individual Arabic sentences inside Latin context, or vice versa.
 
=== Using it ===
 
Now, let's try a "Hello World" example:
 
<texcode>
% engine=luatex
 
\environment ara-sty
 
\starttext
\setarabic
 
أهلا بالعالم!
\stoptext
</texcode>
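
A slightly larger document can mix both directions. The following is only a sketch, assuming the "ara-sty" environment above is saved next to the document and the fonts it names ("arabtype" and the "lm" typeface) are available; the sample sentences are made up for illustration.

<texcode>
% engine=luatex

\environment ara-sty

\starttext
\setlatin

% A short Arabic phrase inside a Latin paragraph, wrapped in \RT
This paragraph is Latin, with a short Arabic phrase (\RT{عربي}) inside it.

% A whole Arabic paragraph inside a Latin document, using the
% \startarabicpar ... \stoparabicpar pair created by \definestartstop
\startarabicpar
أهلا بالعالم!
\stoparabicpar

\stoptext
</texcode>

With \setcharactermirroring[1] active in the environment, the explicit \RT wrapper may be unnecessary for short phrases; it is kept here to make the direction switch visible.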
 
== Hebrew ==
 
(Example by Rik Kabel on the mailing list, 2017-12-15)
 
Depending on the font, correct niqqud placement requires some combination (sometimes all) of the following font features: lang, ccmp, and script.
 
<texcode>
\definefontfeature [hebrew] [oldstyle] [
lang=heb,
ccmp=yes,
script=hebr,
]
\starttext
\definedfont[name:EzraSIL*hebrew at 72pt]
\setupalign[r2l]
טְרוֹפוֹתִי
\stoptext
</texcode>
The same settings work well for Narkisim. For David_CLM you only need the script setting.

Correction by Hans, 2017-12-16:

There is a font feature hebrew already predefined, so use a different name for your own set:

<texcode>\definefontfeature [hebrewoldstyle] [oldstyle] [...]</texcode>

Whether you have to set the language depends on the font (often dflt is ok).

This is the predefined set:
 
<texcode>
\definefontfeature
[semitic-complete]
[mode=node,analyze=yes,language=dflt,ccmp=yes,
autoscript=position,autolanguage=position,
init=yes,medi=yes,fina=yes,isol=yes,
mark=yes,mkmk=yes,kern=yes,curs=yes,
liga=yes,dlig=yes,rlig=yes,clig=yes,calt=yes]
 
\definefontfeature
[semitic-simple]
[mode=node,analyze=yes,language=dflt,ccmp=yes,
autoscript=position,autolanguage=position,
init=yes,medi=yes,fina=yes,isol=yes,
mark=yes,mkmk=yes,kern=yes,curs=yes,
rlig=yes,calt=yes]
 
\definefontfeature
[arabic]
[semitic-complete]
[script=arab]
 
\definefontfeature
[syriac]
[arabic]
[fin2=yes,fin3=yes,med2=yes]
 
\definefontfeature
[hebrew]
[semitic-complete]
[script=hebr]
 
\definefontfeature
[simplearabic]
[semitic-simple]
[script=arab]
 
\definefontfeature
[simplehebrew]
[semitic-simple]
[script=hebr]
</texcode>
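
Because "hebrew" is already in this predefined set, the earlier Hebrew example should not need a \definefontfeature of its own. The following is a minimal sketch, assuming the EzraSIL font from that example is installed. The name "hebrewonum" is hypothetical; it only illustrates inheriting from a predefined set with the same three-argument form used in the list above, here adding the OpenType oldstyle-figures feature (onum).

<texcode>
% Hypothetical derived set: everything from the predefined "hebrew" set,
% plus oldstyle figures where the font provides them
\definefontfeature
  [hebrewonum]
  [hebrew]
  [onum=yes]

\starttext
% Use the predefined set directly; replace "hebrew" with "hebrewonum"
% to use the derived set instead
\definedfont[name:EzraSIL*hebrew at 72pt]
\setupalign[r2l]
טְרוֹפוֹתִי
\stoptext
</texcode>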
 
Update by Joey McCollum, 2020-04-30:
 
It is a known issue that Unicode normalization (a process that sorts combining marks by their Unicode combining classes to ensure that Unicode string searches and comparisons are not hindered by different orderings of these marks) reorders niqqud and other points in a way that is not typographically or linguistically intuitive. Consider the following example:
 
<texcode>
%Setup minimal font features:
\definefontfeature[minimal][default][
ccmp=yes,
script=hebr
]
%Set up the main font:
\definefontfamily[hebrew] [rm] [SBL Hebrew] [features=minimal]
\setupbodyfont[hebrew]
%Set up right-to-left alignment:
\setupalign[r2l]
\starttext
%Normalized Unicode mark order:
בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
\stoptext
</texcode>
 
(To compare this to a non-normalized version of the same text, copy and paste the text of Genesis 1:1 from https://tanach.us/; it cannot be included in the example above because, alas, the Wiki would apply Unicode normalization to it!)
 
When a minimal set of OpenType features needed to render Hebrew points correctly is employed, this normalized sample text will fail to typeset several points. But if you copy and paste the un-normalized text from the link above into the same example, you will find that it gets typeset completely and correctly. This is because most Hebrew fonts anticipate a particular ordering of certain classes of characters in their substitution tables, and Unicode normalization reverses this ordering in some cases. This happens most often when the shin dot and sin dot (which Unicode assigns to the combining classes 24 and 25) and/or dagesh (which Unicode places in combining class 21) co-occur with vowels (which are variously assigned combining classes between 10 and 20); the fonts expect the shin/sin dot to occur first, then the dagesh, then the vowel, but Unicode normalization sorts these characters in the opposite order. This is why the first letter of בְּרֵאשִׁ֖ית in the normalized text does not have its vowel typeset and the shin in הַשָּׁמַ֖יִם in the normalized text lacks both the dagesh and the vowel that it should have. The cantillation marks, meanwhile, are all placed correctly, because Unicode normalization and the font's substitution tables both order these marks after all of the other classes.
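
The two orderings can be compared on a single letter by entering the marks as code points, which also keeps the Wiki from normalizing them. This is only a sketch, assuming SBL Hebrew is installed and that characters produced by \utfchar reach the font in the order written. The letter is the shin of הַשָּׁמַ֖יִם: shin (U+05E9) with qamats (U+05B8, class 18), dagesh (U+05BC, class 21) and shin dot (U+05C1, class 24).

<texcode>
% Same minimal feature set as in the example above
\definefontfeature[minimal][default][
ccmp=yes,
script=hebr
]
\definefontfamily[hebrew] [rm] [SBL Hebrew] [features=minimal]
\setupbodyfont[hebrew]
\setupalign[r2l]
\starttext
% Normalized order (ascending combining class): vowel, dagesh, shin dot
\utfchar{"05E9}\utfchar{"05B8}\utfchar{"05BC}\utfchar{"05C1}

% Order most fonts expect: shin dot, dagesh, vowel
\utfchar{"05E9}\utfchar{"05C1}\utfchar{"05BC}\utfchar{"05B8}
\stoptext
</texcode>

If the description above holds for the font in use, the first paragraph should show the missing marks and the second should be typeset completely.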
 
A number of typesetting engines, including Microsoft's Uniscribe and Xe(La)TeX, address this discrepancy by performing an on-the-fly re-sorting of the marks into a more intuitive order to ensure that they are typeset correctly. This way, the input text does not have to be reordered manually, and the Hebrew text is rendered as expected. At the time of this edit, the latest version of ConTeXt will also do this whenever the standard Hebrew featureset (i.e., features=hebrew) is enabled.
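
In terms of the example above, this amounts to replacing the minimal feature set with the predefined one. A sketch, assuming a ConTeXt version recent enough to include this re-sorting and an installed SBL Hebrew font:

<texcode>
%Set up the main font with the predefined Hebrew featureset:
\definefontfamily[hebrew] [rm] [SBL Hebrew] [features=hebrew]
\setupbodyfont[hebrew]
%Set up right-to-left alignment:
\setupalign[r2l]
\starttext
%Normalized Unicode mark order, as in the example above:
בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃
\stoptext
</texcode>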
 
[[Category:Languages]]
[[Category:Fonts]]
