2,632 bytes added ,  11:51, 19 June 2019
* svgstyle: maybe compression?
* cssfile: file name of ''additional'' CSS file
 
==Suppressing Presentation Forms in Arabic Script Fonts==
 
As is well known, Arabic script requires contextual analysis: a character takes a different form depending on its position within a word or other continuous string of characters. Certain pre-Unicode conventions encoded each of these forms separately for use in older typesetting applications. Those encodings are preserved in Unicode as the Arabic Presentation Forms-B block (U+FE70–U+FEFF). But this is a legacy encoding: these codepoints should <em>never</em> be used to prepare fresh Unicode documents. Rather, Arabic script characters should be encoded primarily using codepoints from the standard Arabic block (U+0600–U+06FF). Contextual forms of these characters are selected during OpenType processing, but they do not have separate codepoints.
 
Unfortunately, certain Arabic fonts such as Linotype Lotus, a staple of the Middle East publishing industry, give the contextual forms of standard Arabic-script characters Unicode names that correspond to codepoints from Arabic Presentation Forms-B. This saves space in the font, and for some purposes it is innocuous. In ConTeXt processing, however, many of the original, standard codepoints in the input are replaced by presentation-form codepoints in the output. For example:
 
<texcode>مِّنَ السَّمَاءِ وَالْأَرْضِ</texcode>
 
becomes
 
<texcode>ﻣﱢﻦَ اﻟﺴﱠﻤَﺎءِ وَاﻷَْرْضِ</texcode>
 
If one looks carefully, one will see that there are errors in the output. (If you view both strings with the font almfixed in a text editor that supports Unicode and Arabic script, you will easily see the differences.)
 
When exporting to XML, this issue becomes a serious problem, because the exported text will not use the same codepoints as the input. To get around this issue, use something along the lines of the following in your preamble:
 
<texcode>
% private typescript that combines TeX-Gyre Termes for Latin with Linotype Lotus for Arabic
\usetypescriptfile[type-times-lotus]
\usetypescript[times-lotus]
 
% choose your desired features for pdf output
\definefontfeature
[lotus-default]
[mode=node,language=dflt,script=arab,
init=yes,medi=yes,fina=yes,isol=yes,
liga=no,rlig=yes,trep=yes,tlig=yes,
mark=yes,ccmp=yes]
 
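% use this instead in export mode; as written, this later definition
% overrides the one above, so comment it out for normal pdf output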
 
\definefontfeature
[lotus-default]
[mode=none]
 
% setup the bodyfont last; the order is important!
\setupbodyfont[times-lotus,12pt]
</texcode>
 
Setting <code>mode=none</code> suppresses the contextual analysis, so the presentation-form codepoints never replace the input and the export contains the original codepoints.
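
Rather than commenting one definition in and out by hand, the two feature sets can be selected with a ConTeXt mode. The following is only a sketch under assumptions: the mode name <code>export</code> is a hypothetical, user-chosen name (not a built-in), and it is assumed to be enabled from the command line, e.g. <code>context --mode=export myfile.tex</code>. The typescript and feature names are the ones used above.

<texcode>
\usetypescriptfile[type-times-lotus]
\usetypescript[times-lotus]

% normal pdf run: full contextual shaping
% ("export" is a hypothetical, user-chosen mode name)
\startnotmode[export]
  \definefontfeature
    [lotus-default]
    [mode=node,language=dflt,script=arab,
     init=yes,medi=yes,fina=yes,isol=yes,
     liga=no,rlig=yes,trep=yes,tlig=yes,
     mark=yes,ccmp=yes]
\stopnotmode

% export run: no shaping, so the input codepoints pass through unchanged
\startmode[export]
  \definefontfeature
    [lotus-default]
    [mode=none]
\stopmode

% set up the bodyfont last; the order is important
\setupbodyfont[times-lotus,12pt]
</texcode>

With this arrangement only one definition of <code>lotus-default</code> is ever active in a given run, so the ordering caveat about the bodyfont still applies but no lines need to be edited between the PDF run and the export run.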
= More TODO =