Chinese Japanese and Korean

Revision as of 21:40, 22 June 2007 by (talk) (typo)

Chinese in ConTeXt (ConTeXt 2005.12.19 and newer)

If you have ConTeXt 2005.12.19 or newer, you only need to get the fonts.

  1. You need some Chinese (TrueType) fonts; you may want to get FangSong, HeiTi, KaiTi and SongTi. Put them, for example, into $TEXMF/fonts/truetype/chinese/.
  2. Use Hans Hagen's experimental ttf2uni.rb script to create the .map, .tfm and .enc files. You can then put these files, for example, into $TEXMF/fonts/tfm/chinese/ (*.tfm files), $TEXMF/fonts/enc/chinese/ (*.enc files; they are basically the same for all fonts) and $TEXMF/fonts/map/chinese/ (*.map files).
  3. You may now need to update the file-name database that TeX uses to find the files; with teTeX this is done by running texhash.
  4. Now you can run your Hello World program.
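A minimal Hello World along these lines might look as follows. This is only a sketch: it assumes that the fonts and the generated files from steps 1 to 3 are installed, and that the source file is saved in the input encoding your setup expects. The sample text is arbitrary.

```tex
% Minimal sketch; assumes the Chinese fonts and the generated
% .map, .tfm and .enc files are installed as described above.
% Note: \usemodule[chinese] also changes the default language
% and some numbering/section settings.
\usemodule[chinese]

\starttext
你好，世界！
\stoptext
```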

If you only want to access a few Chinese characters, you should use \input font-chi.tex instead of \usemodule[chinese], as the latter also changes the default language and some of the numbering and section settings (see s-chi-00.tex).

If you want to typeset vertical text, use \startvertical ... \stopvertical. If you want to use Chinese numbers, you can use e.g. \startitemize[c]. The possible options are: c or cn for normal Chinese numbers (一, 二, 三, 四, 五, 六 etc.); cc for the capitalized (or financial) Chinese numbers (壹, 贰, 叁 etc.); ec for an extended version which uses 廿 and 卅 (instead of 二十 and 三十); and ac for using the Chinese digits zero (零, 〇) to nine (九) in the same way as the Arabic digits 0 to 9.
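As a rough illustration, the two features could be combined in one test document like this. It is a sketch that assumes the Chinese module and fonts are already working; the item texts are placeholders.

```tex
% Sketch only; assumes the Chinese module and fonts are set up.
\usemodule[chinese]

\starttext

% Items numbered with normal Chinese numbers (一, 二, 三, ...).
\startitemize[c]
\item 第一项
\item 第二项
\item 第三项
\stopitemize

% A short passage typeset vertically.
\startvertical
你好，世界！
\stopvertical

\stoptext
```

Replacing c with cc, ec or ac in the \startitemize option should select the other numbering styles described above.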

//added by Xiao Jianfeng

As far as I know, it is wrong to use "零" together with "一, 二, ..., 十". The correspondence between lower-case Chinese numbers, upper-case Chinese numbers and Arabic numbers is as follows.

Chinese lower: 〇, 一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 百, 千

Chinese upper: 零, 壹, 贰, 叁, 肆, 伍, 陆, 柒, 捌, 玖, 拾, 佰, 仟

Arabic: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 1000

"零" is an upper-case Chinese number, so it should not be mixed with lower-case Chinese numbers, although in China it is sometimes wrongly used that way.

Chinese numbers have lower-case and upper-case forms for the sake of accounting safety. Lower-case numbers are simple to write and far more common in daily life, while upper-case numbers are used almost exclusively in accounting.

Every upper-case Chinese number looks very different from the others, so one cannot easily be altered into another. Lower-case Chinese numbers and Arabic numbers, by contrast, are sometimes easy to alter. For example, "一", "二" and "三" are similar, so one can easily change "一" into "二" or "三". Likewise, one can change "1" into "7" or "11", or "6" into "8".

In China, amounts in accounting must be written in both upper-case Chinese numbers and Arabic numbers.

Chinese in ConTeXt (before 2005.12.19)

Xiao Jianfeng wrote in a mail to the mailing list on 2005-06-06:

Here is my way of setting up Chinese in ConTeXt. I hope it is of some help to newbies like me who have problems processing Chinese.

  1. Get the truetype fonts htfs.ttf, hthei.ttf, htkai.ttf and htsong.ttf from
  2. Get corresponding tfm files,, and from
  3. Get the enc file from
  4. Get the map file from
  5. Put the ttf font files you got in step 1 into texmf-fonts/fonts/truetype/chinese
  6. Unzip the files you got in step 2; this gives four corresponding directories (which contain the tfm files). Put them into texmf-fonts/fonts/tfm/chinese
  7. Unzip, you will get a directory named Gbk which contains many enc files. Put the directory to texmf-fonts/fonts/enc/chinese
  8. Unzip, you will get many map files, you need just the You need to edit, delete entries of gbli* at the end of the file (lines 505-629). Then, put the modified to texmf-fonts/fonts/map/chinese. Note that newer versions of pdfetex do not read pdftex.cfg, so it is better to use \loadmapfile[gbk] in your document.
  9. Your document should be compilable now. See sample below.
  10. I haven't tried to compile Traditional Chinese documents. Getting the corresponding files for Traditional Chinese and putting them in the right locations may work, but I'm not sure.

Sample Code (save in cp936 encoding):
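A minimal sketch of such a document, assuming the files from the steps above are installed, could look like this. The sample text and the placement of \loadmapfile[gbk] directly after the module are assumptions, not a tested recipe.

```tex
% Sketch only; save this file in cp936 (GBK) encoding.
% Assumes the ttf, tfm, enc and map files are installed under
% texmf-fonts/fonts/ as described in the steps above.
\usemodule[chinese]
\loadmapfile[gbk]   % load the map file explicitly, since newer pdfetex does not read pdftex.cfg

\starttext
你好，世界！
\stoptext
```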


If you want to use UTF-8, the script by Lutz Haseloff might be of interest to you; the required Perl module Encode::HanConvert is available at CPAN. Note, however, that you may only use characters representable in GBK; German umlauts, for instance, are converted into ??.