This document is for XML authors who want to use open source software to produce high quality PDF documents--right now. The most official way to convert XML to PDF has been to use the FO language, but the only open source project to convert FO is FOP, and it doesn't come close to implementing all the standards. It cannot center tables, for example, and it has no way to control orphan text. The FOP developers have not made any changes in the last 1 1/2 years, making me believe it is a dead-end project.
One and one half years ago I gave up using LaTeX to format my XML documents. I had found--or so I thought--a much superior solution in the Formatting Object language, or FO. FO would allow me to create high quality PDF documents in XML and unicode instead of the cumbersome and unfamiliar syntax of TeX. It would allow me to convert from an XML tree to an XML tree, exactly what an XML author wants. The FO language was established in the same manner as HTML and therefore represented the power and acceptance of open source software. The open source tool called FOP, which did the actual conversion from the abstract language to PDF, could already do much of what I wanted for my small needs, such as create simple tables and format paragraphs.

One and one half years passed, however, and nothing changed with the development of FOP. While I could produce basic documents, I still couldn't meet other basic formatting needs, such as controlling widow paragraphs or centering a table. Since the developers of FOP have made no changes to their software in all this time, I came to the conclusion that I would be stuck with these limitations if I continued to use FO and FOP to convert my documents.

Thus I turned back to TeX, knowing I would face almost none of these limitations. I could produce beautiful documents right now, without having to wait for an open source FO converter that actually implemented all the standards. ConTeXt seemed the most advanced form of TeX, allowing me to format in the most direct manner without having to rely on many different macros (or outside libraries), so I chose it.

If you are an XML author who wants to convert your documents to PDF via XSLT, you will find this document useful. I try to first describe how to do something in FO before explaining how I would do it in ConTeXt, but even if you do not know any FO you should find the hints about formatting useful.

==What You Should Know==

This document assumes that you already have ConTeXt installed and know how to use it. If neither is true, take a little bit of time and visit the ConTeXt website to get familiar with how to run ConTeXt on your system. At the minimum, you should know the commands to issue to convert the ConTeXt examples here to PDF. You don't need to know more than that to get started, though of course the more you learn, the clearer this document will be. It also assumes a passing knowledge of XML and XSLT.

=Converting From XML=
Being text-based, ConTeXt does not always lend itself well to a direct conversion from an XML tree. If an innocent blank line from an XML document finds its way into a ConTeXt document, we end up with an extra paragraph division. One way around this problem is to use TeXML, a Python utility that converts its own special form of XML into ConTeXt. That means you can use XSLT to convert from one XML tree to another and then let the Python utility do the dirty work of handling whitespace.
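
The blank-line problem can be made concrete with a small Python sketch. It is not part of TeXML: the helper names are hypothetical, and the paragraph rule is simplified to "one or more blank lines end a paragraph", ignoring comments and other TeX details.

```python
import re


def count_tex_paragraphs(text: str) -> int:
    """Count paragraphs under the simplified rule that blank lines end a paragraph."""
    # A newline, optional whitespace (possibly more blank lines), then a newline.
    chunks = re.split(r"\n\s*\n", text.strip())
    return len([chunk for chunk in chunks if chunk.strip()])


def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace, including blank lines, to one space."""
    return " ".join(text.split())


# Text content of a single XML element that happens to contain a blank line.
element_text = "First sentence.\n\nSecond sentence."

print(count_tex_paragraphs(element_text))                        # copied verbatim
print(count_tex_paragraphs(normalize_whitespace(element_text)))  # after cleanup
```

Copied verbatim, the single element is split into two paragraphs; after normalization it stays one. This is the kind of housekeeping the text delegates to TeXML.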
TeXML uses a very simple XML language. Basically, it represents ConTeXt constructs directly as XML elements, so one can read a TeXML document and immediately know what the author meant to express in ConTeXt. In converting an XML document such as TEI to TeXML, one is coming as close as possible to actually converting to ConTeXt itself, without having to worry about whitespace, and while having the comfort of still working with an XML tree. If you use TeXML to convert, you really won't have to learn a new XML language, since TeXML consists of very few elements. Instead, you will still think in terms of ConTeXt.
==Simple Document in ConTeXt and in TeXML==
Here is the simplest ConTeXt document:
<texcode>
 
\starttext
hello world
\stoptext
</texcode>
On my system, I issue the command:
 
<texcode>
texexec [document_name]
</texcode>
to produce a formatted document. Along with many other files, this command produces a document with the extension ".dvi", which I can view with the xdvi software. Follow the instructions to produce other types of output.

In TeXML, this simple document looks like:
<pre>
 
<?xml version="1.0"?>
<TeXML>
<env name="text">
<TeXML>Hello World</TeXML>
</env>
</TeXML>
 
</pre>
I need to first convert this document to ConTeXt and then issue the same exact commands I used above. It is a two step process. In order to convert the XML to ConTeXt, I issue the command:
<pre>texml.py -e utf8 -c [infile.xml] [outfile.tex]</pre>
The "-e" option along with its argument of "utf8" tells TeXML to produce a document that is encoded in utf8. The "-c" option tells TeXML to produce ConTeXt output rather than LaTeX. Make sure you include both options.
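
The two steps can be chained from a script. The following is a minimal Python sketch, assuming that <code>texml.py</code> and <code>texexec</code> are both on the search path; the function names are hypothetical, and only the options quoted above are used.

```python
import subprocess
from typing import List


def build_commands(texml_in: str, context_out: str) -> List[List[str]]:
    """Return the two commands described in the text, in the order they must run."""
    # Step 1: TeXML -> ConTeXt, utf8 encoded (-e utf8), ConTeXt rather than LaTeX (-c).
    convert = ["texml.py", "-e", "utf8", "-c", texml_in, context_out]
    # Step 2: typeset the generated ConTeXt file.
    typeset = ["texexec", context_out]
    return [convert, typeset]


def run_pipeline(texml_in: str, context_out: str) -> None:
    """Run both steps; check=True stops the pipeline at the first failure."""
    for command in build_commands(texml_in, context_out):
        subprocess.run(command, check=True)
```

Calling <code>run_pipeline("simple.xml", "simple.tex")</code> would run the conversion and then the typesetting step. Keeping the command construction separate from the execution makes it easy to inspect what would be run.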
Although ConTeXt and XML documents use different structures, they do share the main text environment, and our simple document consists of just this one environment. Like all environments in ConTeXt, this one starts with a backslash followed by the word "start", and then followed by the name of the environment without a space. We end this environment in the same way, replacing "start" with "stop."
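In TeXML, we enclose environments with the <texcode>env</texcode> element. The mandatory "name" attribute defines the environment's name. Commands are written with the <code>cmd</code> element, whose "name" attribute gives the command name. Inside it, <code>opt</code> holds the options, which are placed in square brackets, and <code>parm</code> holds the parameters, which are placed in curly brackets.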
For example, if we wanted to create a simple document with just one box, inside of which were the words "that's it," we write:
<texcode>
 
\starttext
\framed[width=2cm,height=1cm]{that's it}
\stoptext
 
</texcode>
<pre>
 
<?xml version="1.0"?>
<TeXML>
<env name="text">
<TeXML>
<cmd name="framed">
<opt>width=2cm, height=1cm</opt>
<parm>that's it</parm>
</cmd>
</TeXML>
</env>
</TeXML>
 
</pre>
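
To make the correspondence between the two notations explicit, here is a rough Python sketch that renders a small subset of TeXML (only <code>TeXML</code>, <code>env</code>, <code>cmd</code>, <code>opt</code> and <code>parm</code>) as ConTeXt source. It is an illustration under simplifying assumptions, not the real <code>texml.py</code>: it does no escaping of special characters, ignores attributes such as nl1 and nl2, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def render_children(node: ET.Element) -> str:
    """Render the text and child elements of a node, one item per line."""
    parts = []
    if node.text and node.text.strip():
        parts.append(node.text.strip())
    for child in node:
        parts.append(render(child))
        if child.tail and child.tail.strip():
            parts.append(child.tail.strip())
    return "\n".join(parts)


def render(node: ET.Element) -> str:
    """Render one element of the TeXML subset as ConTeXt source."""
    if node.tag == "TeXML":
        return render_children(node)
    if node.tag == "env":
        name = node.attrib["name"]  # the mandatory "name" attribute
        return f"\\start{name}\n{render_children(node)}\n\\stop{name}"
    if node.tag == "cmd":
        out = "\\" + node.attrib["name"]
        for child in node:
            inner = render_children(child)
            if child.tag == "opt":     # options go in square brackets
                out += f"[{inner}]"
            elif child.tag == "parm":  # parameters go in curly brackets
                out += f"{{{inner}}}"
        return out
    raise ValueError(f"element not covered by this sketch: {node.tag}")


doc = """<TeXML>
<env name="text">
<cmd name="framed">
<opt>width=2cm, height=1cm</opt>
<parm>that's it</parm>
</cmd>
</env>
</TeXML>"""

print(render(ET.fromstring(doc)))
```

For this input the sketch prints the <code>\starttext</code> line, the <code>\framed</code> command with its bracketed options and braced parameter, and the <code>\stoptext</code> line.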
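Since TeXML produces utf8 output, ConTeXt has to be told about the input encoding. Place this line at the top of your document:
<texcode>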
 
\enableregime[utf]
 
</texcode>
Apparently, this allows ConTeXt to handle both utf8 and utf16.
In addition, we want to disable as many of ConTeXt's automatic modes as possible, since we will generate things like titles and sections ourselves. ConTeXt automatically places a number on each page, and restarts the numbering with each part. To turn this feature off, place this line somewhere at the top of your document:
<texcode>
\setuppagenumbering[state=stop, way=bytext]
</texcode>
<code>Simple_page.tex</code>
<texcode>
\enableregime[utf]
\setuppagenumbering[state=stop]
\starttext
Wie schön!
\stoptext
</texcode>
and <code>Simple_page.texml</code>
<pre>
<?xml version="1.0"?>
<TeXML>
<!--
Attributes nl1 and nl2 can be used to force a new line
before (nl1) or after (nl2) TeX command.
-->
 <cmd name="setuppagenumbering">
  <opt>state=stop, way=bytext</opt>
 </cmd>
 <cmd name="enableregime" nl1="1">
  <opt>utf</opt>
 </cmd>
 <env name="text">
Wie schön!
 </env>
</TeXML>
</pre>

=To Do=
* Include more documentation about TeXML.

[[Category:XML]]
