=Goal=

This document is for XML authors who want to use open source
software to produce high-quality PDF documents right now. The
most official way to convert XML to PDF has been to use the FO
language, but the only open source project that converts FO is FOP,
and it doesn't come close to implementing all the standards. It
cannot center tables, for example, and it has no way to control
orphan text. The FOP developers have not made any changes in the
last year and a half, which makes me believe it is a dead-end project.

ConTeXt, a variation of TeX, has almost none of the limitations
one runs into when using FOP. If we know how to use ConTeXt in
place of FO, we can produce the documents we want right now.
Don't worry if you have never seen FOP. Just ignore those parts
and you should still get a good idea of how to use ConTeXt.

Since an XML author will use XSLT for conversion, we can dispense
with many of the macros written in ConTeXt that produce such
things as titles and tables of contents. We'll let XSLT do a lot
of the work and use the most stripped-down version of ConTeXt we
can.

This document assumes that the user already has ConTeXt installed
and knows how to use it. It also assumes that the user has a passing
knowledge of XML and XSLT.

=Converting From XML=

Being text-based, ConTeXt does not lend itself well to a
conversion from an XML tree. If an innocent blank line from an
XML document finds its way into a ConTeXt document, we end up
with an extra paragraph division. One way around this problem is
to use ConTeXt's native XML mapping, which you can read about on
the ConTeXt home page. I find this mapping scheme too
complicated, which is why I advocate using
[http://getfo.sourceforge.net/texml/index.html TeXML]. TeXML is a
Python utility that converts its own special form of XML into
ConTeXt. That means you can use XSLT to convert from one XML tree
to another and then let the Python utility do the dirty work of
handling white space.

TeXML uses a very simple XML language. Basically, it represents
ConTeXt commands in XML and does little more. One could look at a
TeXML document and immediately know what the author meant to
express in ConTeXt. In converting an XML document such as TEI to
TeXML, one is coming as close as possible to actually converting
to ConTeXt itself, without having to worry about white space, and
while having the comfort of still working with an XML tree. If
you use TeXML to convert, you really won't have to learn a new
XML language, since TeXML consists of very few elements. Instead,
you will still think in terms of ConTeXt.
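As a rough illustration of this workflow, here is a minimal XSLT 1.0 sketch that turns a small, made-up input format into TeXML. The doc and para element names are assumptions invented for this example (they are not TEI), and ending each paragraph with the plain TeX \par command is just one possible choice. Only the TeXML, env and cmd elements belong to TeXML itself.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the input vocabulary (doc, para) is hypothetical. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- The root element becomes a TeXML document that holds
       the ConTeXt text environment. -->
  <xsl:template match="/doc">
    <TeXML>
      <env name="text">
        <xsl:apply-templates select="para"/>
      </env>
    </TeXML>
  </xsl:template>

  <!-- Each paragraph's content is passed through as is and
       followed by a \par command to end the paragraph. -->
  <xsl:template match="para">
    <xsl:apply-templates/>
    <cmd name="par"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that the stylesheet makes no attempt to clean up line breaks or blank lines inside a paragraph. The idea is that white space is left for TeXML to deal with when it writes the ConTeXt file.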

==Simple Document in ConTeXt and in TeXML==

Here is the simplest ConTeXt document:

<texcode>

\starttext
hello world
\stoptext

</texcode>

In TeXML, this looks like:

<pre>

<?xml version="1.0"?>
<TeXML>
<env name="text">
<TeXML>hello world</TeXML>
</env>
</TeXML>

</pre>

Follow the TeXML instructions to convert the TeXML to ConTeXt. At
the time of writing, I had to set an environment variable to
tell TeXML it was converting to ConTeXt (as opposed to LaTeX),
and then type:

<texcode>

texml.py -e utf <indoc> <outdoc>

</texcode>

Our simple document consists of one environment, the text
environment. Like all environments in ConTeXt, this one starts
with a backslash followed by the word "start", and then followed
by the name of the environment without a space. We end this
environment in the same way, replacing "start" with "stop."

In TeXML, we enclose environments with the <texcode>env</texcode>
element. The mandatory "name" attribute defines the environment's
name.
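To show that the same pattern applies to other environments, here is a small sketch that nests a second environment inside the text environment. It assumes that TeXML treats every env element the same way, so that the name "itemize" yields \startitemize and \stopitemize. The itemize environment and the \item command are standard ConTeXt, but this exact markup is my own illustration and is not taken from the TeXML documentation.

```xml
<?xml version="1.0"?>
<!-- Sketch: an itemize environment nested inside the text environment. -->
<TeXML>
<env name="text">
<env name="itemize">
<!-- Each cmd element below is meant to produce an \item command. -->
<cmd name="item"/>first point
<cmd name="item"/>second point
</env>
</env>
</TeXML>
```

Since TeXML mirrors ConTeXt so closely, nesting env elements in XML corresponds directly to nesting \start and \stop pairs in ConTeXt.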

==Commands==

Aside from environments, we also have commands in ConTeXt.
Commands are how we control text formatting. A command starts
with a backslash and can be followed by setups, which are placed
in square brackets, and by the "scope or range of the command,"
which is placed in curly braces.

For example, if we wanted to create a simple document with just
one box containing the text "that's it," we would write:

<texcode>

\starttext
\framed[width=2cm,height=1cm]{that's it}
\stoptext

</texcode>

In TeXML, this looks like:

<pre>

<?xml version="1.0"?>
<TeXML>
<env name="text">
<TeXML>
<cmd name="framed">
<opt>width=2cm,height=1cm</opt>
<parm>that's it</parm>
</cmd>
</TeXML>
</env>
</TeXML>

</pre>
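In practice the cmd element would usually be written by an XSLT template rather than by hand. The following sketch shows one way that might look. The box element in the source document is a hypothetical name chosen for this example, and the fixed width and height are simply copied from the framed example above.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "box" is a made-up element in the source XML. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Turn each box element into a \framed command. The setups go
       into opt (square brackets) and the content into parm (braces). -->
  <xsl:template match="box">
    <cmd name="framed">
      <opt>width=2cm,height=1cm</opt>
      <parm><xsl:apply-templates/></parm>
    </cmd>
  </xsl:template>

</xsl:stylesheet>
```

Because the content of the box is handled with xsl:apply-templates, any inline markup inside it can be converted by further templates of the same kind.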

=Other Preliminaries=

To make sure that our Unicode XML documents get converted
properly, we want to put the following line at the top of all
our documents:

<texcode>

\enableregime[utf]

</texcode>

Apparently, this allows ConTeXt to handle both UTF-8 and UTF-16.

In addition, we want to disable as many of ConTeXt's automatic
modes as possible, since we will generate things like titles and
sections ourselves. ConTeXt automatically places a number on each
page. To turn this feature off, place this line somewhere at the
top of your document:

<texcode>


\setuppagenumbering[state=stop]

</texcode>

We might alter this command in some ways later.
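Expressed in TeXML, these two setup lines could be written as ordinary cmd elements placed before the text environment. This is a sketch that follows the cmd and opt pattern used for \framed earlier. I am assuming that TeXML accepts commands outside the text environment in this way.

```xml
<?xml version="1.0"?>
<!-- Sketch: the two preliminary commands, followed by the document body. -->
<TeXML>
<!-- \enableregime[utf] -->
<cmd name="enableregime">
<opt>utf</opt>
</cmd>
<!-- \setuppagenumbering[state=stop] -->
<cmd name="setuppagenumbering">
<opt>state=stop</opt>
</cmd>
<env name="text">
<TeXML>hello world</TeXML>
</env>
</TeXML>
```

In an XSLT stylesheet, these cmd elements would sit in the template that matches the root of the source document, so that every generated file starts with the same setup.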

=Example Documents=

Here are two very simple documents, one in plain old ConTeXt, and
one in TeXML.