Input and compilation/Export


TODO: This page is work in progress. (See: To-Do List)


ConTeXt not only produces beautiful PDFs but can also export to XML/HTML. This is especially useful for creating eBooks in EPUB format.


Minimal example

\setupbackend[export=yes] % this is all you need to activate export

\starttext
\samplefile{tufte}
\stoptext

Exported structure

If you save the example above as minimal.tex and compile it, you get a directory structure like this:

.
├── minimal-export
│   ├── images
│   ├── minimal-div.html
│   ├── minimal-pub.lua
│   ├── minimal-raw.xml
│   ├── minimal-tag.xhtml
│   └── styles
│       ├── minimal-defaults.css
│       ├── minimal-fonts.css
│       ├── minimal-images.css
│       ├── minimal-styles.css
│       └── minimal-templates.css
├── minimal.log
├── minimal.pdf
├── minimal.tex
└── minimal.tuc

From here on we refer to these files without the prefix ("minimal-"). The listings below have been reformatted slightly to make them shorter and more readable.

div.html

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!--
    input filename   : minimal
    processing date  : 2024-10-22 19:43:06+02:00
    context version  : 2024.09.25 11:53
    exporter version : 0.36
-->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta charset="utf-8" />
  <title>minimal</title>
  <link type="text/css" rel="stylesheet" href="styles/minimal-defaults.css" />
  <link type="text/css" rel="stylesheet" href="styles/minimal-images.css" />
  <link type="text/css" rel="stylesheet" href="styles/minimal-fonts.css" />
  <link type="text/css" rel="stylesheet" href="styles/minimal-styles.css" />
</head>
<body>
  <div class="document" xmlns="http://www.pragma-ade.com/context/export">
    <div class="warning">Rendering can be suboptimal because there is no default/fallback css loaded.</div>
    <div>
      We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose,
      categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster,
      aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.</div>
  </div>
</body>
</html>

tag.xhtml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!--
    input filename   : minimal
    processing date  : 2024-10-22 19:43:05+02:00
    context version  : 2024.09.25 11:53
    exporter version : 0.36
-->
<?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-fonts.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?>
<document context="2024.09.25 11:53" date="2024-10-22 19:43:05+02:00" file="minimal" href="minimal" language="en" title="minimal" version="0.36" xml:lang="en">
  We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize,
  catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline,
  summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.
</document>

raw.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!--
    input filename   : minimal
    processing date  : 2024-10-22 19:43:05+02:00
    context version  : 2024.09.25 11:53
    exporter version : 0.36
-->
<?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-fonts.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?>
<document context="2024.09.25 11:53" date="2024-10-22 19:43:05+02:00" file="minimal" language="en" title="minimal" version="0.36" xml:lang="en">
We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.
</document>

pub.lua

return {
 ["htmlfiles"]={ "minimal-div.html" },
 ["htmlroot"]="minimal-div.html",
 ["identifier"]="bf72902d-4e46-9e20-b9a0-3b1a14b3f352",
 ["imagefile"]="styles/minimal-images.css",
 ["imagepath"]="images",
 ["images"]={},
 ["language"]="en",
 ["metadata"]={},
 ["name"]="minimal",
 ["stylepath"]="styles",
 ["styles"]={ "minimal-defaults.css", "minimal-images.css", "minimal-fonts.css", "minimal-styles.css" },
 ["title"]="minimal",
 ["xhtmlfiles"]={ "minimal-tag.xhtml" },
 ["xmlfiles"]={ "minimal-raw.xml" },
}

Required structuring of your ConTeXt code

The export is only useful for content that is "well structured" in an XML sense. In the minimal example above, all text ended up directly in the root element, document.

That means you need to mark up everything, from inline spans through paragraphs and list items up to chapters and parts, with \start... … \stop... pairs.

Also note that font switches like \em do not translate into output structure. Instead, define a highlight with \definehighlight[emph][style={\em}] and write \emph{emphasized}.
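A minimal sketch of the difference, using only the commands mentioned above (the sample text is made up). The first word should be exported inside a highlight element with detail="emph", while the second is only a font switch and should leave no trace in the exported structure:

```tex
\setupbackend[export=yes]

% defines \emph as a tagged highlight; the name "emph" is our own choice
\definehighlight[emph][style={\em}]

\starttext
\startparagraph
Tagged: \emph{emphasized}. Untagged: {\em emphasized}.
\stopparagraph
\stoptext
```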

Beware: The exported XML reflects the structure of the output, so it lacks some information that is present in your source. For example, \index leaves only an empty registerlocation anchor in the text, while the index terms appear in registerentry structures at the place where you put the index.
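To observe this yourself, a small test file along the following lines may help. It is only a sketch: the chapter titles and index terms are made up, and the exact shape of the exported elements can differ between ConTeXt versions.

```tex
\setupbackend[export=yes]

\starttext

\startchapter[title={Indexed text}]
\startparagraph
% each \index call should leave an empty registerlocation anchor here
ConTeXt\index{ConTeXt} can export\index{export} to XML.
\stopparagraph
\stopchapter

\startchapter[title={Index}]
% the terms themselves should show up here, as registerentry structures
\placeindex
\stopchapter

\stoptext
```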

More useful example

\mainlanguage[en]
\setupbackend[export=yes]

\setupinteraction[state=start,
	color=,contrastcolor=,
	% This metadata is used for the PDF
	title={My first eBook 1},
	subtitle={},
	keywords={},
	author={Hans 1}
]
\setupexport[
	hyphen=yes,
	% This metadata is used by ConTeXt’s ePub script
	% title, subtitle and author are taken from \setupinteraction, if not set
	title={My first eBook 2},
	subtitle={},
	author={Hans 2}
]
\settaggedmetadata[
	% here you can set as many metadata entries as you like, but you need to process them yourself
	title={My first eBook 3},
	author={Hans 3},
	subtitle={},
	version={\date} % TODO: doesn’t expand
]

\definehighlight[emph][style=italicface] % use \emph{something} instead of {\em something}

\starttext

\startchapter[title=Example]

\startparagraph
\input tufte
\stopparagraph

\startsection[title={A section}]

\startparagraph
\input tufte

\startitemize[packed,joinup]
  \startitem First \stopitem
  \startitem Second \stopitem
  \startitem Third \stopitem
  \startitem Fourth\stopitem
\stopitemize

\stopparagraph

\startparagraph
\input knuth
\stopparagraph

\startparagraph
\input zapf
\stopparagraph

\stopsection
\stopchapter

\startchapter[title=Quoth\footnote{by Edgar Allan Poe}]
\startlines
\quotation{Prophet!} said I, \quotation{thing of evil!—prophet still, if bird or devil!
By that Heaven that bends above us—by that God we both adore—
Tell this soul with sorrow laden if, within the distant Aidenn,
It shall clasp a sainted maiden whom the angels name Lenore—
Clasp a rare and radiant maiden whom the angels name Lenore.}
\emph{Quoth the Raven \quotation{Nevermore.}}
\stoplines
\stopchapter

\stoptext

There’s also an example of an export-friendly ConTeXt file in the sources: export-example.tex.

Choice of output files

Only after such tagging do we find significant differences between the three content output files:

TODO: explain differences between export variants

raw.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!--

    input filename   : useful
    processing date  : 2024-10-22 19:49:25+02:00
    context version  : 2024.09.25 11:53
    exporter version : 0.36

-->
<?xml-stylesheet type="text/css" href="styles/useful-defaults.css" ?>
<?xml-stylesheet type="text/css" href="styles/useful-images.css" ?>
<?xml-stylesheet type="text/css" href="styles/useful-fonts.css" ?>
<?xml-stylesheet type="text/css" href="styles/useful-styles.css" ?>
<document author="Hans 1" context="2024.09.25 11:53" date="2024-10-22 19:49:25+02:00" file="useful" language="en" title="My first eBook 1" version="0.36" xml:lang="en">
 <section detail="chapter" chain="chapter" implicit="1" level="2">
  <metadata>
   <metavariable name="author">Hans 3</metavariable>
   <metavariable name="title">My first eBook 3</metavariable>
   <metavariable name="version">\date </metavariable>
  </metadata>
  <sectioncaption>
   <sectionnumber>1</sectionnumber>
   <sectiontitle>Example</sectiontitle>
  </sectioncaption>
  <sectioncontent>
   <paragraph>We thrive in information--thick worlds because of our marvelous and everyday ca­pacity to select, edit, single out, structure, highlight, group, pair, merge, harmo­nize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, dis­tinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. </paragraph>
   <section detail="section" chain="section" implicit="2" level="3">
    <sectioncaption>
     <sectionnumber>1.1</sectionnumber>
     <sectiontitle>A section</sectiontitle>
    </sectioncaption>
    <sectioncontent>
     <paragraph>We thrive in information--thick worlds because of our marvelous and everyday ca­pacity to select, edit, single out, structure, highlight, group, pair, merge, harmo­nize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, dis­tinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. <itemgroup detail="itemize" level="1" packed="yes" symbol="1"><item><itemtag></itemtag><itemcontent>First</itemcontent></item><item><itemtag></itemtag><itemcontent>Second</itemcontent></item><item><itemtag></itemtag><itemcontent>Third</itemcontent></item><item><itemtag></itemtag><itemcontent>Fourth</itemcontent></item></itemgroup></paragraph>
     <paragraph>Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large--scale user; the designer should also write the first user manual.       <break/>
The separation of any of these four components would have hurt TEX significantly. If I had not participated fully in all these activities, literally hundreds of improve­ments would never have been made, because I would never have thought of them or perceived why they were important.       <break/>
But a system cannot be successful if it is too strongly influenced by a single person. Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments. </paragraph>
     <paragraph>Coming back to the use of typefaces in electronic publishing: many of the new ty­pographers receive their knowledge and information about the rules of typography from books, from computer magazines or the instruction manuals which they get with the purchase of a PC or software. There is not so much basic instruction, as of now, as there was in the old days, showing the differences between good and bad typographic design. Many people are just fascinated by their PC's tricks, and think that a widely--praised program, called up on the screen, will make every­thing automatic from now on. </paragraph>
    </sectioncontent>
   </section>
  </sectioncontent>
 </section>
 <section detail="chapter" chain="chapter" implicit="4" level="2">
  <sectioncaption>
   <sectionnumber>2</sectionnumber>
   <sectiontitle>Quoth<descriptionsymbol detail="footnote" insert="1"><sup>1</sup></descriptionsymbol><description detail="footnote" chain="footnote" insert="1"><descriptiontag><sup>1</sup> </descriptiontag><descriptioncontent>by Edgar Allan Poe</descriptioncontent></description></sectiontitle>
  </sectioncaption>
  <sectioncontent>
   <lines detail="lines" chain="lines">
    <line><delimited detail="quotation"><delimitedsymbol symbol="left"></delimitedsymbol><delimitedcontent>Prophet!</delimitedcontent><delimitedsymbol symbol="right"></delimitedsymbol></delimited> said I, <delimited detail="quotation"><delimitedsymbol symbol="left"></delimitedsymbol><delimitedcontent>thing of evil!—prophet still, if bird or devil!</delimitedcontent><line>By that Heaven that bends above us—by that God we both adore— </line><line>Tell this soul with sorrow laden if, within the distant Aidenn, </line><line>It shall clasp a sainted maiden whom the angels name Lenore— </line><line>Clasp a rare and radiant maiden whom the angels name Lenore.     <break/>
 </line><delimitedsymbol symbol="right"></delimitedsymbol></delimited> </line>
    <line><highlight detail="emph">Quoth the Raven <delimited detail="quotation"><delimitedsymbol symbol="left"></delimitedsymbol><delimitedcontent>Nevermore.</delimitedcontent><delimitedsymbol symbol="right"></delimitedsymbol></delimited></highlight> </line>
   </lines>
  </sectioncontent>
 </section>
</document>
<break/>

tag.xhtml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!--
    input filename   : useful
    processing date  : 2024-10-22 19:49:25+02:00
    context version  : 2024.09.25 11:53
    exporter version : 0.36
-->
<?xml-stylesheet type="text/css" href="styles/useful-defaults.css" ?>
<?xml-stylesheet type="text/css" href="styles/useful-images.css" ?>
<?xml-stylesheet type="text/css" href="styles/useful-fonts.css" ?>
<?xml-stylesheet type="text/css" href="styles/useful-styles.css" ?>
<document author="Hans 1" context="2024.09.25 11:53" date="2024-10-22 19:49:25+02:00" file="useful" href="useful" language="en" title="My first eBook 1" version="0.36" xml:lang="en">
 <section chain="chapter" detail="chapter" id="aut:1" implicit="1" level="2">
  <metadata>
   <metavariable name="author">Hans 3</metavariable>
   <metavariable name="title">My first eBook 3</metavariable>
   <metavariable name="version">\date </metavariable>
  </metadata>
  <sectioncaption>
   <sectionnumber>1</sectionnumber>
   <sectiontitle>Example</sectiontitle>
  </sectioncaption>
  <sectioncontent>
   <paragraph>We thrive in information--thick worlds because of our marvelous and everyday ca­pacity to select, edit, single out, structure, highlight, group, pair, merge, harmo­nize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, dis­tinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. </paragraph>
   <section chain="section" detail="section" id="aut:2" implicit="2" level="3">
    <sectioncaption>
     <sectionnumber>1.1</sectionnumber>
     <sectiontitle>A section</sectiontitle>
    </sectioncaption>
    <sectioncontent>
     <paragraph>We thrive in information--thick worlds because of our marvelous and everyday ca­pacity to select, edit, single out, structure, highlight, group, pair, merge, harmo­nize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, dis­tinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. <itemgroup detail="itemize" level="1" packed="yes" symbol="1"><item><itemtag></itemtag><itemcontent>First</itemcontent></item><item><itemtag></itemtag><itemcontent>Second</itemcontent></item><item><itemtag></itemtag><itemcontent>Third</itemcontent></item><item><itemtag></itemtag><itemcontent>Fourth</itemcontent></item></itemgroup></paragraph>
     <paragraph>Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large--scale user; the designer should also write the first user manual.       <break/>
The separation of any of these four components would have hurt TEX significantly. If I had not participated fully in all these activities, literally hundreds of improve­ments would never have been made, because I would never have thought of them or perceived why they were important.       <break/>
But a system cannot be successful if it is too strongly influenced by a single person. Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments. </paragraph>
     <paragraph>Coming back to the use of typefaces in electronic publishing: many of the new ty­pographers receive their knowledge and information about the rules of typography from books, from computer magazines or the instruction manuals which they get with the purchase of a PC or software. There is not so much basic instruction, as of now, as there was in the old days, showing the differences between good and bad typographic design. Many people are just fascinated by their PCapos;s tricks, and think that a widely--praised program, called up on the screen, will make every­thing automatic from now on. </paragraph>
    </sectioncontent>
   </section>
  </sectioncontent>
 </section>
 <section chain="chapter" detail="chapter" id="aut:4" implicit="4" level="2">
  <sectioncaption>
   <sectionnumber>2</sectionnumber>
   <sectiontitle>Quoth<descriptionsymbol detail="footnote" insert="1"><sup>1</sup></descriptionsymbol><description chain="footnote" detail="footnote" insert="1"><descriptiontag><sup>1</sup> </descriptiontag><descriptioncontent>by Edgar Allan Poe</descriptioncontent></description></sectiontitle>
  </sectioncaption>
  <sectioncontent>
   <lines chain="lines" detail="lines">
    <line><delimited detail="quotation"><delimitedsymbol symbol="left"></delimitedsymbol><delimitedcontent>Prophet!</delimitedcontent><delimitedsymbol symbol="right"></delimitedsymbol></delimited> said I, <delimited detail="quotation"><delimitedsymbol symbol="left"></delimitedsymbol><delimitedcontent>thing of evil!—prophet still, if bird or devil!</delimitedcontent><line>By that Heaven that bends above us—by that God we both adore— </line><line>Tell this soul with sorrow laden if, within the distant Aidenn, </line><line>It shall clasp a sainted maiden whom the angels name Lenore— </line><line>Clasp a rare and radiant maiden whom the angels name Lenore.     <break/>
 </line><delimitedsymbol symbol="right"></delimitedsymbol></delimited> </line>
    <line><highlight detail="emph">Quoth the Raven <delimited detail="quotation"><delimitedsymbol symbol="left"></delimitedsymbol><delimitedcontent>Nevermore.</delimitedcontent><delimitedsymbol symbol="right"></delimitedsymbol></delimited></highlight> </line>
   </lines>
  </sectioncontent>
 </section>
</document>

div.html

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!--
    input filename   : useful
    processing date  : 2024-10-22 19:49:26+02:00
    context version  : 2024.09.25 11:53
    exporter version : 0.36
-->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta charset="utf-8" />
  <title>My first eBook 2</title>
  <link type="text/css" rel="stylesheet" href="styles/useful-defaults.css" />
  <link type="text/css" rel="stylesheet" href="styles/useful-images.css" />
  <link type="text/css" rel="stylesheet" href="styles/useful-fonts.css" />
  <link type="text/css" rel="stylesheet" href="styles/useful-styles.css" />
</head>
<body>
  <div class="document" xmlns="http://www.pragma-ade.com/context/export">
    <div class="warning">Rendering can be suboptimal because there is no default/fallback css loaded.</div>
    <div>
      <div class="section chapter level-2" id="aut-1">
        <div class="metadata">
          <div class="metavariable metaname-author" label="author">Hans 3</div>
          <div class="metavariable metaname-title" label="title">My first eBook 3</div>
          <div class="metavariable metaname-version" label="version">\date </div>
        </div>
        <div class="sectioncaption">
          <div class="sectionnumber">1</div>
          <div class="sectiontitle">Example</div>
        </div>
        <div class="sectioncontent">
          <div class="paragraph">We thrive in information--thick worlds because of our marvelous and everyday ca­pacity to select, edit, single out, structure, highlight, group, pair, merge, harmo­nize, synthesize, focus, organize, condense, reduce,
            boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, dis­tinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average,
            approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the
            goats. </div>
          <div class="section level-3" id="aut-2">
            <div class="sectioncaption">
              <div class="sectionnumber">1.1</div>
              <div class="sectiontitle">A section</div>
            </div>
            <div class="sectioncontent">
              <div class="paragraph">We thrive in information--thick worlds because of our marvelous and everyday ca­pacity to select, edit, single out, structure, highlight, group, pair, merge, harmo­nize, synthesize, focus, organize, condense,
                reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, dis­tinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth,
                chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate
                the sheep from the goats. <div class="itemgroup itemize level-1 packed-yes symbol-1">
                  <div class="item">
                    <div class="itemtag"></div>
                    <div class="itemcontent">First</div>
                  </div>
                  <div class="item">
                    <div class="itemtag"></div>
                    <div class="itemcontent">Second</div>
                  </div>
                  <div class="item">
                    <div class="itemtag"></div>
                    <div class="itemcontent">Third</div>
                  </div>
                  <div class="item">
                    <div class="itemtag"></div>
                    <div class="itemcontent">Fourth</div>
                  </div>
                </div>
              </div>
              <div class="paragraph">Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large--scale user; the designer should also write the first user manual. <div class="break"><!--empty-->

                </div>
                The separation of any of these four components would have hurt TEX significantly. If I had not participated fully in all these activities, literally hundreds of improve­ments would never have been made, because I would never have
                thought of them or perceived why they were important. <div class="break"><!--empty-->
</div>
                But a system cannot be successful if it is too strongly influenced by a single person. Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own
                experiments. </div>
              <div class="paragraph">Coming back to the use of typefaces in electronic publishing: many of the new ty­pographers receive their knowledge and information about the rules of typography from books, from computer magazines or the
                instruction manuals which they get with the purchase of a PC or software. There is not so much basic instruction, as of now, as there was in the old days, showing the differences between good and bad typographic design. Many people
                are just fascinated by their PCapos;s tricks, and think that a widely--praised program, called up on the screen, will make every­thing automatic from now on. </div>
            </div>
          </div>
        </div>
      </div>
      <div class="section chapter level-2" id="aut-4">
        <div class="sectioncaption">
          <div class="sectionnumber">2</div>
          <div class="sectiontitle">Quoth<div class="descriptionsymbol footnote insert-1">
              <div class="sup">1</div>
            </div>
            <div class="description footnote insert-1">
              <div class="descriptiontag">
                <div class="sup">1</div>
              </div>
              <div class="descriptioncontent">by Edgar Allan Poe</div>
            </div>
          </div>
        </div>
        <div class="sectioncontent">
          <div class="lines">
            <div class="line">
              <div class="delimited quotation">
                <div class="delimitedsymbol symbol-left"></div>
                <div class="delimitedcontent">Prophet!</div>
                <div class="delimitedsymbol symbol-right"></div>
              </div> said I, <div class="delimited quotation">
                <div class="delimitedsymbol symbol-left"></div>
                <div class="delimitedcontent">thing of evil!—prophet still, if bird or devil!</div>
                <div class="line">By that Heaven that bends above us—by that God we both adore— </div>
                <div class="line">Tell this soul with sorrow laden if, within the distant Aidenn, </div>
                <div class="line">It shall clasp a sainted maiden whom the angels name Lenore— </div>
                <div class="line">Clasp a rare and radiant maiden whom the angels name Lenore. <div class="break"><!--empty-->
</div>
                </div>
                <div class="delimitedsymbol symbol-right"></div>
              </div>
            </div>
            <div class="line">
              <div class="highlight emph">Quoth the Raven <div class="delimited quotation">
                  <div class="delimitedsymbol symbol-left"></div>
                  <div class="delimitedcontent">Nevermore.</div>
                  <div class="delimitedsymbol symbol-right"></div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</body>
</html>

pub.lua

return {
 ["author"]="{Hans 2} ",
 ["htmlfiles"]={ "useful-div.html" },
 ["htmlroot"]="useful-div.html",
 ["identifier"]="2958b30c-47d6-9c0f-0f2d-357f8f094e98",
 ["imagefile"]="styles/useful-images.css",
 ["imagepath"]="images",
 ["images"]={},
 ["language"]="en",
 ["metadata"]={},
 ["name"]="useful",
 ["stylepath"]="styles",
 ["styles"]={ "useful-defaults.css", "useful-images.css", "useful-fonts.css", "useful-styles.css" },
 ["title"]="My first eBook 2",
 ["xhtmlfiles"]={ "useful-tag.xhtml" },
 ["xmlfiles"]={ "useful-raw.xml" },
}

(WORK IN PROGRESS)

Export options

From back-exp.mkiv, see \setupexport:

\setupexport[
   align=\raggedstatus,
   bodyfont=\bodyfontsize,
   width=\textwidth,
   title={}, % from interaction
   subtitle={}, % from interaction
   author={}, % from interaction
   firstpage=, % imagename
   lastpage=,  % imagename
   alternative=, % html or div
   properties=no,
   hyphen=no,
   svgstyle=,
   cssfile=,
]
  • align, bodyfont and width end up in the exported CSS.
  • title, subtitle and author default to the values from \setupinteraction.
  • firstpage, lastpage: cover images; they end up only in pub.lua and are handled by the ePub script.
  • hyphen: yes/no; whether to include invisible hyphenation characters (soft hyphen, &shy;) at every possible break point.
  • alternative: html or div (influence on HTML export style? where?)
  • properties: no: ignore; yes: as attribute; otherwise: use as prefix (used where?)
  • svgstyle: maybe compression?
  • cssfile: file name of an additional CSS file.
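As a sketch, a preamble that sets several of these options might look as follows. All values are illustrative, and extra.css is a hypothetical file name; the options whose effect is still unclear in the list above are left out.

```tex
\setupbackend[export=yes]

\setupexport[
    title={My first eBook},  % otherwise taken from \setupinteraction
    author={Hans},           % otherwise taken from \setupinteraction
    hyphen=no,               % do not insert soft hyphens into the export
    cssfile=extra.css,       % hypothetical name of an additional CSS file
]

\starttext
\startparagraph
\input tufte
\stopparagraph
\stoptext
```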

Solutions

Missing Hyphens

Problem: If you set up export, hyphen signs go missing.

Reason: Your font is missing a soft hyphen glyph at U+00AD.

Workaround:

   \enabledirectives[otf.checksofthyphen]

As of 2024, you usually do not need hyphenation points in HTML files, since browsers can hyphenate on their own.


Hyphens replaced by soft hyphens

Problem: Hyphens in hyphenated words such as “co-worker” may be invisible in the exported file.

Reason: In the middle of a word, the hyphen-minus character (U+002D) is converted to a soft hyphen (U+00AD), which web browsers generally do not render unless they break the line there.

Solution: Use \setupbackend[export=yes,hyphen=no]

Alternatively, you can use the Unicode hyphen character (U+2010).

Example: Replace

\setupbackend[export=yes]
\starttext
co-worker
\stoptext

with

\setupbackend[export=yes]
\starttext
co‐worker
\stoptext


Suppressing Presentation Forms in Arabic Script Fonts

As is well known, Arabic script requires contextual analysis: a character takes a different form depending on its position within a word or other continuous string of characters. Certain pre-Unicode conventions encoded each of these forms separately for use in legacy typesetting applications. Those encodings are preserved in the Arabic Presentation Forms-B block of Unicode (U+FE70 to U+FEFF). But this is a legacy encoding: these codepoints should never be used to prepare fresh Unicode documents. Rather, Arabic script characters should be encoded using codepoints from the standard Arabic block (U+0600 to U+06FF). Contextual forms of these characters are selected during OpenType processing, but they do not have separate codepoints.

Unfortunately, certain Arabic fonts such as Linotype Lotus (a staple of the Middle East publishing industry) give the contextual forms of standard Arabic-script characters Unicode names that correspond to codepoints from Arabic Presentation Forms-B. This saves space in the font, and for some purposes it is innocuous. But during ConTeXt processing, many of the original, standard codepoints in the input are replaced by presentation-form codepoints in the output. For example:

مِّنَ السَّمَاءِ وَالْأَرْضِ

becomes

ﻣﱢﻦَ اﻟﺴﱠﻤَﺎءِ وَاﻷَْرْضِ

If you look carefully, you will see that there are errors in the output. (Viewing both strings with the font almfixed in a text editor that supports Unicode and Arabic script makes the differences easy to see.)

When exporting to XML, this issue becomes a serious problem, because the exported text will not use the same codepoints as the input. To get around this issue, use something along the lines of the following in your preamble:

% private typescript that combines TeX Gyre Termes for Latin with Linotype Lotus for Arabic
\usetypescriptfile[type-times-lotus]
\usetypescript[times-lotus]

% choose your desired features for pdf output
\definefontfeature
   [lotus-default]
   [mode=node,language=dflt,script=arab,
    init=yes,medi=yes,fina=yes,isol=yes,
    liga=no,rlig=yes,trep=yes,tlig=yes, 
    mark=yes,ccmp=yes]

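% use this in export mode; it comes later, so it takes precedence over the
% definition above (disable it again for normal PDF output)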

\definefontfeature
   [lotus-default]
   [mode=none]

% setup the bodyfont last; the order is important!
   
\setupbodyfont[times-lotus,12pt]

So mode=none will suppress the contextual analysis, bypass the presentation-forms codepoints, and give the original input in the exported XML.
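
Since the second \definefontfeature takes effect whenever it is present, it may be convenient to select the feature set with a mode instead of editing the preamble between runs. The following is only a sketch under assumptions: the mode name export is an arbitrary, user-chosen name (enabled with context --mode=export <tex_file>), and the typescript names are the private ones from the example above.

```tex
% Sketch: choose the Lotus feature set by mode, so one source serves both runs.
% "export" is a user-chosen mode name, not something predefined by ConTeXt.
\usetypescriptfile[type-times-lotus] % private typescript, as above
\usetypescript[times-lotus]

\startnotmode[export]
  % normal PDF run: full contextual shaping
  \definefontfeature
    [lotus-default]
    [mode=node,language=dflt,script=arab,
     init=yes,medi=yes,fina=yes,isol=yes,
     liga=no,rlig=yes,trep=yes,tlig=yes,
     mark=yes,ccmp=yes]
\stopnotmode

\startmode[export]
  % export run: no contextual analysis, original codepoints are kept
  \setupbackend[export=yes]
  \definefontfeature
    [lotus-default]
    [mode=none]
\stopmode

% set up the bodyfont last; the order is important
\setupbodyfont[times-lotus,12pt]
```

With this arrangement, the PDF produced during the export run is not shaped correctly and should be discarded; only the exported files from that run are of interest.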

Missing Images

MetaPost images can be exported by wrapping them in \startimage and \stopimage, then running

mtxrun --script epub --images <tex_file>
context <tex_file>

For instance, the following code

\startimage
    \startMPcode
        fill fullcircle scaled 1cm withcolor blue ;
    \stopMPcode
\stopimage

will export an SVG file with a blue disc, which a web browser can load alongside the HTML or XHTML file. This works with other content too. For instance,

\startimage
    \startformula
        e = \frac{m_0 c^2}{\sqrt{1 - v^2 / c^2}}
    \stopformula
\stopimage

generates an SVG file with the well-known equation.

Note: With ConTeXt LMTX 2024.09.25, using \startimage and \stopimage seems to be unnecessary for block equations.

Missing Background or other elements in the PDF

Some features of ConTeXt are disabled when export is activated, which may lead to missing elements in the PDF, such as page backgrounds. For instance, with ConTeXt LMTX 2024.11.01, compiling the following code

\setupbackend[export=yes]
\setupbackgrounds [page] [background=color,backgroundcolor=blue]
\starttext
test
\stoptext

gives a PDF with white (instead of blue) page color. One possible workaround is to use a mode:

\startmode[export]
\setupbackend[export=yes]
\stopmode
\setupbackgrounds [page] [background=color,backgroundcolor=blue]
\starttext
test
\stoptext

The line \setupbackend[export=yes] can then be toggled on (for generating the exported files) or off (for generating the correct PDF) by compiling with or without the flag --mode=export.

Note: This workaround deals with the missing background color in the PDF output, not in the exported files. To add a background color to the latter, you can load a custom .css file with \setupexport, e.g. \setupexport[cssfile=...].

More TODO

  • handling of images
  • which files get overwritten, which stay

Open Issues

as of 2014-01-20, updated 2015-08-08/-15, 2017-11-11, and 2024-10-22

  • <break/> in raw.xml after the root happens sometimes if something’s not tagged. (added 2024)
  • Metadata is sometimes missing, if it’s set in an environment and not in the main file. (added 2024)
  • FIXED Structure bug: Metadata ends up within the first section instead of in front of everything
  • FIXED Names of metavariables are missing in div.xhtml (could be better, but not that important)
  • Notes (footnotes): Only visual formatting, no semantic markup and no reference/ID
  • FIXED Delimited: Quotations have tagging and quotation marks, even in raw.xml (now marks have their own tags)
  • Firstpage/Lastpage (cover setup) is ignored in project structure
  • Minimal example doesn’t create a cover at all
  • \color leaves no trace in export.
  • Spacing characters like \, get lost.
  • Additional commas in export of register ranges (1,–,5 instead of 1–5).
  • There’s no marking of page breaks (ePub should have page break markers of the PDF for scientific quotability).