Export
TODO: This page is a work in progress. (See: To-Do List)
ConTeXt not only produces beautiful PDFs, it can also export to XML and XHTML. This is especially useful for creating eBooks in ePub format.
Minimal example
% mode=mkiv
\setupbackend[export=yes] % this is all to activate export!
\starttext
\input tufte
\stoptext
Exported structure
If you compile the example above as minimal.tex, you get a directory structure like this:
minimal.tex
minimal.log
minimal.pdf
minimal.tuc
minimal-export
├── cover.xhtml
├── images
├── minimal-div.xhtml
├── minimal-pub.lua
├── minimal-raw.xml
├── minimal-tag.xhtml
└── styles
    ├── minimal-defaults.css
    ├── minimal-images.css
    ├── minimal-styles.css
    └── minimal-templates.css
Below, we refer to these files without the prefix ("minimal-"). The listings have been reformatted slightly to make them shorter and easier to read.
div.xhtml
<?xml version="1.0" encoding="UTF-8" standalone="no" ?> <!-- input filename : minimal processing date : Sat Jan 17 17:43:58 2015 context version : 2014.12.29 10:01 exporter version : 0.33 --> <html xmlns="http://www.w3.org/1999/xhtml" xmlns:math="http://www.w3.org/1998/Math/MathML"> <head> <meta charset="utf-8"/> <title></title> <link type="text/css" rel="stylesheet" href="styles/minimal-defaults.css" /> <link type="text/css" rel="stylesheet" href="styles/minimal-images.css" /> <link type="text/css" rel="stylesheet" href="styles/minimal-styles.css" /> </head> <body> <div class="warning">Rendering can be suboptimal because there is no default/fallback css loaded.</div> <div> We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. </div> </body> </html>
tag.xhtml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> <!-- input filename : minimal processing date : Sat Jan 17 17:43:58 2015 context version : 2014.12.29 10:01 exporter version : 0.33 --> <?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?> <document href="minimal" language="en" date="Sat Jan 17 17:43:58 2015" context="2014.12.29 10:01" xmlns:m="http://www.w3.org/1998/Math/MathML" file="minimal" version="0.33"> We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. </document>
raw.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> <!-- input filename : minimal processing date : Sat Jan 17 17:43:58 2015 context version : 2014.12.29 10:01 exporter version : 0.33 --> <?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?> <document language="en" date="Sat Jan 17 17:43:58 2015" context="2014.12.29 10:01" xmlns:m="http://www.w3.org/1998/Math/MathML" file="minimal" version="0.33"> We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. </document>
pub.lua
return {
 ["htmlfiles"]={
  "minimal-div.xhtml"
 },
 ["htmlroot"]="minimal-div.xhtml",
 ["identifier"]="3ce74458-4cdd-829d-ace4-cf535fb00519",
 ["imagefile"]="styles/minimal-images.css",
 ["imagepath"]="images",
 ["images"]={},
 ["language"]="en",
 ["name"]="minimal",
 ["stylepath"]="styles",
 ["styles"]={
  "minimal-defaults.css",
  "minimal-images.css",
  "minimal-styles.css"
 },
 ["xhtmlfiles"]={
  "minimal-tag.xhtml"
 },
 ["xmlfiles"]={
  "minimal-raw.xml"
 },
}
Required structuring of your ConTeXt code
The export is only useful for content that is "well structured" in an XML sense. In the minimal example above, all text ended up directly in the root element document.
That means you need to mark up everything, from inline spans through paragraphs and list items to chapters and parts, with \start... … \stop... pairs.
Also note that switches like \em don’t translate into output structure. Instead, define a highlight with \definehighlight[emph][style={\em}] and use it as \emph{emphasized}.
Beware: The exported XML reflects the structure of the output, so it lacks some information that might be in your code. For example, \index only creates an empty registerlocation anchor, while the content appears in registerentry structures at the point where you placed your index.
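The points above can be sketched in one small file. This is only an illustration, assuming a current MkIV setup: the chapter titles and the sentence are made up, and \placeindex is assumed to be the command that typesets the register. The paragraph is wrapped in \startparagraph … \stopparagraph, emphasis uses the highlight instead of the \em switch, and the \index call sits inside the paragraph while the register is placed in its own chapter.

```tex
\setupbackend[export=yes]

% a highlight leaves a trace in the export, a bare \em switch does not
\definehighlight[emph][style={\em}]

\starttext

\startchapter[title={Typefaces}]
  \startparagraph
    Hermann Zapf\index{Zapf} designed \emph{many} typefaces.
  \stopparagraph
\stopchapter

\startchapter[title={Index}]
  % the register entries are expected to show up here in the export,
  % not at the \index call above
  \placeindex
\stopchapter

\stoptext
```

Comparing the raw.xml of this file with that of the minimal example should make the effect of the tagging visible; the exact element names may differ between ConTeXt versions.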
More useful example
% mode=mkiv
\mainlanguage[en]
\setupbackend[export=yes]

\setupinteraction[state=start,
  color=,contrastcolor=,
  % This metadata is used for the PDF
  title={My first eBook 1},
  subtitle={},
  keywords={},
  author={Hans 1}
]

\setupexport[
  hyphen=yes,
  % This metadata is used by ConTeXt’s ePub script
  % title, subtitle and author are taken from \setupinteraction, if not set
  title={My first eBook 2},
  subtitle={},
  author={Hans 2}
]

\settaggedmetadata[
  % here you can set as many metadata entries as you like, but you need to process them yourself
  title={My first eBook 3},
  author={Hans 3},
  subtitle={},
  version={\date} % TODO: doesn’t expand
]

\definehighlight[emph][style=italicface] % use \emph{something} instead of {\em something}

\starttext

\startchapter[title=Example]
\startparagraph
\input tufte
\stopparagraph

\startsection[title={A section}]
\startparagraph
\input tufte
\startitemize[packed,joinup]
\startitem First \stopitem
\startitem Second \stopitem
\startitem Third \stopitem
\startitem Fourth\stopitem
\stopitemize
\stopparagraph

\startparagraph
\input knuth
\stopparagraph

\startparagraph
\input zapf
\stopparagraph
\stopsection
\stopchapter

\startchapter[title=Quoth\footnote{by Edgar Allan Poe}]
\startlines
\quotation{Prophet!} said I, \quotation{thing of evil!—prophet still, if bird or devil!
By that Heaven that bends above us—by that God we both adore—
Tell this soul with sorrow laden if, within the distant Aidenn,
It shall clasp a sainted maiden whom the angels name Lenore—
Clasp a rare and radiant maiden whom the angels name Lenore.}
\emph{Quoth the Raven \quotation{Nevermore.}}
\stoplines
\stopchapter

\stoptext
There’s also an example of an export-friendly ConTeXt file in the sources: export-example.tex.
Choice of output files
Only after such tagging do we find significant differences between the three content output files:
TODO: explain differences between export variants
raw.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> <!-- input filename : minimal processing date : Sat Jan 17 19:42:37 2015 context version : 2014.12.29 10:01 exporter version : 0.33 --> <?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?> <document xmlns:m="http://www.w3.org/1998/Math/MathML" title="My first eBook 1" version="0.33" author="{Hans 1} " context="2014.12.29 10:01" date="Sat Jan 17 19:42:37 2015" language="en" file="minimal"> <section detail="chapter" chain="chapter" level="2"> <metadata> <metavariable name="author">Hans 3</metavariable> <metavariable name="subtitle"></metavariable> <metavariable name="title">My first eBook 3</metavariable> <metavariable name="version">\date </metavariable> </metadata> <sectionnumber>1</sectionnumber> <sectiontitle>Example</sectiontitle> <sectioncontent> <paragraph>We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.</paragraph> <section detail="section" chain="section" level="3"> <sectionnumber>1.1</sectionnumber> <sectiontitle>A section</sectiontitle> <sectioncontent> <paragraph>We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, 
merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. <itemgroup detail="itemize" chain="itemize" packed="yes" symbol="1" level="1"><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>First</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Second</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Third</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Fourth</itemcontent></item></itemgroup> </paragraph> <paragraph>Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large--scale user; the designer should also write the first user manual. <break/> The separation of any of these four components would have hurt TEX significantly. If I had not participated fully in all these activities, literally hundreds of improvements would never have been made, because I would never have thought of them or perceived why they were important. <break/> But a system cannot be successful if it is too strongly influenced by a single person. 
Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments.</paragraph> <paragraph>Coming back to the use of typefaces in electronic publishing: many of the new typographers receive their knowledge and information about the rules of typography from books, from computer magazines or the instruction manuals which they get with the purchase of a PC or software. There is not so much basic instruction, as of now, as there was in the old days, showing the differences between good and bad typographic design. Many people are just fascinated by their PC’s tricks, and think that a widely--praised program, called up on the screen, will make everything automatic from now on.</paragraph> </sectioncontent> </section> </sectioncontent> </section> <section detail="chapter" chain="chapter" level="2"> <sectionnumber>2</sectionnumber> <sectiontitle>Quoth<descriptionsymbol detail="footnote"><sup>1</sup></descriptionsymbol></sectiontitle> <sectioncontent> <lines detail="lines" chain="lines"> <line><delimited detail="quotation-1">“Prophet!”</delimited> said I, <delimited detail="quotation-1">“thing of evil!—prophet still, if bird or devil!”</delimited><line>By that Heaven that bends above us—by that God we both adore—</line><line>Tell this soul with sorrow laden if, within the distant Aidenn,</line><line>It shall clasp a sainted maiden whom the angels name Lenore—</line><line>Clasp a rare and radiant maiden whom the angels name Lenore.</line></line> <line><highlight detail="emph">Quoth the Raven <delimited detail="quotation-1">“Nevermore.”</delimited></highlight></line> </lines> </sectioncontent> <description detail="footnote" chain="footnote"> <descriptiontag><sup>1</sup> </descriptiontag> <descriptioncontent>by Edgar Allan Poe</descriptioncontent> </description> </section> </document> <break/>
tag.xhtml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> <!-- input filename : minimal processing date : Sat Jan 17 19:42:37 2015 context version : 2014.12.29 10:01 exporter version : 0.33 --> <?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?> <?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?> <document title="My first eBook 1" version="0.33" context="2014.12.29 10:01" href="minimal" author="{Hans 1} " xmlns:m="http://www.w3.org/1998/Math/MathML" file="minimal" language="en" date="Sat Jan 17 19:42:37 2015"> <section chain="chapter" detail="chapter" level="2"> <metadata> <metavariable name="author">Hans 3</metavariable> <metavariable name="subtitle"/> <metavariable name="title">My first eBook 3</metavariable> <metavariable name="version">\date </metavariable> </metadata> <sectionnumber>1</sectionnumber> <sectiontitle>Example</sectiontitle> <sectioncontent> <paragraph>We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.</paragraph> <section chain="section" detail="section" level="3"> <sectionnumber>1.1</sectionnumber> <sectiontitle>A section</sectiontitle> <sectioncontent> <paragraph>We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, 
merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. <itemgroup detail="itemize" symbol="1" chain="itemize" packed="yes" level="1"><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>First</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Second</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Third</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Fourth</itemcontent></item></itemgroup> </paragraph> <paragraph>Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large--scale user; the designer should also write the first user manual. <break/> The separation of any of these four components would have hurt TEX significantly. If I had not participated fully in all these activities, literally hundreds of improvements would never have been made, because I would never have thought of them or perceived why they were important. <break/> But a system cannot be successful if it is too strongly influenced by a single person. 
Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments.</paragraph> <paragraph>Coming back to the use of typefaces in electronic publishing: many of the new typographers receive their knowledge and information about the rules of typography from books, from computer magazines or the instruction manuals which they get with the purchase of a PC or software. There is not so much basic instruction, as of now, as there was in the old days, showing the differences between good and bad typographic design. Many people are just fascinated by their PC’s tricks, and think that a widely--praised program, called up on the screen, will make everything automatic from now on.</paragraph> </sectioncontent> </section> </sectioncontent> </section> <section chain="chapter" detail="chapter" level="2"> <sectionnumber>2</sectionnumber> <sectiontitle>Quoth<descriptionsymbol detail="footnote"><sup>1</sup></descriptionsymbol></sectiontitle> <sectioncontent> <lines chain="lines" detail="lines"> <line><delimited detail="quotation-1">“Prophet!”</delimited> said I, <delimited detail="quotation-1">“thing of evil!—prophet still, if bird or devil!”</delimited><line>By that Heaven that bends above us—by that God we both adore—</line><line>Tell this soul with sorrow laden if, within the distant Aidenn,</line><line>It shall clasp a sainted maiden whom the angels name Lenore—</line><line>Clasp a rare and radiant maiden whom the angels name Lenore.</line></line> <line><highlight detail="emph">Quoth the Raven <delimited detail="quotation-1">“Nevermore.”</delimited></highlight></line> </lines> </sectioncontent> <description chain="footnote" detail="footnote"> <descriptiontag><sup>1</sup> </descriptiontag> <descriptioncontent>by Edgar Allan Poe</descriptioncontent> </description> </section> </document>
div.xhtml
<?xml version="1.0" encoding="UTF-8" standalone="no" ?> <!-- input filename : minimal processing date : Sat Jan 17 19:42:37 2015 context version : 2014.12.29 10:01 exporter version : 0.33 --> <html xmlns="http://www.w3.org/1999/xhtml" xmlns:math="http://www.w3.org/1998/Math/MathML"> <head> <meta charset="utf-8"/> <title>My first eBook 1</title> <link type="text/css" rel="stylesheet" href="styles/minimal-defaults.css" /> <link type="text/css" rel="stylesheet" href="styles/minimal-images.css" /> <link type="text/css" rel="stylesheet" href="styles/minimal-styles.css" /> </head> <body> <div class="warning">Rendering can be suboptimal because there is no default/fallback css loaded.</div> <div> <div class="section chapter level-2"> <div class="metadata"> <div class="metavariable">Hans 3</div> <div class="metavariable"><!--empty--> </div> <div class="metavariable">My first eBook 3</div> <div class="metavariable">\date </div> </div> <div class="sectionnumber">1</div> <div class="sectiontitle">Example</div> <div class="sectioncontent"> <div class="paragraph">We thrive in information--thick worlds because of our marvelous and everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats.</div> <div class="section level-3"> <div class="sectionnumber">1.1</div> <div class="sectiontitle">A section</div> <div class="sectioncontent"> <div class="paragraph">We thrive in information--thick worlds because of our marvelous and 
everyday capacity to select, edit, single out, structure, highlight, group, pair, merge, harmonize, synthesize, focus, organize, condense, reduce, boil down, choose, categorize, catalog, classify, list, abstract, scan, look into, idealize, isolate, discriminate, distinguish, screen, pigeonhole, pick over, sort, integrate, blend, inspect, filter, lump, skip, smooth, chunk, average, approximate, cluster, aggregate, outline, summarize, itemize, review, dip into, flip through, browse, glance into, leaf through, skim, refine, enumerate, glean, synopsize, winnow the wheat from the chaff and separate the sheep from the goats. <div class="itemgroup itemize symbol-1 packed-yes level-1"><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">First</div></div><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">Second</div></div><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">Third</div></div><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">Fourth</div></div></div> </div> <div class="paragraph">Thus, I came to the conclusion that the designer of a new system must not only be the implementer and first large--scale user; the designer should also write the first user manual. <div class="break"><!--empty--> </div> The separation of any of these four components would have hurt TEX significantly. If I had not participated fully in all these activities, literally hundreds of improvements would never have been made, because I would never have thought of them or perceived why they were important. 
<div class="break"><!--empty--> </div> But a system cannot be successful if it is too strongly influenced by a single person. Once the initial design is complete and fairly robust, the real test begins as people with many different viewpoints undertake their own experiments.</div> <div class="paragraph">Coming back to the use of typefaces in electronic publishing: many of the new typographers receive their knowledge and information about the rules of typography from books, from computer magazines or the instruction manuals which they get with the purchase of a PC or software. There is not so much basic instruction, as of now, as there was in the old days, showing the differences between good and bad typographic design. Many people are just fascinated by their PC’s tricks, and think that a widely--praised program, called up on the screen, will make everything automatic from now on.</div> </div> </div> </div> </div> <div class="section chapter level-2"> <div class="sectionnumber">2</div> <div class="sectiontitle">Quoth<div class="descriptionsymbol footnote"><div class="sup">1</div></div></div> <div class="sectioncontent"> <div class="lines"> <div class="line"><div class="delimited quotation-1">“Prophet!”</div> said I, <div class="delimited quotation-1">“thing of evil!—prophet still, if bird or devil!”</div><div class="line">By that Heaven that bends above us—by that God we both adore—</div><div class="line">Tell this soul with sorrow laden if, within the distant Aidenn,</div><div class="line">It shall clasp a sainted maiden whom the angels name Lenore—</div><div class="line">Clasp a rare and radiant maiden whom the angels name Lenore.</div></div> <div class="line"><div class="highlight emph">Quoth the Raven <div class="delimited quotation-1">“Nevermore.”</div></div></div> </div> </div> <div class="description footnote"> <div class="descriptiontag"><div class="sup">1</div> </div> <div class="descriptioncontent">by Edgar Allan Poe</div> </div> </div> </div> </body> 
</html>
(WORK IN PROGRESS)
Export options
From back-exp.mkiv:
\setupexport[
  align=\raggedstatus,
  bodyfont=\bodyfontsize,
  width=\textwidth,
  title={},      % from interaction
  subtitle={},   % from interaction
  author={},     % from interaction
  firstpage=,    % imagename
  lastpage=,     % imagename
  alternative=,  % html or div
  properties=no,
  hyphen=no,
  svgstyle=,
  cssfile=,
]
- The options align, bodyfont and width end up in the exported CSS.
- title, subtitle and author default to those from \setupinteraction
- firstpage, lastpage: cover images (they end up only in pub.lua and are handled by the ePub script)
- hyphen: yes/no; include invisible hyphenation characters (soft hyphen, U+00AD) at every possible place
- alternative: html or div (influence on html export style? where?)
- properties: no: ignore, yes: as attribute, otherwise: use as prefix (used where?)
- svgstyle: maybe compression?
- cssfile: file name of additional CSS file
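As a rough sketch of how a few of these options might be combined (the stylesheet name mybook.css and the title and author values are hypothetical, and the remaining keys are left at their defaults):

```tex
\setupbackend[export=yes]

\setupexport[
  title={My first eBook},  % otherwise taken from \setupinteraction
  author={Hans},
  hyphen=yes,              % write soft hyphens (U+00AD) into the export
  cssfile=mybook.css,      % hypothetical name of an additional CSS file
]

\starttext
\startchapter[title={Example}]
  \startparagraph
    \input tufte
  \stopparagraph
\stopchapter
\stoptext
```

Whether the additional stylesheet is copied into the styles directory or only referenced is not covered here and is best checked in the generated pub.lua and XHTML files.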
Solutions
Missing Hyphens
Problem: If you set up export, hyphens go missing.
Reason: Your font is missing a soft hyphen glyph at U+00AD.
Workaround:
\enabledirectives[otf.checksofthyphen]
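A minimal sketch of where the directive could go, under the assumption that it has to be enabled before the body font is loaded (the font choice pagella is only an example and not part of the original report):

```tex
% enable the check early, before any font is set up (assumption)
\enabledirectives[otf.checksofthyphen]

\setupbackend[export=yes]
\setupbodyfont[pagella,11pt]  % example font, replace with the affected one

\starttext
\startparagraph
  \input tufte
\stopparagraph
\stoptext
```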
Suppressing Presentation Forms in Arabic Script Fonts
Arabic script requires contextual analysis: a character takes a different form depending on its position within a word or other continuous string of characters. Certain pre-Unicode conventions encoded each of these forms separately for use in legacy typesetting applications. Those encodings are preserved in Unicode as the Arabic Presentation Forms-B block (U+FE70–U+FEFF). But this is a legacy encoding: these codepoints should never be used to prepare fresh Unicode documents. Rather, Arabic script characters should be encoded primarily using codepoints from the standard Arabic block (U+0600–U+06FF). Contextual forms of these characters are selected during OpenType processing, but they do not have separate codepoints.
Unfortunately, certain Arabic fonts such as Linotype Lotus, a staple of the Middle East publishing industry, give the contextual forms of standard Arabic-script characters Unicode names that correspond to codepoints from Arabic Presentation Forms-B. This saves space in the font, and for some purposes it is innocuous. But in ConTeXt processing, for example, many of the original, standard codepoints in the input are replaced by presentation-form codepoints in the output. For example:
مِّنَ السَّمَاءِ وَالْأَرْضِ
becomes
ﻣﱢﻦَ اﻟﺴﱠﻤَﺎءِ وَاﻷَْرْضِ
If you look carefully, you will see that there are errors in the output. (If you use the font almfixed in a text editor that supports Unicode and Arabic script, you will easily see the differences.)
When exporting to XML, this issue becomes a serious problem, because the exported text will not use the same codepoints as the input. To get around this issue, use something along the lines of the following in your preamble:
% private typescript that combines TeX Gyre Termes for Latin with Linotype Lotus for Arabic
\usetypescriptfile[type-times-lotus]
\usetypescript[times-lotus]

% choose your desired features for pdf output
\definefontfeature
  [lotus-default]
  [mode=node,language=dflt,script=arab,
   init=yes,medi=yes,fina=yes,isol=yes,
   liga=no,rlig=yes,trep=yes,tlig=yes,
   mark=yes,ccmp=yes]

% use this in export mode
\definefontfeature
  [lotus-default]
  [mode=none]

% setup the bodyfont last; the order is important!
\setupbodyfont[times-lotus,12pt]
So mode=none will suppress the contextual analysis, bypass the presentation-forms codepoints, and give the original input in the exported XML.
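Since the second \definefontfeature call always overrides the first when both are active, one way to switch between them is a ConTeXt mode. The following is a sketch only: the mode name export is our own choice, and the typescript is the private one from above.

```tex
% PDF run:     context myfile.tex
% export run:  context --mode=export myfile.tex
\usetypescriptfile[type-times-lotus]
\usetypescript[times-lotus]

\startnotmode[export]
  % normal PDF run: full contextual shaping
  \definefontfeature
    [lotus-default]
    [mode=node,language=dflt,script=arab,
     init=yes,medi=yes,fina=yes,isol=yes,
     liga=no,rlig=yes,trep=yes,tlig=yes,
     mark=yes,ccmp=yes]
\stopnotmode

\startmode[export]
  % export run: no shaping, so the XML keeps the input codepoints
  \setupbackend[export=yes]
  \definefontfeature
    [lotus-default]
    [mode=none]
\stopmode

% setup the bodyfont last; the order is important!
\setupbodyfont[times-lotus,12pt]
```

The PDF produced by the export run will show unshaped Arabic, so it is only a by-product; the PDF meant for reading comes from the run without the mode.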
Missing Images
MetaPost images can be exported by wrapping them in \startimage and \stopimage, then running
mtxrun --script epub --images <tex_file>
context <tex_file>
For instance, the following code
\startimage
\startMPcode
fill fullcircle scaled 1cm withcolor blue ;
\stopMPcode
\stopimage
will export an SVG file with a blue disc, which a web browser can load alongside the html or xhtml file.
This works with other content too. For instance,
\startimage
\startformula
e = \frac{m_0 c^2}{\sqrt{1 - v^2 / c^2}}
\stopformula
\stopimage
generates an SVG file with the well-known equation.
More TODO
- handling of images
- which files get overwritten, which stay
Open Issues
as of 2014-01-20, updated 2015-08-08/-15 and 2017-11-11
- FIXED: Structure bug: Metadata ends up within the first section instead of in front of everything
- FIXED: Names of metavariables are missing in div.xhtml (could be better, but not that important)
- Notes (footnotes): Only visual formatting, no semantical markup and no reference/ID
- FIXED: Delimited: Quotations have tagging and quotation marks, even in raw.xml (now marks have their own tags)
- Firstpage/Lastpage (cover setup) is ignored in project structure
- Minimal example doesn’t create a cover at all
- \color leaves no trace in export.
- Spacing characters like \, get lost.
- Additional commas in export of register ranges (1,–,5 instead of 1–5).
- There’s no marking of page breaks (ePub should have page break markers of the PDF for scientific quotability).