Changes

17,528 bytes added ,  09:14, 6 March 2020
hint about export from output, minor enhancements
That means you need to mark up ''everything'', from inline spans through paragraphs and enumeration items up to chapters and parts, with {{code|1=\start... … \stop...}}.
Also note that switches like {{cmd|em}} don’t translate into output structure; you need to {{cmd|definehighlight|2=[emph][style=italic]}} and use it as {{code|1=\emph{emphasized}.}} '''Beware: The exported XML contains the structure of the ''output'', i.e. it lacks some information that might be in your code.''' E.g. {{cmd|index}} creates an empty <tt>registerlocation</tt> anchor, while the content appears in <tt>registerentry</tt> structures where you placed your index.
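
A minimal sketch of fully structured input follows. It assumes a current MkIV installation; the chapter title and the text are placeholders, and the highlight name <tt>emph</tt> is the one defined above.

<texcode>
\setupbackend[export=yes]

% a named highlight is exported as a highlight element with detail "emph";
% a bare {\em ...} switch does not translate into output structure
\definehighlight[emph][style=italic]

\starttext
\startchapter[title={Example}]
  \startparagraph
    Some \emph{emphasized} text inside an explicit paragraph.
  \stopparagraph
  \startitemize
    \startitem First \stopitem
    \startitem Second \stopitem
  \stopitemize
\stopchapter
\stoptext
</texcode>

Every unit here is wrapped in an explicit start/stop pair, so the exporter does not have to guess paragraph or item boundaries.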
= More useful example =
]
\definehighlight[emph][style=italic] % use \emph{something} instead of {\em something}
\starttext
<xmlcode>
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
 
<!--
 
input filename : minimal
processing date : Sat Jan 17 19:42:37 2015
context version : 2014.12.29 10:01
exporter version : 0.33
 
-->
<?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?>
 
<document xmlns:m="http://www.w3.org/1998/Math/MathML" title="My first eBook 1" version="0.33" author="{Hans 1} " context="2014.12.29 10:01" date="Sat Jan 17 19:42:37 2015" language="en" file="minimal">
<section detail="chapter" chain="chapter" level="2">
</document>
<break/>
</xmlcode>
 
== tag.xhtml ==
 
<xmlcode>
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!--
input filename : minimal
processing date : Sat Jan 17 19:42:37 2015
context version : 2014.12.29 10:01
exporter version : 0.33
-->
<?xml-stylesheet type="text/css" href="styles/minimal-defaults.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-images.css" ?>
<?xml-stylesheet type="text/css" href="styles/minimal-styles.css" ?>
 
<document title="My first eBook 1" version="0.33" context="2014.12.29 10:01" href="minimal" author="{Hans 1} " xmlns:m="http://www.w3.org/1998/Math/MathML" file="minimal" language="en" date="Sat Jan 17 19:42:37 2015">
<section chain="chapter" detail="chapter" level="2">
<metadata>
<metavariable name="author">Hans 3</metavariable>
<metavariable name="subtitle"/>
<metavariable name="title">My first eBook 3</metavariable>
<metavariable name="version">\date </metavariable>
</metadata>
<sectionnumber>1</sectionnumber>
<sectiontitle>Ex­am­ple</sectiontitle>
<sectioncontent>
<paragraph>We thrive in in­for­ma­tion--thick worlds be­cause of our mar­velous and every­day ca­pac­ity to se­lect, edit, sin­gle out, struc­ture, high­light, group, pair, merge, har­mo­nize, syn­the­size, fo­cus, or­ga­nize, con­dense, re­duce, boil down, choose, cat­e­go­rize, cat­a­log, clas­sify, list, ab­stract, scan, look into, ide­al­ize, iso­late, dis­crim­i­nate, dis­tin­guish, screen, pi­geon­hole, pick over, sort, in­te­grate, blend, in­spect, fil­ter, lump, skip, smooth, chunk, av­er­age, ap­prox­i­mate, clus­ter, ag­gre­gate, out­line, sum­ma­rize, item­ize, re­view, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enu­mer­ate, glean, syn­op­size, win­now the wheat from the chaff and sep­a­rate the sheep from the goats.</paragraph>
<section chain="section" detail="section" level="3">
<sectionnumber>1.1</sectionnumber>
<sectiontitle>A sec­tion</sectiontitle>
<sectioncontent>
<paragraph>We thrive in in­for­ma­tion--thick worlds be­cause of our mar­velous and every­day ca­pac­ity to se­lect, edit, sin­gle out, struc­ture, high­light, group, pair, merge, har­mo­nize, syn­the­size, fo­cus, or­ga­nize, con­dense, re­duce, boil down, choose, cat­e­go­rize, cat­a­log, clas­sify, list, ab­stract, scan, look into, ide­al­ize, iso­late, dis­crim­i­nate, dis­tin­guish, screen, pi­geon­hole, pick over, sort, in­te­grate, blend, in­spect, fil­ter, lump, skip, smooth, chunk, av­er­age, ap­prox­i­mate, clus­ter, ag­gre­gate, out­line, sum­ma­rize, item­ize, re­view, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enu­mer­ate, glean, syn­op­size, win­now the wheat from the chaff and sep­a­rate the sheep from the goats. <itemgroup detail="itemize" symbol="1" chain="itemize" packed="yes" level="1"><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>First</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Sec­ond</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Third</itemcontent></item><item><itemtag><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></itemtag><itemcontent>Fourth</itemcontent></item></itemgroup> </paragraph>
<paragraph>Thus, I came to the con­clu­sion that the de­signer of a new sys­tem must not only be the im­ple­menter and first large--scale user; the de­signer should also write the first user man­ual. <break/>
The sep­a­ra­tion of any of these four com­po­nents would have hurt TEX sig­nif­i­cantly. If I had not par­tic­i­pated fully in all these ac­tiv­i­ties, lit­er­ally hun­dreds of im­prove­ments would never have been made, be­cause I would never have thought of them or per­ceived why they were im­por­tant. <break/>
But a sys­tem can­not be suc­cess­ful if it is too strongly in­flu­enced by a sin­gle per­son. Once the ini­tial de­sign is com­plete and fairly ro­bust, the real test be­gins as peo­ple with many dif­fer­ent view­points un­der­take their own ex­per­i­ments.</paragraph>
<paragraph>Com­ing back to the use of type­faces in elec­tronic pub­lish­ing: many of the new ty­pog­ra­phers re­ceive their knowl­edge and in­for­ma­tion about the rules of ty­pog­ra­phy from books, from com­puter mag­a­zines or the in­struc­tion man­u­als which they get with the pur­chase of a PC or soft­ware. There is not so much ba­sic in­struc­tion, as of now, as there was in the old days, show­ing the dif­fer­ences be­tween good and bad ty­po­graphic de­sign. Many peo­ple are just fas­ci­nated by their PC’s tricks, and think that a widely--praised pro­gram, called up on the screen, will make every­thing au­to­matic from now on.</paragraph>
</sectioncontent>
</section>
</sectioncontent>
</section>
<section chain="chapter" detail="chapter" level="2">
<sectionnumber>2</sectionnumber>
<sectiontitle>Quoth<descriptionsymbol detail="footnote"><sup>1</sup></descriptionsymbol></sectiontitle>
<sectioncontent>
<lines chain="lines" detail="lines">
<line><delimited detail="quotation-1">“Prophet!”</delimited> said I, <delimited detail="quotation-1">“thing of evil!—prophet still, if bird or devil!”</delimited><line>By that Heaven that bends above us—by that God we both adore—</line><line>Tell this soul with sor­row laden if, within the dis­tant Aidenn,</line><line>It shall clasp a sainted maiden whom the an­gels name Lenore—</line><line>Clasp a rare and ra­di­ant maiden whom the an­gels name Lenore.</line></line>
<line><highlight detail="emph">Quoth the Raven <delimited detail="quotation-1">“Nev­er­more.”</delimited></highlight></line>
</lines>
</sectioncontent>
<description chain="footnote" detail="footnote">
<descriptiontag><sup>1</sup> </descriptiontag>
<descriptioncontent>by Edgar Al­lan Poe</descriptioncontent>
</description>
</section>
</document>
</xmlcode>
 
== div.xhtml ==
<xmlcode>
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!--
input filename : minimal
processing date : Sat Jan 17 19:42:37 2015
context version : 2014.12.29 10:01
exporter version : 0.33
-->
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:math="http://www.w3.org/1998/Math/MathML">
<head>
<meta charset="utf-8"/>
<title>My first eBook 1</title>
<link type="text/css" rel="stylesheet" href="styles/minimal-defaults.css" />
<link type="text/css" rel="stylesheet" href="styles/minimal-images.css" />
<link type="text/css" rel="stylesheet" href="styles/minimal-styles.css" />
</head>
<body>
<div class="warning">Rendering can be suboptimal because there is no default/fallback css loaded.</div>
<div>
<div class="section chapter level-2">
<div class="metadata">
<div class="metavariable">Hans 3</div>
<div class="metavariable"><!--empty--></div>
<div class="metavariable">My first eBook 3</div>
<div class="metavariable">\date </div>
</div>
<div class="sectionnumber">1</div>
<div class="sectiontitle">Ex­am­ple</div>
<div class="sectioncontent">
<div class="paragraph">We thrive in in­for­ma­tion--thick worlds be­cause of our mar­velous and every­day ca­pac­ity to se­lect, edit, sin­gle out, struc­ture, high­light, group, pair, merge, har­mo­nize, syn­the­size, fo­cus, or­ga­nize, con­dense, re­duce, boil down, choose, cat­e­go­rize, cat­a­log, clas­sify, list, ab­stract, scan, look into, ide­al­ize, iso­late, dis­crim­i­nate, dis­tin­guish, screen, pi­geon­hole, pick over, sort, in­te­grate, blend, in­spect, fil­ter, lump, skip, smooth, chunk, av­er­age, ap­prox­i­mate, clus­ter, ag­gre­gate, out­line, sum­ma­rize, item­ize, re­view, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enu­mer­ate, glean, syn­op­size, win­now the wheat from the chaff and sep­a­rate the sheep from the goats.</div>
<div class="section level-3">
<div class="sectionnumber">1.1</div>
<div class="sectiontitle">A sec­tion</div>
<div class="sectioncontent">
<div class="paragraph">We thrive in in­for­ma­tion--thick worlds be­cause of our mar­velous and every­day ca­pac­ity to se­lect, edit, sin­gle out, struc­ture, high­light, group, pair, merge, har­mo­nize, syn­the­size, fo­cus, or­ga­nize, con­dense, re­duce, boil down, choose, cat­e­go­rize, cat­a­log, clas­sify, list, ab­stract, scan, look into, ide­al­ize, iso­late, dis­crim­i­nate, dis­tin­guish, screen, pi­geon­hole, pick over, sort, in­te­grate, blend, in­spect, fil­ter, lump, skip, smooth, chunk, av­er­age, ap­prox­i­mate, clus­ter, ag­gre­gate, out­line, sum­ma­rize, item­ize, re­view, dip into, flip through, browse, glance into, leaf through, skim, re­fine, enu­mer­ate, glean, syn­op­size, win­now the wheat from the chaff and sep­a­rate the sheep from the goats. <div class="itemgroup itemize symbol-1 packed-yes level-1"><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">First</div></div><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">Sec­ond</div></div><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">Third</div></div><div class="item"><div class="itemtag"><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" display="inline"><m:mo>•</m:mo></m:math></div><div class="itemcontent">Fourth</div></div></div> </div>
<div class="paragraph">Thus, I came to the con­clu­sion that the de­signer of a new sys­tem must not only be the im­ple­menter and first large--scale user; the de­signer should also write the first user man­ual. <div class="break"><!--empty--></div>
The sep­a­ra­tion of any of these four com­po­nents would have hurt TEX sig­nif­i­cantly. If I had not par­tic­i­pated fully in all these ac­tiv­i­ties, lit­er­ally hun­dreds of im­prove­ments would never have been made, be­cause I would never have thought of them or per­ceived why they were im­por­tant. <div class="break"><!--empty--></div>
But a sys­tem can­not be suc­cess­ful if it is too strongly in­flu­enced by a sin­gle per­son. Once the ini­tial de­sign is com­plete and fairly ro­bust, the real test be­gins as peo­ple with many dif­fer­ent view­points un­der­take their own ex­per­i­ments.</div>
<div class="paragraph">Com­ing back to the use of type­faces in elec­tronic pub­lish­ing: many of the new ty­pog­ra­phers re­ceive their knowl­edge and in­for­ma­tion about the rules of ty­pog­ra­phy from books, from com­puter mag­a­zines or the in­struc­tion man­u­als which they get with the pur­chase of a PC or soft­ware. There is not so much ba­sic in­struc­tion, as of now, as there was in the old days, show­ing the dif­fer­ences be­tween good and bad ty­po­graphic de­sign. Many peo­ple are just fas­ci­nated by their PC’s tricks, and think that a widely--praised pro­gram, called up on the screen, will make every­thing au­to­matic from now on.</div>
</div>
</div>
</div>
</div>
<div class="section chapter level-2">
<div class="sectionnumber">2</div>
<div class="sectiontitle">Quoth<div class="descriptionsymbol footnote"><div class="sup">1</div></div></div>
<div class="sectioncontent">
<div class="lines">
<div class="line"><div class="delimited quotation-1">“Prophet!”</div> said I, <div class="delimited quotation-1">“thing of evil!—prophet still, if bird or devil!”</div><div class="line">By that Heaven that bends above us—by that God we both adore—</div><div class="line">Tell this soul with sor­row laden if, within the dis­tant Aidenn,</div><div class="line">It shall clasp a sainted maiden whom the an­gels name Lenore—</div><div class="line">Clasp a rare and ra­di­ant maiden whom the an­gels name Lenore.</div></div>
<div class="line"><div class="highlight emph">Quoth the Raven <div class="delimited quotation-1">“Nev­er­more.”</div></div></div>
</div>
</div>
<div class="description footnote">
<div class="descriptiontag"><div class="sup">1</div> </div>
<div class="descriptioncontent">by Edgar Al­lan Poe</div>
</div>
</div>
</div>
</body>
</html>
</xmlcode>
= Export options =
From {{src|back-exp.mkiv}} (is this still current?):
<texcode>
\setupexport[
bodyfont=\bodyfontsize,
width=\textwidth,
title={\directinteractionparameter\c!title}, % from interaction
subtitle={\directinteractionparameter\c!subtitle}, % from interaction
author={\directinteractionparameter\c!author}, % from interaction
% firstpage=, % imagename
% lastpage=, % imagename
alternative=, % html or div
properties=no,
hyphen=no,
svgstyle=,
cssfile=,
]
</texcode>
* The options align, bodyfont and width end up in the exported CSS.
* title, subtitle and author default to those from {{cmd|setupinteraction}}
* firstpage, lastpage: cover image? (how? These end up only in pub.lua and are handled by the ePub script)
* hyphen: yes/no; include invisible hyphenation characters ([http://en.wikipedia.org/wiki/Soft_hyphen soft hyphen], {{code|1=&amp;shy;}}) at every possible place
* alternative: html or div (influence on html export style? where?)
* properties: no: ignore, yes: as attribute, otherwise: use as prefix (used where?)
* svgstyle: maybe compression?
* cssfile: file name of ''additional'' CSS file
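
As a hedged usage sketch, a document preamble could override a few of these keys as shown below. The title, author and the file name <tt>mystyle.css</tt> are made-up placeholders, and whether every key is still honoured by your ConTeXt version has not been checked here.

<texcode>
\setupbackend[export=yes]

% title and author are picked up by the exporter from the interaction setup
\setupinteraction[
title={My first eBook},
author={Hans},
]

\setupexport[
hyphen=yes, % keep soft hyphens at every possible break point
cssfile=mystyle.css, % hypothetical additional stylesheet
]
</texcode>

Keys that are not set keep the defaults listed above.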
= Solutions =
 
==Missing Hyphens==
 
Problem: If you set up export, hyphens go missing.
 
Reason: Your font is missing a soft hyphen glyph at U+00AD.
 
Workaround:
 
\enabledirectives[otf.checksofthyphen]
 
 
== Suppressing Presentation Forms in Arabic Script Fonts ==
 
As is well known, Arabic script requires contextual analysis: a character takes a different form depending on its position within a word or other continuous string of characters. Certain pre-Unicode conventions encoded each of these forms separately for use in older typesetting applications. Those encodings are preserved in the Arabic Presentation Forms-B block of Unicode (U+FE70–U+FEFF). But this is a legacy encoding: these codepoints should <em>never</em> be used to prepare fresh Unicode documents. Rather, Arabic script characters should be encoded primarily using codepoints from the standard Arabic block (U+0600–U+06FF). Contextual forms of these characters are selected during OpenType processing, but they do not have separate codepoints.
 
Unfortunately, certain Arabic fonts such as Linotype Lotus (a staple of the Middle East publishing industry) give the contextual forms of standard Arabic-script characters Unicode names that correspond to codepoints from Arabic Presentation Forms-B. This saves space in the font, and for some purposes it is innocuous. But in ConTeXt processing, for example, many of the original, standard codepoints in the input are replaced by presentation-form codepoints in the output. For example:
 
<texcode>مِّنَ السَّمَاءِ وَالْأَرْضِ</texcode>
 
becomes
 
<texcode>ﻣﱢﻦَ اﻟﺴﱠﻤَﺎءِ وَاﻷَْرْضِ</texcode>
 
If you look carefully, you will see that there are errors in the output. (If you use the font almfixed in a text editor that supports Unicode and Arabic script, you will easily see the differences.)
 
When exporting to XML, this issue becomes a serious problem, for the exported text will not use the same codepoints as the input. To get around this issue, use something along the lines of the following in your preamble:
 
<texcode>
% private typescript that combines TeX Gyre Termes for Latin with Linotype Lotus for Arabic
\usetypescriptfile[type-times-lotus]
\usetypescript[times-lotus]
 
% choose your desired features for pdf output
\definefontfeature
[lotus-default]
[mode=node,language=dflt,script=arab,
init=yes,medi=yes,fina=yes,isol=yes,
liga=no,rlig=yes,trep=yes,tlig=yes,
mark=yes,ccmp=yes]
 
% use this in export mode
 
\definefontfeature
[lotus-default]
[mode=none]
 
% setup the bodyfont last; the order is important!
\setupbodyfont[times-lotus,12pt]
</texcode>
 
So <tt>mode=none</tt> will suppress the contextual analysis, bypass the presentation-forms codepoints, and give the original input in the exported XML.
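
One way to avoid editing the preamble between PDF and export runs is to wrap the export-only definition in a mode. The following is a sketch under assumptions: the mode name <tt>export</tt> is a user-chosen name rather than a built-in one, and the typescript lines from the preamble above are assumed to stay in place before this fragment.

<texcode>
% full OpenType processing for PDF output
\definefontfeature
[lotus-default]
[mode=node,language=dflt,script=arab,
init=yes,medi=yes,fina=yes,isol=yes,
liga=no,rlig=yes,trep=yes,tlig=yes,
mark=yes,ccmp=yes]

% "export" is a hypothetical user mode; enable it with
%   context --mode=export myfile.tex
\startmode[export]
\setupbackend[export=yes]
\definefontfeature
[lotus-default]
[mode=none]
\stopmode

% set up the bodyfont last; the order is important
\setupbodyfont[times-lotus,12pt]
</texcode>

Without the mode, the run produces the PDF with contextual shaping; with it, the feature set is redefined before the bodyfont is loaded, so the export keeps the input codepoints.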
= More TODO =
* handling of images
* which files get overwritten, which stay
 
= Open Issues =
 
As of 2014-01-20, updated 2015-08-08/-15 and 2017-11-11:
 
* FIXED <s>Structure bug: Metadata ends up within the first section instead of in front of everything</s>
* FIXED <s>Names of metavariables are missing in div.xhtml</s> (could be better, but not that important)
* Notes (footnotes): Only visual formatting, no semantic markup and no reference/ID
* FIXED <s>Delimited: Quotations have tagging ''and'' quotation marks, even in raw.xml</s> (now marks have their own tags)
* Firstpage/Lastpage (cover setup) is ignored in project structure
* Minimal example doesn’t create a cover at all
* {{cmd|color}} leaves no trace in export.
* Spacing characters like \, get lost.
* Additional commas in export of register ranges (1,–,5 instead of 1–5).
* There’s no marking of page breaks (an ePub should carry the page break markers of the PDF so that it can be cited by page in scholarly work).
