190 bytes removed ,  19:11, 5 September 2020
m
typo
{{todo|This page documents the situation that will become active in a few days/a week when the newly developed wiki extension will be ported over from the test wiki. This page is written in preparation. Soon, this todo block will be deleted.
 
We have a heatwave right now, and even though there are some things in the extension that I still want to improve on, it is too hot in the actual office to do any programming. Instead, I am on the couch in front of a fan.
 
--[[User:Taco|Taco]] ([[User talk:Taco|talk]]) 19:48, 10 August 2020 (CEST)}}
== An extension for editing `/Command` subpages ==
The ConTeXtXML extension is a new wiki feature specifically designed to edit the ConTeXt command reference pages (the ones that live under the `/Command/` URL).
It does this by intercepting the creation of new wiki pages below `/Command/`, and using a ContentHandler extension to maintain those pages. The content model of those pages is `contextxml`, a special XML format developed for documenting ConTeXt commands that is based on the interface XML files by Wolfgang Schuster.
Sets the content model to `contextxml` if the wiki page title starts with `/Command`.
 
=== `PageContentSave` ===
 
On save, this writes `contextxml` pages to a designated hard disk location as well as to the wiki database.
=== `ArticleAfterFetchContentObject` ===
This fills the edit area for newly created `/Command` pages from the file on the hard disk.
 
=== `EditPageNoSuchSection` ===
 
Error hook that is triggered if the user tries to edit a section that is generated from wiki code instead of from the XML data. This is an error because it is quite hard to extract the right block of text in that case and still keep track of where it is in relation to the XML data.
 
=== `EditPage::showEditForm:fields` ===
 
Prints a simple help message at the top of the edit field for `/Command` pages.
 
== Generating the wikitext code for page views and previews ==
== Implementation notes ==
 
=== Command disk files ===
 
The extension has three types of data files on the filesystem:
 
* XML files for command definitions
* Verification tables for command definitions
* Wiki text files for instance pages
 
Generally, the file names follow the logic of the wiki page title, except with the prefix <code>cmd-</code> instead of <code>/Command</code>.
 
The file extension for the XML files is <code>.xml</code>, the file extension for the verification table Lua dump is <code>-test.lua</code>, and the file extension for instance pages (redirects) is <code>.wiki</code>.
 
However, in order to appease case-preserving and case-sensitive file systems, all uppercase letters in the filename are prefixed with a <code>^</code> character. A simple example: <code>Command/WEEKDAY</code> is stored on disk as <code>cmd-^W^E^E^K^D^A^Y.xml</code>, and its verification table is stored in <code>cmd-^W^E^E^K^D^A^Y-test.lua</code>.
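
The naming rule above can be sketched in a few lines. The following Python snippet is only an illustration (the extension itself is written in Lua), and the function name <code>disk_name</code> is hypothetical. It assumes that the page title starts with <code>Command/</code> and that only ASCII uppercase letters are escaped; neither detail is confirmed here.

```python
def disk_name(title, suffix=".xml"):
    """Map a wiki page title to an on-disk file name (illustrative sketch).

    Assumption: titles look like "Command/NAME"; anything else is rejected.
    """
    prefix = "Command/"
    if not title.startswith(prefix):
        raise ValueError("not a Command page: " + title)
    name = title[len(prefix):]
    # Prefix every ASCII uppercase letter with '^' so that names differing
    # only in case still map to distinct files on any file system.
    escaped = "".join("^" + ch if "A" <= ch <= "Z" else ch for ch in name)
    return "cmd-" + escaped + suffix


print(disk_name("Command/WEEKDAY"))               # cmd-^W^E^E^K^D^A^Y.xml
print(disk_name("Command/WEEKDAY", "-test.lua"))  # cmd-^W^E^E^K^D^A^Y-test.lua
```

With this scheme, <code>Command/WEEKDAY</code> and a lowercase <code>Command/weekday</code> end up in different files even where the file system ignores case.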
 
=== XML parser ===
The extension uses a simple handwritten XML parser implemented in pure Lua. The parser is expat-style and the implementation is based on string.find() and string.sub(). The advantage of this approach is that it can handle bad XML input by throwing an appropriate (and understandable) error. Neither the Lpeg-based Lua parser from the 13th ConTeXt meeting nor the ConTeXt built-in parser allows for that. Both of those parsers assume well-formed XML as input.
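
To illustrate the general idea of a find-and-slice scanner that reports readable errors, here is a minimal Python sketch. It is not the extension's Lua code: the names <code>scan</code> and <code>XMLError</code> are hypothetical, attributes are ignored, and comments, processing instructions, and entities are not handled at all.

```python
class XMLError(Exception):
    """Raised with a human-readable message when the input is malformed."""


def scan(text, on_start, on_end, on_text):
    """Expat-style scan built on str.find() and slicing (illustrative only)."""
    pos, stack = 0, []
    while True:
        lt = text.find("<", pos)
        if lt == -1:
            if text[pos:]:
                on_text(text[pos:])
            break
        if lt > pos:
            on_text(text[pos:lt])
        line = text.count("\n", 0, lt) + 1
        gt = text.find(">", lt)
        if gt == -1:
            raise XMLError(f"line {line}: '<' is never closed by '>'")
        body = text[lt + 1:gt]
        if body.startswith("/"):
            name = body[1:].strip()
            if not stack:
                raise XMLError(f"line {line}: </{name}> has no opening tag")
            if stack[-1] != name:
                raise XMLError(
                    f"line {line}: expected </{stack[-1]}> but found </{name}>")
            stack.pop()
            on_end(name)
        else:
            parts = body.rstrip("/").split()
            if not parts:
                raise XMLError(f"line {line}: empty tag")
            on_start(parts[0])
            if body.endswith("/"):   # self-closing tag such as <b/>
                on_end(parts[0])
            else:
                stack.append(parts[0])
        pos = gt + 1
    if stack:
        raise XMLError(f"end of input: <{stack[-1]}> is never closed")
```

Because the scanner tracks its own position and tag stack, each error message can name the offending line and the tag that was expected, which is the kind of feedback a wiki editor can act on.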
A tailored parser also allowed for easy extension to deal with the CDATA issue mentioned below.
But the main motivation for a private dedicated parser written in Lua is that we want to be able to not only check the well-formedness of the XML, but also its adherence to a set of extra rules:
The first point is handled like this:
*When a fresh set of ‘virgin’ XML files is created from <code>context-en.xml</code>, each separate file is parsed using a set of expat callback functions that create a Lua table representing the ‘virginal’ parse tree of the XML file. This Lua table is dumped to disk and distributed along with the XML file.
*When a wiki user presses the ‘Save’ button in the page editor, their edited XML is parsed using a slightly different set of expat callback functions from the ones for viewing. The callback functions in this set skip all documentation content while building the parse tree. The two Lua tables representing the parse trees are then compared. They should be identical. If not, an error is raised and the save action is aborted with a user-visible error message.
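
The two steps above can be sketched as follows. This is a Python illustration using the standard <code>xml.parsers.expat</code> module, not the extension's Lua implementation. The tag name <code>description</code> in <code>DOC_TAGS</code> is a hypothetical stand-in for whatever elements carry documentation content, and the sketch applies the same skipping callbacks to both inputs for simplicity.

```python
import xml.parsers.expat

DOC_TAGS = {"description"}  # hypothetical: elements that hold user documentation


def structure_tree(xml_text, skip_tags=DOC_TAGS):
    """Build a parse tree of tags and attributes, skipping documentation."""
    root = {"tag": None, "attrs": {}, "children": []}
    stack = [root]
    skip_depth = 0  # > 0 while inside a skipped documentation element

    def start(name, attrs):
        nonlocal skip_depth
        if skip_depth or name in skip_tags:
            skip_depth += 1
            return
        node = {"tag": name, "attrs": dict(attrs), "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)

    def end(name):
        nonlocal skip_depth
        if skip_depth:
            skip_depth -= 1
            return
        stack.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)  # character data is ignored: no handler set
    return root["children"]


def check_edit(virgin_tree, edited_xml):
    """Abort the save (here: raise) if the edit changed the command structure."""
    if structure_tree(edited_xml) != virgin_tree:
        raise ValueError("edit changed the command definition; save aborted")
```

In this sketch the stored tree plays the role of the verification table on disk: an edit that only adds or changes documentation compares equal, while an edit that touches a tag name or attribute of the definition does not.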
The second point is taken care of during that same XML parse step of the user page revision. It uses a combination of a tag lookup table and string matching to make sure the user followed the rules (as explained in [[Help:Command]]).
But it also sometimes backfires. If you use an XML tag name inside a <code><nowiki><context source="yes"></nowiki></code> call or within <code><nowiki><texcode></nowiki></code>, it will not be displayed in the verbatim display section of the page (but it will be seen by ConTeXt while processing the <code><nowiki><context></nowiki></code>).
To settle the question of 'is it data?' versus 'is it markup?' in a standalone XML file, you would wrap a CDATA section around things like the content of <code><nowiki><xmlcode></nowiki></code>. Unfortunately, that is something that either the MediaWiki parser, the <code>context</code>, or the HTML browser does not understand (I don't know which is the exact problem).
For now, within the ConTeXtXML XML parser, I decided to treat the content of <code><nowiki><texcode></nowiki></code>, <code><nowiki><xmlcode></nowiki></code>, and <code><nowiki><context></nowiki></code> 'as if' they are SGML elements with data model CDATA. That means that the generated XML files on disk that make use of this feature are not actually well-formed, for example this content of <code><nowiki><xmlcode></nowiki></code>:
