3,452 bytes added ,  00:04, 3 October 2010
Rewritten section on verbatim
This is more verbose than the TeX solution, but it is easier to read and write. With a proper parser, I do not have to use tricks to check whether one or both of <code>_</code> and <code>^</code> are present. More importantly, anyone who knows the lpeg syntax can read the parser and easily understand what it does. This is in contrast to the implementation based on TeX macro jugglery, which requires you to run a TeX interpreter in your head to understand it.
 
= Manipulating verbatim text for dummies =
 
Writing macros that manipulate verbatim text involves catcode finesse that only TeX wizards can master.
 
Consider a simple example. Suppose we want to write an environment <code>\startdedentedtyping</code> ... <code>\stopdedentedtyping</code> that removes the indentation of the first line from every line. Thus, the output of
 
<texcode>
\startdedentedtyping
    #include <stdio.h>
    void main()
    {
    print("Hello world \n") ;
    }
\stopdedentedtyping
</texcode>
should be the same as the output of
 
<texcode>
\starttyping
#include <stdio.h>
void main()
{
print("Hello world \n") ;
}
\stoptyping
</texcode>
 
'''Note''' the difference in whitespace at the beginning of each line.
 
I don't even know how to write a TeX environment that removes the leading spaces but leaves other spaces untouched. On the other hand, once we capture the contents of the environment, removing the leading indent, or ''dedenting'' the content, is easy in Lua. Here is a Lua function that uses lpeg.
 
<texcode>
\startluacode
userdata = userdata or {}

function userdata.dedentedtyping(content)
    -- Split the content into lines; accept any newline convention
    local newline  = lpeg.P("\n\r") + lpeg.P("\r\n") + lpeg.P("\n") + lpeg.P("\r")
    local splitter = lpeg.Ct(lpeg.splitat(newline))
    local lines    = lpeg.match(splitter, content)

    -- The indentation of the first line is removed from every line
    local indent = string.match(lines[1], '^ +') or ''
    local parser = lpeg.P(indent) * lpeg.C(lpeg.P(1)^0)

    for i=1,#lines do
        -- Lines that do not start with the indent are left unchanged
        lines[i] = lpeg.match(parser, lines[i]) or lines[i]
    end

    content = table.concat(lines,'\n')

    -- context.starttyping()
    -- context(content)
    -- context.stoptyping()
    -- does not work.
    tex.sprint("\\starttyping\n" .. content .. "\\stoptyping\n")
end
\stopluacode</texcode>
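One way to check the dedenting logic without typesetting anything is to separate the string manipulation from the call to <code>tex.sprint</code>. The following is only a minimal sketch under that assumption: the helper name <code>userdata.dedent</code> is hypothetical and not part of the code above, and it relies on the same ConTeXt extension <code>lpeg.splitat</code>.

<texcode>
\startluacode
userdata = userdata or {}

-- Hypothetical helper: returns the dedented text instead of typesetting it
function userdata.dedent(content)
    local newline = lpeg.P("\r\n") + lpeg.P("\n") + lpeg.P("\r")
    local lines   = lpeg.match(lpeg.Ct(lpeg.splitat(newline)), content)
    local indent  = string.match(lines[1], '^ +') or ''
    local parser  = lpeg.P(indent) * lpeg.C(lpeg.P(1)^0)
    for i=1,#lines do
        -- lines indented less than the first line are kept unchanged
        lines[i] = lpeg.match(parser, lines[i]) or lines[i]
    end
    return table.concat(lines,'\n')
end

-- Write the result to the terminal for inspection
print(userdata.dedent("  a\n    b\n  c"))
\stopluacode
</texcode>
For this input, the two-space indent of the first line is stripped from every line, so the second line keeps two of its four leading spaces.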
The only hard part is capturing the content of the environment and passing it to Lua. As explained in [[Inside_ConTeXt#Passing_verbatim_text_as_macro_parameter|Inside ConTeXt]], the trick to capturing the content of an environment verbatim is to ensure that spaces and newlines have a catcode that makes them significant. This is done using <code>\obeyspaces</code> and <code>\obeylines</code>. Using that trick, we can write this macro as
 
<texcode>\unprotect
\def\startdedentedtyping%
{\begingroup
\obeyspaces \obeylines
\long\def\dostartdedentedtyping##1\stopdedentedtyping%
{\ctxlua{userdata.dedentedtyping(\!!bs \detokenize{##1} \!!es)}%
\endgroup}%
\dostartdedentedtyping}
\protect</texcode>
The above macro works for simple cases, but there are some limitations. For example, there is an extra space of <code>\n</code> in the output. This macro will also fail if the contents have unbalanced braces (try removing the <code>}</code> from the example).
 
A more robust solution is to use ConTeXt's built-in support for buffers. Using buffers, the above macro can be written as
 
<texcode>\def\startdedentedtyping
{\dostartbuffer[dedentedtyping][dostartdedentedtyping][stopdedentedtyping]}
 
\def\stopdedentedtyping
{\ctxlua{userdata.dedentedtyping(buffers.content('dedentedtyping'))}}</texcode>
Unlike MkII, where the contents of a buffer were written to an external file, in MkIV buffers are stored in memory. Thus, with LuaTeX it is really simple to manipulate verbatim text: write the text-manipulating code in Lua and pass the contents of the environment to the Lua function using buffers.
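
The same recipe should carry over to other manipulations. As a sketch only, here is a hypothetical environment <code>\startuppertyping</code> ... <code>\stopuppertyping</code> that typesets its content in upper case. The environment and the function <code>userdata.uppertyping</code> are invented for illustration; <code>\dostartbuffer</code>, <code>buffers.content</code> and the <code>tex.sprint</code> call are used exactly as in the dedenting example above, and the code has not been tested against a particular ConTeXt version.

<texcode>
\startluacode
userdata = userdata or {}

-- Hypothetical example: upper-case the buffer contents, then typeset them verbatim
function userdata.uppertyping(content)
    content = string.upper(content)
    tex.sprint("\\starttyping\n" .. content .. "\\stoptyping\n")
end
\stopluacode

\def\startuppertyping
  {\dostartbuffer[uppertyping][dostartuppertyping][stopuppertyping]}

\def\stopuppertyping
  {\ctxlua{userdata.uppertyping(buffers.content('uppertyping'))}}
</texcode>
Only the Lua function changes from one environment to the next; the two TeX macros differ only in the buffer and function names.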
 
= Conclusion =
