pm39 tree sitter parser

Notes and such about the Tree-Sitter ConTeXt parser.

Features

Version 0.6 of the tree-sitter-context_en parser supports the following features:

Document Areas

If document start commands (\starttext, \startcomponent) and the matching stop commands (\stoptext, \stopcomponent) exist in the document, the parser builds a tree with preamble, main, and postamble nodes. Dividing the document this way helps tools that may want, for example, to ignore the postamble.

If no start- or stop- commands exist in the document, all content is contained in a main node.
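
To make the three areas concrete, here is a rough Python sketch of the same split done with regular expressions. It is not the parser's implementation: split_areas is a hypothetical helper, and placing the start and stop commands inside the main area is an assumption made for illustration.

```python
import re

# Hypothetical helper, not part of the parser: it only mimics the
# preamble / main / postamble split described above.
START = re.compile(r"\\start(text|component)\b")


def split_areas(source):
    """Return the document areas of a ConTeXt source string as a dict."""
    start = START.search(source)
    stop = None
    if start:
        # Look for the stop command that matches the start command found.
        stop_pattern = re.compile(r"\\stop" + start.group(1) + r"\b")
        stop = stop_pattern.search(source, start.end())
    if not start or not stop:
        # No start/stop pair: all content goes into the main area.
        return {"preamble": "", "main": source, "postamble": ""}
    return {
        "preamble": source[:start.start()],
        # Assumption: the start and stop commands are counted as part of main.
        "main": source[start.start():stop.end()],
        "postamble": source[stop.end():],
    }
```

A tool that wants to ignore the postamble could then read only the "preamble" and "main" entries of the returned dict.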

Commands

The parser tokenizes commands into:

name
zero or more option blocks (square brackets with keywords)
zero or more settings blocks (square brackets with key=val pairs)
zero or more scopes (curly braces after the command)

Settings are further tokenized into keys and values, with values able to contain other tokens (more commands, etc.).
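
The breakdown above can be sketched in Python as follows. This is only an illustration under stated assumptions: tokenize_command is a hypothetical name, a bracket block is treated as a settings block whenever it contains "=", and nesting (values that contain other commands or brackets) is not handled, unlike in the real parser.

```python
import re

# Hypothetical flat matcher: \name, then any [..] blocks, then any {..} scopes.
# It does not handle nested brackets or braces.
COMMAND = re.compile(r"\\([A-Za-z]+)((?:\[[^\]]*\])*)((?:\{[^}]*\})*)")


def tokenize_command(text):
    """Split one command into name, option blocks, settings blocks, scopes."""
    match = COMMAND.match(text)
    if match is None:
        return None
    name, brackets, braces = match.groups()
    options, settings = [], []
    for block in re.findall(r"\[([^\]]*)\]", brackets):
        items = [item.strip() for item in block.split(",") if item.strip()]
        if "=" in block:
            # Assumption: any "=" makes this a settings block of key=val pairs.
            pairs = {}
            for item in items:
                key, _, value = item.partition("=")
                pairs[key.strip()] = value.strip()
            settings.append(pairs)
        else:
            # Otherwise the block is a list of option keywords.
            options.append(items)
    scopes = re.findall(r"\{([^}]*)\}", braces)
    return {"name": name, "options": options,
            "settings": settings, "scopes": scopes}
```

For example, calling tokenize_command on \framed[width=3cm]{Hello} yields the name framed, one settings block with the key width, and one scope containing Hello.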

Groups

The parser understands the following types of groupings:

Brace groups (starting with "{" or "\bgroup", and ending with "}" or "\egroup")
"Command" groups (starting with "\start" and ending with "\stop")
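
As an illustration of the two group types, the following Python sketch checks that groups in a source string are properly nested. It is a simplified stand-in, not the parser's grammar: groups_balanced is a hypothetical helper, and pairing \startNAME with \stopNAME by name is an assumption of this sketch.

```python
import re

# Hypothetical tokenizer for group delimiters only.
TOKEN = re.compile(
    r"\\(?P<cmd>start|stop)(?P<name>[A-Za-z]+)"  # \startNAME or \stopNAME
    r"|\\(?P<word>bgroup|egroup)(?![A-Za-z])"    # \bgroup or \egroup
    r"|(?P<escaped>\\.)"                         # other backslash pairs, e.g. \{
    r"|(?P<brace>[{}])"                          # bare braces
)


def groups_balanced(source):
    """Return True if brace groups and command groups nest correctly."""
    stack = []
    for m in TOKEN.finditer(source):
        if m.group("escaped"):
            # Escaped characters and other commands open or close nothing.
            continue
        if m.group("cmd") == "start":
            stack.append(("command", m.group("name")))
        elif m.group("cmd") == "stop":
            # Assumption: a stop command must match the innermost start by name.
            if not stack or stack.pop() != ("command", m.group("name")):
                return False
        elif m.group("word") == "bgroup" or m.group("brace") == "{":
            stack.append(("brace", None))
        else:
            # Remaining cases are \egroup and "}", which close a brace group.
            if not stack or stack.pop() != ("brace", None):
                return False
    return not stack
```

Because "{" and "\bgroup" push the same kind of entry, a group opened with "\bgroup" may be closed with "}" in this sketch, which mirrors the description of brace groups above.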

Inline Math

The parser supports minimal handling of inline math.

(Future work: more math support!)

Inclusions

Code Inclusions

The parser supports marking the following inclusions for inlined code:

luacode
tikzcode
MPinclusions
useMPgraphic
reusableMPgraphic
MPcode
MPpage
staticMPfigure

Note that the parser will mark these areas for external parsing, but nothing will happen if the external parser isn't available.

(As of this writing, an external parser exists for Lua, but not for MetaPost or TikZ.)

Typing Environment Inclusions

The parser supports marking the following typing environments:

MetaPost
Lua
HTML
CSS
XML
PARSEDXML

...and a generic typing inclusion.

Other Things

The parser marks commands relating to project structure.

The parser marks escaped characters, and will complain about characters that should be escaped but are not (except in special circumstances).

The parser should be line-ending agnostic.

Future Directions

Parse and include more of the document structure in the syntax tree? (Reflect chapters, sections, etc. in the syntax tree? What to do about user-defined headings?)
Table support for the parser? (which model(s)?)
Better math support?
Better programming support? (Explicitly tag things like loop and branch commands?)
More inclusions? (Markdown?)
Other ConTeXt interface languages?
Should the parser be more strict about what's allowed in the preamble?
