User:Luigi.scarso/testpage

The database

The bibTEX format is rather popular in the TEX community and even with its shortcomings it will stay around for a while. Many publication websites can export to it and many tools are available to work with this database format. It is rather simple and looks a bit like Lua tables. Unfortunately the content can be polluted with non-standardized TEX commands, which complicates pre- or postprocessing outside TEX. In that sense a bibTEX database is often not coded neutrally. Some limitations, like the use of commands to encode accented characters, are rooted in the ascii world and can be bypassed by using utf instead (as handled somewhat in LATEX through extensions such as bibtex8).

The normal way to deal with a bibliography is to refer to entries using a unique tag or key. When a list of entries is typeset, this reference can be used for linking purposes. The typeset list can be processed and sorted using the bibtex program that converts the database into something more TEX friendly (a .bbl file). I never used the program myself (nor bibliographies) so I will not go into too much detail here, if only because all I say can be wrong.

In ConTEXt we no longer use the bibtex program: we just use database files and deal with the necessary manipulations directly in ConTEXt. One or more such databases can be used and combined with additional entries defined within the document. We can have several such datasets active at the same time.

A bibTEX file looks like this:

@Article{sometag,
    author  = "An Author and Another One",
    title   = "A hopefully meaningful title",
    journal = maps,
    volume  = "25",
    number  = "2",
    pages   = "5--9",
    month   = mar,
    year    = "2013",
    ISSN    = "1234-5678",
}

Normally a value is given between quotes (or curly brackets) but single words are also OK (there is no real benefit in not using quotes, so we advise always using them). There can be many more fields and instead of strings one can use predefined shortcuts. The title, for example, quite often contains TEX macros. Some fields, like pages, have funny characters such as the endash (typically as --) so we have a mixture of data and typesetting directives. If you are covering non-English references, you often need characters that are not in the ascii subset, but ConTEXt is quite happy with utf. If your database file uses old-fashioned TEX accent commands then these will be converted to utf automatically. Commands (macros) are converted to an indirect call, which is quite robust.
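
The conversion of old-fashioned accent commands can be pictured with a small plain-Lua sketch. This is not the ConTEXt implementation: the table accents and the function toutf are hypothetical names, and only a handful of accent and letter combinations are covered.

```lua
-- Hypothetical mapping from TeX accent commands to UTF-8 characters.
-- Only a few combinations are listed; a real converter covers many more.
local accents = {
    ['"'] = { a = "ä", o = "ö", u = "ü" },
    ["'"] = { a = "á", e = "é", o = "ó" },
    ["`"] = { a = "à", e = "è" },
    ["^"] = { a = "â", e = "ê", o = "ô" },
}

-- Look up one accent/letter pair; returning nil keeps the original text.
local function lookup(accent, letter)
    return accents[accent][letter]
end

-- Replace input like \"{o} and \"o by the matching UTF-8 character.
local function toutf(str)
    str = str:gsub("\\([\"'`^]){(%a)}", lookup) -- braced letter:   \"{o}
    str = str:gsub("\\([\"'`^])(%a)",   lookup) -- unbraced letter: \"o
    return str
end

print(toutf('J\\"{u}rgen and Andr\\\'e'))
```

Combinations that are not in the table are left untouched, so unknown input survives the conversion and can still be handled later.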

The bibTEX files are loaded in memory as Lua tables but can be converted to xml so that we can access them in a more flexible way, but that is a subject for specialists.

In the old MkII setup we have two kinds of entries: the ones that come from the bibTEX run and user-supplied ones. We no longer rely on bibTEX output but we do still support the user-supplied definitions. These were in fact prepared in a way that suits the processing of bibTEX generated entries. The next variant reflects the ConTEXt recoding of the old bibTEX output.

\startpublication[k=Hagen:Second,t=article,a={Hans Hagen},y=2013,s=HH01]
    \artauthor[]{Hans}[H.]{}{Hagen}
    \arttitle{Who knows more?}
    \journal{MyJournal}
    \pubyear{2013}
    \month{8}
    \volume{1}
    \issue{3}
    \issn{1234-5678}
    \pages{123--126}
\stoppublication

The split \artauthor fields are collapsed into a single author field as we deal with the splitting later when it gets parsed in Lua. The \artauthor syntax is only kept around for backward compatibility with the previous use of bibTEX.

In the new setup we support these variants as well:

\startpublication[k=Hagen:Third,t=article]
    \author{Hans Hagen}
    \title{Who knows who?}
    ...
\stoppublication

and

\startpublication[tag=Hagen:Third,category=article]
    \author{Hans Hagen}
    \title{Who knows who?}
    ...
\stoppublication

and

\startpublication
    \tag{Hagen:Third}
    \category{article}
    \author{Hans Hagen}
    \title{Who knows who?}
    ...
\stoppublication

Because internally the entries are Lua tables, we also support loading of Lua based definitions:

return {
    ["Hagen:First"] = {
        author   = "Hans Hagen",
        category = "article",
        issn     = "1234-5678",
        issue    = "3",
        journal  = "MyJournal",
        month    = "8",
        pages    = "123--126",
        tag      = "Hagen:First",
        title    = "Who knows nothing?",
        volume   = "1",
        year     = "2013",
    },
}

Files set up like this can be loaded too. The following xml input is rather close to this, and is also accepted as input.

<?xml version="1.0" standalone="yes" ?>
<bibtex>
    <entry tag="Hagen:First" category="article">
        <field name="author">Hans Hagen</field>
        <field name="category">article</field>
        <field name="issn">1234-5678</field>
        <field name="issue">3</field>
        <field name="journal">MyJournal</field>
        <field name="month">8</field>
        <field name="pages">123--126</field>
        <field name="tag">Hagen:First</field>
        <field name="title">Who knows nothing?</field>
        <field name="volume">1</field>
        <field name="year">2013</field>
    </entry>
</bibtex>

Todo: Add some remarks about loading EndNote and RIS formats, but first we need to complete the tag mapping (on Alan’s plate).

So the user has a rather wide choice of formatting style for bibliography database files.

You can load more data than you actually need. Only entries that are referred to explicitly through the \cite and \nocite commands will be shown in lists. We will cover these details later.

Commands in entries

One unfortunate aspect commonly found in bibTEX files is that they often contain TEX commands. Even worse is that there is no standard on what these commands can be and what they mean, at least not formally, as bibTEX is a program intended to be used with many variants of TEX style: plain, LATEX, and others. This means that we need to define our use of these typesetting commands. However, in most cases, they are just abbreviations or font switches and these are often known. Therefore, ConTEXt will try to resolve them before reporting an issue. In the log file there is a list of commands that have been seen in the loaded databases. For instance, loading tugboat.bib gives a long list of commands of which we show a small set here:

publications > start used btx commands
publications > standard CONTEXT 1 known
publications > standard ConTeXt 4 known
publications > standard TeXLive 3 known
publications > standard eTeX    1 known
publications > standard hbox    6 known
publications > standard sltt    1 unknown
publications > stop used btxcommands

You can define unknown commands, or overload existing definitions in the following way:

\definebtxcommand\TUB {TUGboat}
\definebtxcommand\sltt{\tt}
\definebtxcommand\<#1>{\type{#1}}

Unknown commands do not stall processing, but their names are then typeset in a monospaced font so they probably stand out for proofreading. You can access the commands with \btxcommand{...}, as in:

commands like \btxcommand{MySpecialCommand} are handled in an indirect way

As this is an undefined command we get: “commands like MySpecialCommand are handled in an indirect way”.
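
The indirect handling can be sketched in plain Lua as a lookup with a fallback. This only mimics the idea and is not how ConTEXt implements it: the table definitions, the function resolve and the angle brackets that stand in for the monospaced rendering are all assumptions made for this illustration.

```lua
-- Hypothetical table of known commands, comparable to what
-- \definebtxcommand sets up at the TeX end.
local definitions = {
    TUB = "TUGboat",
}

-- Replace each indirect call \btxcmd{name} by its definition. An unknown
-- name is kept visible (here between angle brackets) instead of raising
-- an error, so processing never stalls.
local function resolve(str)
    return (str:gsub("\\btxcmd{(.-)}", function(name)
        return definitions[name] or ("<" .. name .. ">")
    end))
end

print(resolve("published in \\btxcmd{TUB} using \\btxcmd{sltt}"))
```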


Datasets

Normally in a document you will use only one bibliographic database, whether or not distributed over multiple files. Nevertheless we support multiple databases as well, which is why we talk of datasets instead. A dataset is loaded with the \usebtxdataset command. Although currently it is not necessary to define a (default) dataset, it is best to do so because in the future we might provide more options. Here are some examples:

\definebtxdataset[standard]
\usebtxdataset[standard][tugboat.bib]
\usebtxdataset[standard][mtx-bibtex-output.xml]
\usebtxdataset[standard][test-001-btx-standard.lua]

These three suffixes are understood by the loader. Here the dataset has the name standard and the three database files are merged, where later entries having the same tag overload previous ones. Definitions in the document source (coded in TEX speak) are also added, and they are saved for successive runs. This means that if you load and define entries, they will already be known at the start of the next run, so that references to them are independent of where loading and definitions take place.
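
The merging rule can be illustrated with a few lines of plain Lua. This is a minimal sketch, not the actual loader: the function name merge is hypothetical, and we assume here that a later entry replaces an earlier one with the same tag as a whole rather than field by field.

```lua
-- Merge several loaded databases (tables hashed by tag) into one dataset.
-- Databases are processed in the order given, so a later entry with the
-- same tag replaces the earlier one.
local function merge(...)
    local dataset = { }
    for _, database in ipairs { ... } do
        for tag, entry in pairs(database) do
            dataset[tag] = entry
        end
    end
    return dataset
end

local first = {
    ["Hagen:First"] = { title = "Who knows nothing?", year = "2013" },
}
local second = {
    ["Hagen:First"] = { title = "Who knows more?", year = "2014" },
    ["Hagen:Third"] = { title = "Who knows who?" },
}

local standard = merge(first, second)
print(standard["Hagen:First"].title)
```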

setup definition setupbtxdataset

setup definition definebtxdataset

setup definition usebtxdataset

In this document we use some example databases, so let’s load one of them now:

\definebtxdataset[example]
\usebtxdataset[example][mkiv-publications.bib]

You can ask for an overview of entries in a dataset with:

\showbtxdatasetfields[example]

This gives:

tag       category  fields
demo-001  book      author index title year
demo-002  book      crossref index year
demo-003  book      author comment index title year
demo-004  book      author comment index title year
demo-005  book      author doi index pages serial title url year

You can set the current active dataset with

\setbtxdataset[standard]

but most publication-related commands accept optional arguments that denote the dataset and references to entries can be prefixed with a dataset identifier. More about that later.


Renderings

A list of publications can be rendered at any place in the document. A database can be much larger than needed for a document. The same is true for the fields that make up an entry. Here is the list of fields that are currently handled, but of course there can be additional ones:

abstract, address, annotate, assignee, author, bibnumber, booktitle, chapter, comment, country, day, dayfiled, doi, edition, editor, eprint, howpublished, institution, isbn, issn, journal, key, keyword, keywords, language, lastchecked, month, monthfiled, names, nationality, note, notes, number, organization, pages, publisher, revision, school, series, size, title, type, url, volume, year, yearfiled

If you want to see what publications are in the database, the easiest way is to ask for a complete list:

\definebtxrendering
  [example]
  [dataset=example,
   method=local,
   alternative=apa]
\placelistofpublications % \placebtxrendering
  [example]
  [criterium=all]

This gives:

1 Hagen, H. and Otten, T. (1996). Typesetting education documents
2 Scarso, L. (2021). Designing high speed trains
3 author (year). title pages p.

The rendering itself is somewhat complex to set up because we have not only many different standards but also many fields that can be set up. This means that there are several commands involved. Often there is a prescribed style to render bibliographic descriptions, for example apa. A rendering is set up and defined with:

setup definition setupbtxrendering

setup definition definebtxrendering

And a list of such descriptions is generated with:

setup definition placebtxrendering

A dataset can have all kinds of entries:

article, book, booklet, conference, inbook, incollection, inproceedings, manual, mastersthesis, misc, phdthesis, proceedings, techreport, unpublished

Each has its own rendering variant. To keep things simple we have their settings separated. However, these settings are shared for all rendering alternatives. In practice this is seldom a problem in a publication as only one rendering alternative will be active. If this is not sufficient, you can always group local settings in a setup and hook that into the specific rendering.

setup definition setupbtxlistvariant

setup definition definebtxlistvariant

Examples of list variants are:

setupbtxlistvariant : artauthor

no specific settings

setupbtxlistvariant : author

no specific settings

setupbtxlistvariant : editor

no specific settings

The exact rendering of list entries is determined by the alternative key and defaults to apa which uses definitions from publ-imp-apa.mkiv. If you look at that file you will see that each category has its own setup. You may also notice that additional tests are needed to make sure that empty fields don’t trigger separators and such.

There are a couple of accessors and helpers to get the job done. When you want to fetch a field from the current entry you use \btxfield. In most cases you want to make sure this field has a value, for instance because you don’t want fences or punctuation that belongs to a field.

\btxdoif {title} {
    \bold{\btxfield{title}},
}

There are three test macros:

\btxdoifelse{fieldname}{action when found}{action when not found}
\btxdoif    {fieldname}{action when found}
\btxdoifnot {fieldname}                   {action when not found}

An extra conditional is available for testing interactivity:

\btxdoifelseinteraction{action when true}{action when false}

In addition there is also a conditional \btxinteractive which is more efficient, although in practice efficiency is not so important here.

There are three commands to flush data:

\btxfield   fetch an explicit field (e.g. year)
\btxdetail  fetch a derived field (e.g. short)
\btxflush   fetch a derived or explicit field

Normally you can use \btxfield or \btxflush, as derived fields (just like analyzed author fields) are flushed in a special way.

You can improve readability by using setups, for instance:

\btxdoifelse {author} {
    \btxsetup{btx:apa:author:yes}
} {
    \btxsetup{btx:apa:author:nop}
}

Keep in mind that normally you don’t need to mess with definitions like this because standard rendering styles are provided. These styles use a few helpers that inject symbols but also take care of leading and trailing spaces:

\btxspace     before after
\btxperiod    before. after
\btxcomma     before, after
\btxlparent   before (after
\btxrparent   before) after
\btxlbracket  before [after
\btxrbracket  before] after

So, the previous example setup can be rewritten as:

\btxdoif {title} {
    \bold{\btxfield{title}}
    \btxcomma
}

There is a special command for rendering a (combination of) author(s):

\btxflushauthor{author}
\btxflushauthor{editor}
\btxflushauthor[inverted]{editor}

Instead of the last one you can also use:

\btxflushauthorinverted{editor}

You can use a (configurable) default or pass directives. Valid directives are:

conversion     rendering
inverted       the Frog jr, Kermit
invertedshort  the Frog jr, K
normal         Kermit, the Frog, jr
normalshort    K, the Frog, jr


Citations

Citations are references to bibliographic entries that normally show up in lists someplace in the document: at the end of a chapter, in an appendix, at the end of an article, etc. We discussed the rendering of these lists in the previous chapter. A citation is normally pretty short as its main purpose is to refer uniquely to a more detailed description. But, there are several ways to refer, which is why the citation subsystem is configurable and extensible. Just look at the following commands:

\cite[author][example::demo-003]
\cite[authoryear][example::demo-003]
\cite[authoryears][example::demo-003]
\cite[author][example::demo-003,demo-004]
\cite[authoryear][example::demo-003,demo-004]
\cite[authoryears][example::demo-003,demo-004]
\cite[author][example::demo-004,demo-003]
\cite[authoryear][example::demo-004,demo-003]
\cite[authoryears][example::demo-004,demo-003]

(Hans Hagen and Ton Otten)
(Hans Hagen and Ton Otten (1996))
(Hans Hagen and Ton Otten, 1996)
(Hans Hagen and Ton Otten, Luigi Scarso)
(Hans Hagen and Ton Otten (1996), Luigi Scarso (2021))
(Hans Hagen and Ton Otten, 1996, Luigi Scarso, 2021)
(Luigi Scarso, Hans Hagen and Ton Otten)
(Luigi Scarso (2021), Hans Hagen and Ton Otten (1996))
(Luigi Scarso, 2021, Hans Hagen and Ton Otten, 1996)

The first ar­gu­ment is op­tional.

setup definition cite

You can tune the way a citation shows up:

\setupbtxcitevariant[author]     [sorttype=author,color=darkyellow]
\setupbtxcitevariant[authoryear] [sorttype=author,color=darkyellow]
\setupbtxcitevariant[authoryears][sorttype=author,color=darkyellow]
\cite[author][example::demo-004,demo-003]
\cite[authoryear][example::demo-004,demo-003]
\cite[authoryears][example::demo-004,demo-003]

Here we sort the authors and color the citation:

(Hans Hagen and Ton Otten, Luigi Scarso)
(Hans Hagen and Ton Otten (1996), Luigi Scarso (2021))
(Hans Hagen and Ton Otten, 1996, Luigi Scarso, 2021)
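
How a variant, its fences and the sorttype work together can be mimicked in plain Lua. This is only a sketch of the concept and leaves out color: the table variants and the function cite are hypothetical, with keys modelled after the left, middle and right parameters, and the real work is done by ConTEXt itself.

```lua
-- Hypothetical variant settings: fences, a separator between entries,
-- and a function that renders one entry.
local variants = {
    authoryear = {
        left = "(", middle = ", ", right = ")",
        entry = function(e) return e.author .. " (" .. e.year .. ")" end,
    },
    authoryears = {
        left = "(", middle = ", ", right = ")",
        entry = function(e) return e.author .. ", " .. e.year end,
    },
}

-- Render a list of entries according to a variant, optionally sorted.
local function cite(variant, entries, sorttype)
    local v = variants[variant]
    local list = { }
    for i = 1, #entries do
        list[i] = entries[i] -- copy, so that the caller's order is kept
    end
    if sorttype == "author" then
        table.sort(list, function(a, b) return a.author < b.author end)
    end
    local parts = { }
    for i, e in ipairs(list) do
        parts[i] = v.entry(e)
    end
    return v.left .. table.concat(parts, v.middle) .. v.right
end

local demo003 = { author = "Hans Hagen and Ton Otten", year = "1996" }
local demo004 = { author = "Luigi Scarso", year = "2021" }

print(cite("authoryear",  { demo004, demo003 }))
print(cite("authoryears", { demo004, demo003 }, "author"))
```

Without a sorttype the entries come out in the order in which they were cited; with sorttype=author they are ordered by the author string first.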

For reasons of backward compatibility the \cite command is a bit picky about spaces between the two arguments, of which the first is optional. This is a consequence of allowing its use with the key specified between curly brackets as is the traditional practice. (We do encourage users to adopt the more coherent ConTEXt syntax by using square brackets for keywords and reserving curly brackets to regroup text to be typeset.)

The \citation command is synonymous but is more flexible with respect to spacing of its arguments:

\citation[author]     [example::demo-004,demo-003]
\citation[authoryear] [example::demo-004,demo-003]
\citation[authoryears][example::demo-004,demo-003]

There is a whole bunch of cite options and more can be easily defined.

key          rendering
author       (author)
authornum    [author [btx error 1]]
authoryear   (author (year))
authoryears  (author, year)
doi          [todo: doi]
key          [demo-005]
none
num          btx error 1
page         pages
serial       [5]
short        [aut00]
type         [book]
url          [todo: url]
year         (year)

Because we are dealing with database input and because we generally need to manipulate entries, much of the work is delegated to Lua. This makes it easier to maintain and extend the code. Of course TEX still does the rendering. The typographic details are controlled by parameters but not all are used in all variants. As with most ConTEXt commands, it starts out with a general setup command:

setup definition setupbtxcitevariant

On top of that we can define instances that inherit either from a given parent or from the topmost setup.

setup definition definebtxcitevariant

But, specific variants can have them overloaded:

setupbtxcitevariant : author

right      )
middle     ,
left       (

setupbtxcitevariant : authornum

right      ]
middle     ,
left       [

setupbtxcitevariant : authoryear

compress   yes
inbetween  ,
right      )
middle     ,
left       (

setupbtxcitevariant : authoryears

compress   yes
inbetween  ,
right      )
middle     ,
left       (

setupbtxcitevariant : doi

right      ]
left       [

setupbtxcitevariant : key

right      ]
left       [

setupbtxcitevariant : none

no specific settings

setupbtxcitevariant : num

compress   yes
inbetween  --
right      ]
left       [

setupbtxcitevariant : page

inbetween

setupbtxcitevariant : serial

right      ]
left       [

setupbtxcitevariant : short

right      ]
left       [

setupbtxcitevariant : type

right      ]
left       [

setupbtxcitevariant : url

right      ]
left       [

setupbtxcitevariant : year

right      )
left       (

A citation variant is defined in several steps and if you really want to know the dirty details, you should look into the publ-imp-*.mkiv files. Here we stick to the concept.

\startsetups btx:cite:author
    \btxcitevariant{author}
\stopsetups

You can overload such setups if needed, but that only makes sense when you cannot configure the rendering with parameters. The \btxcitevariant command is one of the built-in accessors and it calls out to Lua where more complex manipulation takes place if needed. If no manipulation is known, the field with the same name (if found) will be flushed. A command like \btxcitevariant assumes that a dataset and specific tag have been set. This is normally done in the wrapper macros, like \cite. For special purposes you can use these commands:

\setbtxdataset[example]
\setbtxentry[hh2013]

But don’t expect too much support for such low level rendering control.

Unless you use criterium=all only publications that are cited will end up in the lists. You can force a citation into a list using \usecitation, for example:

\usecitation[example::demo-004,demo-003]

This command has two synonyms: \nocite and \nocitation so you can choose whatever fits you best.

setup definition nocite


The LUA view

Because we manage data at the Lua end it is tempting to access it there for other purposes. This is fine as long as you keep in mind that aspects of the implementation may change over time, although this is unlikely once the modules become stable.

The entries are collected in datasets and each set has a unique name. In this document we have the set named example. A dataset table has several fields, and probably the one of most interest is the luadata field. Each entry in this table describes a publication:

t={ 
 ["author"]="Hans Hagen", 
 ["category"]="book", 
 ["index"]=1, 
 ["tag"]="demo-001", 
 ["title"]="\\btxcmd{BIBTEX}, the \\btxcmd{CONTEXT}\\ way", 
 ["year"]="2013", 
} 

This is publications.datasets.example.luadata["demo-001"]. There can be a companion entry in the parallel details table.

t={ 
 ["author"]={ 
  { 
   ["firstnames"]={ "Hans" }, 
   ["initials"]={ "H" }, 
   ["original"]="Hans Hagen", 
   ["surnames"]={ "Hagen" }, 
   ["vons"]={}, 
  }, 
 }, 
 ["short"]="Hag13", 
} 

These details are accessed as publications.datasets.example.details["demo-001"] and by using a separate table we can overload fields in the original entry without losing the original.
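
To give an idea of what a derived field is, here is a guess at how a short key could be computed from the parsed author names and the year. The rule is an assumption chosen because it is consistent with the values shown in this document (Hag13, HO96, Sca21, aut00); the function name short is hypothetical and the real code may well use a different rule.

```lua
-- Derive a short key from a list of parsed authors (each with a surnames
-- table, as in the details shown above) and a year string.
local function short(authors, year)
    local letters
    if #authors == 1 then
        -- one author: first three bytes of the first surname
        letters = authors[1].surnames[1]:sub(1, 3)
    else
        -- several authors: first byte of each first surname
        local t = { }
        for i, author in ipairs(authors) do
            t[i] = author.surnames[1]:sub(1, 1)
        end
        letters = table.concat(t)
    end
    -- last two digits of the year, or "00" when it does not end in digits
    local digits = tostring(year):match("(%d%d)$") or "00"
    return letters .. digits
end

print(short({ { surnames = { "Hagen" } } }, "2013"))
print(short({ { surnames = { "Hagen" } }, { surnames = { "Otten" } } }, "1996"))
```

Note that string.sub counts bytes, so surnames that start with non-ascii characters would need utf-aware slicing.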

You can loop over the entries using regular Lua code combined with MkIV helpers:

local dataset = publications.datasets.example
context.starttabulate { "|l|l|l|" }
for tag, entry in table.sortedhash(dataset.luadata) do
    local detail = dataset.details[tag] or { }
    context.NC() context.type(tag)
    context.NC() context(detail.short)
    context.NC() context(entry.title)
    context.NC() context.NR()
end
context.stoptabulate()

This results in:

demo-001  Hag13  bibTEX, the ConTEXt way
demo-002  Hag14  bibTEX, the ConTEXt way
demo-003  HO96   Typesetting education documents
demo-004  Sca21  Designing high speed trains
demo-005  aut00  title


The XML view

The luadata table can be converted into an xml representation. This is a follow up on earlier experiments with an xml-only approach. I decided in the end to stick to a Lua approach and provide some simple xml support in addition.
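
What such a conversion amounts to can be shown with a standalone Lua sketch that serializes a luadata-like table to the bibtex, entry and field layout shown earlier. The function names escape, sortedkeys and toxml are hypothetical; in ConTEXt the conversion is done for you.

```lua
-- Escape the characters that are special in xml content and attributes.
local function escape(str)
    return (tostring(str):gsub('[&<>"]', {
        ["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;", ['"'] = "&quot;",
    }))
end

-- Return the keys of a table in sorted order, for reproducible output.
local function sortedkeys(t)
    local keys = { }
    for key in pairs(t) do
        keys[#keys + 1] = key
    end
    table.sort(keys)
    return keys
end

-- Serialize a table of entries, hashed by tag, to an xml string.
local function toxml(luadata)
    local lines = { "<bibtex>" }
    for _, tag in ipairs(sortedkeys(luadata)) do
        local entry = luadata[tag]
        lines[#lines + 1] = string.format('    <entry tag="%s" category="%s">',
            escape(tag), escape(entry.category or ""))
        for _, name in ipairs(sortedkeys(entry)) do
            lines[#lines + 1] = string.format('        <field name="%s">%s</field>',
                escape(name), escape(entry[name]))
        end
        lines[#lines + 1] = "    </entry>"
    end
    lines[#lines + 1] = "</bibtex>"
    return table.concat(lines, "\n")
end

print(toxml {
    ["demo-001"] = { category = "book", tag = "demo-001", title = "A & B", year = "2013" },
})
```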

Once a dataset is accessible as an xml tree, you can use the regular \xml... commands. We start with loading a dataset, in this case from just one file.

\usebtxdataset[tugboat][tugboat.bib]

The dataset has to be converted to xml:

\convertbtxdatasettoxml[tugboat]

The tree is now accessible by its root reference btx:tugboat. If we want simple field access we can use a few setups:

\startxmlsetups btx:initialize
    \xmlsetsetup{#1}{bibtex|entry|field}{btx:*}
    \xmlmain{#1}
\stopxmlsetups
\startxmlsetups btx:field
    \xmlflushcontext{#1}
\stopxmlsetups
\xmlsetup{btx:tugboat}{btx:initialize}

The two setups are predefined in the core already, but you might want to change them. They are applied in, for instance:

\starttabulate[|||]
    \NC \type {tag}   \NC \xmlfirst {btx:tugboat}
        {/bibtex/entry[string.find(@tag,'Hagen')]/attribute('tag')}
    \NC \NR
    \NC \type {title} \NC \xmlfirst {btx:tugboat}
        {/bibtex/entry[string.find(@tag,'Hagen')]/field[@name='title']}
    \NC \NR
\stoptabulate

tag    Hagen:TB17-1-54
title  PPCHTEX: typesetting chemical formulas in TEX


\startxmlsetups btx:demo
    \xmlcommand
        {#1}
        {/bibtex/entry[string.find(@tag,'Hagen')][1]}{btx:table}
\stopxmlsetups
\startxmlsetups btx:table
\starttabulate[|||]
    \NC \type {tag}   \NC \xmlatt{#1}{tag} \NC \NR
    \NC \type {title} \NC \xmlfirst{#1}{/field[@name='title']} \NC \NR
\stoptabulate
\stopxmlsetups
\xmlsetup{btx:tugboat}{btx:demo}

tag    Hagen:TB17-1-54
title  PPCHTEX: typesetting chemical formulas in TEX

Here is another example:

\startxmlsetups btx:row
    \NC \xmlatt{#1}{tag}
    \NC \xmlfirst{#1}{/field[@name='title']}
    \NC \NR
\stopxmlsetups
\startxmlsetups btx:demo
    \xmlfilter {#1} {
        /bibtex
        /entry[@category='article']
        /field[@name='author' and (find(text(),'Knuth') or find(text(),'DEK'))]
        /../command(btx:row)
    }
\stopxmlsetups
\starttabulate[|||]
    \xmlsetup{btx:tugboat}{btx:demo}
\stoptabulate

Knuth:TB10-1-31   Typesetting Concrete Mathematics
Knuth:TB10-1-8    TEX would find it difficult …
Knuth:TB10-3-325  The new versions of TEX and MF
Knuth:TB10-4-529  The errors of TEX
Knuth:TB11-1-13   Virtual Fonts: More Fun for Grand Wizards
Knuth:TB11-2-165  Exercises for TEX: The Program
Knuth:TB11-4-489  The future of TEX and MF
Knuth:TB11-4-497  Arthur Lee Samuel, 1901--1990
Knuth:TB11-4-499  Answers to Exercises for TEX: The Program
Knuth:TB12-2-313  Fixed-point glue setting: Errata
Knuth:TB14-4-387  Icons for TEX and MF
Knuth:TB17-1-29   Important message regarding CM fonts
Knuth:TB2-3-5     The current state of things
Knuth:TB3-1-10    Fixed-point glue setting -- an example of WEB
Knuth:TB31-2-121  An Earthshaking Announcement
Knuth:TB4-2-64    A note on hyphenation
Knuth:TB5-1-4     TEX incunabula
Knuth:TB5-1-67    Comments on quality in publishing
Knuth:TB5-2-105   A course on MF programming
Knuth:TB6-1-36    Recipes and fractions
Knuth:TB7-2-101   The TEX logo in various fonts
Knuth:TB7-2-95    Remarks to celebrate the publication of Computers & Typesetting
Knuth:TB8-1-14    Mixing right-to-left texts with left-to-right texts
Knuth:TB8-1-6     It happened: announcement of TEX 2.1
Knuth:TB8-1-73    Problem for a Saturday afternoon
Knuth:TB8-2-135   Fonts for digital halftones
Knuth:TB8-2-210   Saturday morning problem -- solution
Knuth:TB8-2-217   Reply: Printing out selected pages
Knuth:TB8-3-309   Macros for Jill
Knuth:TB9-2-152   A Punk Meta-Font

A more extensive example is the following. Of course this assumes that you know what xml support mechanisms and macros are available.

\startxmlsetups btx:getkeys
    \xmladdsortentry{btx}{#1}{\xmlfilter{#1}{/field[@name='author']/text()}}
    \xmladdsortentry{btx}{#1}{\xmlfilter{#1}{/field[@name='year'  ]/text()}}
    \xmladdsortentry{btx}{#1}{\xmlatt{#1}{tag}}
\stopxmlsetups
\startxmlsetups btx:sorter
    \xmlresetsorter{btx}
  % \xmlfilter{#1}{entry/command(btx:getkeys)}
    \xmlfilter{#1}{
        /bibtex
        /entry[@category='article']
        /field[@name='author' and find(text(),'Knuth')]
        /../command(btx:getkeys)}
    \xmlsortentries{btx}
    \starttabulate[||||]
        \xmlflushsorter{btx}{btx:entry:flush}
    \stoptabulate
\stopxmlsetups
\startxmlsetups btx:entry:flush
    \NC \xmlfilter{#1}{/field[@name='year'  ]/context()}
    \NC \xmlatt{#1}{tag}
    \NC \xmlfilter{#1}{/field[@name='author']/context()}
    \NC \NR
\stopxmlsetups
\xmlsetup{btx:tugboat}{btx:sorter}

1984  Knuth:TB5-1-67    Don Knuth
1984  Knuth:TB5-1-4     Donald E. Knuth
1984  Knuth:TB5-2-105   Donald E. Knuth
1985  Knuth:TB6-1-36    Donald E. Knuth
1986  Knuth:TB7-2-101   Donald E. Knuth
1987  Knuth:TB8-2-135   Donald E. Knuth
1987  Knuth:TB8-3-309   Donald E. Knuth
1988  Knuth:TB9-2-152   Donald E. Knuth
1989  Knuth:TB10-3-325  Donald E. Knuth
1989  Knuth:TB10-4-529  Donald E. Knuth
1990  Knuth:TB11-4-489  Donald E. Knuth
1993  Knuth:TB14-4-387  Donald E. Knuth
1996  Knuth:TB17-1-29   Donald E. Knuth
1987  Knuth:TB8-1-14    Donald Knuth and Pierre MacKay
1981  Knuth:TB2-3-5     Donald Knuth
1982  Knuth:TB3-1-10    Donald Knuth
1983  Knuth:TB4-2-64    Donald Knuth
1986  Knuth:TB7-2-95    Donald Knuth
1987  Knuth:TB8-1-6     Donald Knuth
1987  Knuth:TB8-1-73    Donald Knuth
1987  Knuth:TB8-2-210   Donald Knuth
1987  Knuth:TB8-2-217   Donald Knuth
1989  Knuth:TB10-1-8    Donald Knuth
1989  Knuth:TB10-1-31   Donald Knuth
1990  Knuth:TB11-1-13   Donald Knuth
1990  Knuth:TB11-2-165  Donald Knuth
1990  Knuth:TB11-4-497  Donald Knuth
1990  Knuth:TB11-4-499  Donald Knuth
1991  Knuth:TB12-2-313  Donald Knuth
2010  Knuth:TB31-2-121  Donald Knuth

The original data is stored in a Lua table, hashed by tag. Starting with Lua 5.2 each run of Lua gets a different ordering of such a hash. In older versions, when you looped over a hash, the order was undefined, but the same as long as you used the same binary. This had the advantage that successive runs, something we often have in document processing, gave consistent results. In today’s Lua we need to do much more sorting of hashes before we loop, especially when we save multi-pass data. It is for this reason that the xml tree is sorted by hash key by default. That way lookups (especially the first of a set) give consistent outcomes.
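
Sorting before looping is easy to do in plain Lua. The following iterator is a minimal stand-in for a sorted hash traversal; the name sortedpairs is hypothetical and the sketch assumes that all keys are strings (or all numbers), so that table.sort can compare them.

```lua
-- Iterate over a hash in sorted key order: collect the keys, sort them,
-- and return an iterator that walks the table in that order.
local function sortedpairs(t)
    local keys = { }
    for key in pairs(t) do
        keys[#keys + 1] = key
    end
    table.sort(keys)
    local i = 0
    return function()
        i = i + 1
        local key = keys[i]
        if key ~= nil then
            return key, t[key]
        end
    end
end

local luadata = {
    ["demo-003"] = { title = "Typesetting education documents" },
    ["demo-001"] = { title = "bibTEX, the ConTEXt way" },
    ["demo-004"] = { title = "Designing high speed trains" },
}

for tag, entry in sortedpairs(luadata) do
    print(tag, entry.title)
end
```

The price is one extra table and a sort per loop, which is negligible compared to the benefit of getting the same order in every run.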


Standards

The rendering of bibliographic entries is often standardized and prescribed by the publisher. If you submit an article to a journal, normally it will be reformatted (or even re-keyed) and the rendering will happen at the publisher’s end. In that case it may not matter how entries were rendered when writing the publication, because the publisher will do it his or her way. This means that most users probably will stick to the standard apa rules and for them we provide some configuration. Because we use setups it is easy to overload specifics. If you really want to tweak, best look in the files that deal with it.

Many standards exist and support for other renderings may be added to the core. Interested users are invited to develop and to test alternate standard renderings according to their needs.

Todo: maybe a list of categories and fields.


Cleaning up

Although the bibTEX format is reasonably well defined, in practice there are many ways to organize the data. For instance, one can use predefined string constants that get used (whether or not combined with other strings) later on. A string can be enclosed in curly braces or double quotes. The strings can contain TEX commands but these are not standardized. The databases often have somewhat complex ways to deal with special characters and the use of braces in their definition is also not normalized.

The most complex to deal with are the fields that contain names of people. At some point it might be needed to split a combination of names into individual ones that then get split into title, first name, optional inbetweens, surname(s) and additional: Prof. Dr. Alfred B. C. von Kwik Kwak Jr. II and P. Q. Olet is just one example of this. The convention seems to be not to use commas but “and” to separate names (often each name will be specified as lastname, firstname).
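
The first steps of such a split can be sketched in plain Lua. This is deliberately naive and not the parser used by ConTEXt: the function names splitnames and normalize are hypothetical, and titles, inbetweens (von), additions (Jr.) and names protected by braces are all ignored here.

```lua
-- Split an author field on the separate word "and" into individual names.
local function splitnames(field)
    local names, start = { }, 1
    while true do
        local first, last = field:find("%s+and%s+", start)
        names[#names + 1] = field:sub(start, first and first - 1 or #field)
        if not first then
            break
        end
        start = last + 1
    end
    return names
end

-- Split one name into first names and surname. Both "lastname, firstname"
-- and "firstname lastname" are accepted; without a comma we simply assume
-- that the last word is the surname.
local function normalize(name)
    local surname, firstnames = name:match("^(.-),%s*(.+)$")
    if not surname then
        firstnames, surname = name:match("^(.*)%s+(%S+)$")
    end
    if not surname then
        firstnames, surname = "", name
    end
    return { firstnames = firstnames, surnames = surname }
end

for _, name in ipairs(splitnames("Hagen, Hans and Ton Otten and P. Q. Olet")) do
    local n = normalize(name)
    print(n.surnames, n.firstnames)
end
```

Because the separator has to be surrounded by whitespace, a name such as Sandy Andrews is not split in the middle.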

We don’t see it as a challenge nor as a duty to support all kinds of messy definitions. Of course we try to be somewhat tolerant, but you will be sure to get better results if you use nicely set up, consistent databases.

Todo: maybe some examples of bad.


Transition

In the original bibliography support module usage was as follows (example taken from the contextgarden wiki):

% engine=pdftex
\usemodule[bib]
\usemodule[bibltx]
\setupbibtex
  [database=xampl]
\setuppublications
  [numbering=yes]
\starttext
    As \cite [article-full] already indicated, bibtex is a \LATEX||centric
    program.
    \completepublications
\stoptext

For MkIV the modules were partly rewritten and ended up in the core so the two \usemodule commands were no longer needed. The overhead associated with the automatic loading of the bibliography macros can be neglected these days, so standardized modules such as bib are all being moved to the core and do not need to be explicitly loaded.

The first \setupbibtex command in this example is needed to bootstrap the process: it tells what database has to be processed by bibTEX between runs. The second \setuppublications command is optional. Each citation (tagged with \cite) ends up in the list of publications.

In the new approach we no longer use bibTEX so we don’t need to set up bibTEX. Instead we define dataset(s). We also no longer set up publications with one command, but have split that up in rendering-, list-, and cite-variants. The basic \cite command remains. The above example becomes:

\definebtxdataset
  [document]
\usebtxdataset
  [document]
  [mybibfile.bib]
\definebtxrendering
  [document]
\setupbtxrendering
  [document]
  [numbering=yes]
\starttext
    As \cite [article-full] already indicated, bibtex is a \LATEX||centric
    program.
    \completebtxrendering[document]
\stoptext

So, we have a few more commands to set up things. If you intend to use just a single dataset and rendering, the above preamble can be simplified to:

\usebtxdataset
  [mybibfile.bib]
\setupbtxrendering
  [numbering=yes]

But keep in mind that compared to the old MkII derived method we have moved some of the options to the rendering, list and cite setup variants.

Another difference is now the use of lists. When you define a rendering, you also define a list. However, all entries are collected in a common list tagged btx. Although you will normally configure a rendering you can still set some properties of lists, but in that case you need to prefix the list identifier. In the case of the above example this is btx:document.


MLBIBTEX

Todo: how to plug in MLbibTEX for sorting and other advanced operations.


Extensions

As TEX and Lua are both open and accessible in ConTEXt it is possible to extend the functionality of the bibliography related code. For instance, you can add extra loaders.

function publications.loaders.myformat(dataset,filename)
    local t = { }
    -- Load data from 'filename' and convert it to a Lua table 't' with
    -- the key as hash entry and fields conforming the luadata table
    -- format.
    publications.loaders.lua(dataset,t)
end

This then permits loading a database (into a dataset) with the command:

\usebtxdataset[standard][myfile.myformat]

The myformat suffix is recognized automatically. If you want to use another suffix, you can do this:

\usebtxdataset[standard][myformat::myfile.txt]
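
The conversion step inside such a loader could look as follows. The input format is made up for this illustration (a line starting with @ opens an entry, followed by key = value lines), and parsemyformat is a hypothetical helper. The sketch only builds the table hashed by tag; reading the file and handing the table to the Lua loader are left out.

```lua
-- Parse a made-up line based format into a table hashed by tag, with
-- lowercase field names as in the luadata table.
local function parsemyformat(text)
    local t, current = { }, nil
    for line in text:gmatch("[^\r\n]+") do
        local tag = line:match("^@(%S+)")
        if tag then
            -- a new entry starts here
            current = { tag = tag }
            t[tag] = current
        elseif current then
            local key, value = line:match("^%s*(%w+)%s*=%s*(.-)%s*$")
            if key then
                current[key:lower()] = value
            end
        end
    end
    return t
end

local data = parsemyformat [[
@demo-001
category = book
Author   = Hans Hagen
year     = 2013
]]

print(data["demo-001"].author)
```

Lines that do not fit the pattern are silently skipped, which keeps the sketch short but is probably too tolerant for real data.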