__TOC__

This wiki post outlines a Python script that writes ConTeXt code. The generated code crops a page to a given ratio and scale. The script could benefit from some tweaks, a different default behaviour and additional features. Nevertheless, it puts interpolation to use in a command line interface, which is why I found it interesting to share. I plan to produce more ideas for similar algorithms in the near future.

<pre>import argparse
import math
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler</pre>
To set up a command line application the argparse module is used. Argparse allows for flags and positional arguments to be given to the script when executed at the command line. The values passed in by the user are stored in variables. Argparse also implements a help flag which offers information about available flags.

<pre>parser = argparse.ArgumentParser(description='Crop typesetting areas.')</pre>
Arguments are added next. <code>outfile</code> is a positional argument, whereas the remaining arguments are optional flags. The flags are looked at in more detail later on.

<pre>parser.add_argument('outfile',
                    metavar='OUTFILE',
                    nargs=1,
                    help="Write to a file")
# With nargs=1 each parsed value is stored as a one-element list,
# so the defaults are given as one-element lists too.
parser.add_argument('--papersize',
                    metavar='PAPERSIZE',
                    nargs=1,
                    default=['A4'],
                    help="Provide a standard papersize")
parser.add_argument('--ratio',
                    metavar='RATIO',
                    nargs=1,
                    default=['2:3'],
                    help="Crop the paper to this proportion")
parser.add_argument('--orientation',
                    metavar='ORIENTATION',
                    nargs=1,
                    default=['portrait'],
                    help="Switch between portrait and landscape.")
parser.add_argument('--scale',
                    metavar='SCALE',
                    nargs=1,
                    default=[90.0],
                    help="Scale the size of the cropped page.")</pre>
<pre>_StoreAction(option_strings=['--scale'], dest='scale', nargs=1, const=None, default=[90.0], type=None, choices=None, required=False, help='Scale the size of the cropped page.', metavar='SCALE')</pre>
For the sake of example, let’s pass the following arguments to the script.

<pre>args = parser.parse_args(args=['--scale', '90',
                               '--ratio', '5:3',
                               '--papersize', 'A3',
                               '--orientation', 'landscape',
                               'main.tex'])</pre>
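Because the arguments are declared with <code>nargs=1</code>, argparse stores each parsed value as a one-element list, which is why the rest of the script reads values such as <code>args.papersize[0]</code>. The following is a minimal sketch of that behaviour. The parser name <code>demo</code> is a throwaway for illustration, and <code>type=float</code> is shown only as one possible way to have argparse convert the scale; neither is part of the script above.

```python
import argparse

# Throwaway parser for illustration only; not the script's own parser.
demo = argparse.ArgumentParser(description='nargs demo')
demo.add_argument('outfile', nargs=1)
demo.add_argument('--papersize', nargs=1, default=['A4'])
# type=float is an assumption: it asks argparse to convert the string.
demo.add_argument('--scale', nargs=1, type=float, default=[90.0])

parsed = demo.parse_args(['--scale', '75', 'main.tex'])

print(parsed.outfile)    # a one-element list holding the file name
print(parsed.papersize)  # the default, used as given
print(parsed.scale)      # a one-element list holding a float
```

Giving the defaults as one-element lists keeps indexing with <code>[0]</code> valid whether or not the flag was passed.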
<noinclude>
<span id="wishlist"></span>
==Wishlist==
</noinclude>
<includeonly>
<span id="wishlist"></span>
====Wishlist====
</includeonly>
It would be interesting to add a <code>--page-on-page</code> flag to vary the output. When active, this flag would print the cropped page on the given paper size at the given scale and ratio, which is what the script does by default at the moment. Once the flag is implemented, the default could change so that the output is a page already cropped to size.
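As a rough idea of how such a flag could be declared, here is a minimal sketch. Only the flag name comes from the wishlist; the parser name <code>wishlist_parser</code>, the help text and the use of <code>action='store_true'</code> are assumptions, and nothing here is implemented in the script yet.

```python
import argparse

# Hypothetical extension: only the flag name comes from the wishlist.
wishlist_parser = argparse.ArgumentParser(description='Crop typesetting areas.')
wishlist_parser.add_argument('--page-on-page',
                             action='store_true',
                             help="Print the cropped page on the full paper size")

with_flag = wishlist_parser.parse_args(['--page-on-page'])
without_flag = wishlist_parser.parse_args([])

# argparse turns the dashes into underscores for the attribute name.
print(with_flag.page_on_page)
print(without_flag.page_on_page)
```

A boolean flag like this needs no value on the command line, so the script could branch on <code>args.page_on_page</code> when writing the paper size.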
<noinclude>
<span id="papersize-dictionary"></span>
==Papersize Dictionary==
</noinclude>
<includeonly>
==== Papersize Dictionary ====
</includeonly>
I drew up a dictionary of A-series papersizes based on information at [https://papersizes.io papersizes.io]. This way paper dimensions can be referenced by name.

<pre>portrait_paper_sizes = {
# size width height (mm)
"A0" : [841, 1189],
"A1" : [594, 841],
"A2" : [420, 594],
"A3" : [297, 420],
"A4" : [210, 297],
"A5" : [148, 210],
"A6" : [105, 148],
"A7" : [74, 105],
"A8" : [52, 74],
"A9" : [37, 52],
"A10": [26, 37],
"A11": [18, 26],
"A12": [13, 18],
"A13": [9, 13],
"2A0": [1189, 1682],
"4A0": [1682, 2378],
"A0+": [914, 1292],
"A1+": [609, 914],
"A3+": [329, 483]
}</pre>
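The regular A-series entries in this dictionary follow a simple pattern: each size is the previous one with its long side halved and rounded down to a whole millimetre. The sketch below generates A0 to A13 that way. The function name <code>a_series</code> and its parameters are assumptions made for illustration, and the oversize entries (2A0, 4A0, A0+, A1+, A3+) would still have to be listed by hand.

```python
def a_series(largest=(841, 1189), count=14):
    """Return {"A0": [w, h], ...} by repeatedly halving the long side."""
    sizes = {}
    width, height = largest
    for n in range(count):
        sizes[f"A{n}"] = [width, height]
        # The next size's height is the current width; its width is
        # half the current height, rounded down to a whole millimetre.
        width, height = height // 2, width
    return sizes

generated = a_series()
print(generated["A4"])
print(generated["A13"])
```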
<noinclude>
<span id="portrait-and-landscape"></span>
==Portrait and Landscape==
</noinclude>
<includeonly>
==== Portrait and Landscape ====
</includeonly>
I figured I would implement portrait and landscape orientations into the script. Portrait mode is enabled by default. Passing <code>--orientation landscape</code> to the command switches to landscape output. It might be more concise to have a <code>--landscape</code> flag.

<pre>if "portrait" in args.orientation:
    paper_width = portrait_paper_sizes[args.papersize[0]][0]
    paper_height = portrait_paper_sizes[args.papersize[0]][1]
    print(args.papersize[0], "portrait", paper_width, "mm x", paper_height, "mm")</pre>
I have not accounted for the case where someone provides a papersize that is not listed in the dictionary. At the moment, I expect the script to raise a <code>KeyError</code> if this happens. In any case, the width and height need to be swapped in landscape mode. This can be done in at least two ways. I decided to swap the indexes like so.

<pre>if "landscape" in args.orientation:
    paper_width = portrait_paper_sizes[args.papersize[0]][1]
    paper_height = portrait_paper_sizes[args.papersize[0]][0]
    print(args.papersize[0], "landscape", paper_width, "mm x", paper_height, "mm")</pre>
<pre>A3 landscape 420 mm x 297 mm</pre>
But it is also possible to swap the values of <code>paper_width</code> and <code>paper_height</code> by creating a new dictionary of landscape paper sizes. The commented-out code below does that.

<pre># if "landscape" in args.orientation:
#     landscape_paper_sizes = {}
#     for size in portrait_paper_sizes:
#         landscape_paper_sizes[size] = portrait_paper_sizes[size][::-1]
#     paper_width = landscape_paper_sizes[args.papersize[0]][0]
#     paper_height = landscape_paper_sizes[args.papersize[0]][1]
#     print(args.papersize[0], "landscape", paper_width, "mm x", paper_height, "mm")</pre>
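One way to handle both the orientation swap and an unknown papersize in one place is a small lookup function. This is only a sketch under assumptions: the function name <code>paper_dimensions</code>, the shortened dictionary <code>sample_sizes</code> and the choice of <code>ValueError</code> are mine and do not appear in the script.

```python
# A shortened copy of the dictionary, so that the sketch runs on its own.
sample_sizes = {"A3": [297, 420], "A4": [210, 297]}

def paper_dimensions(name, orientation="portrait", sizes=sample_sizes):
    """Return (width, height) in mm for a named size and orientation."""
    if name not in sizes:
        known = ", ".join(sorted(sizes))
        raise ValueError(f"Unknown papersize {name!r}; choose from: {known}")
    width, height = sizes[name]
    if orientation == "landscape":
        # Landscape swaps the two portrait values.
        width, height = height, width
    return width, height

print(paper_dimensions("A3", "landscape"))
print(paper_dimensions("A4"))
```

An unknown name then produces a message that lists the valid choices instead of a bare lookup error.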
<noinclude>
<span id="ratio"></span>
==Ratio==
</noinclude>
<includeonly>
====Ratio====
</includeonly>
<pre>ratio = args.ratio[0].split(":")
ratio_x = int(ratio[0])
ratio_y = int(ratio[1])
print(f"Crop ratio: {ratio_x}:{ratio_y}")</pre>
<pre>Crop ratio: 5:3</pre>
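The parsing above assumes a well-formed value such as <code>5:3</code>. A slightly more defensive version could look like the sketch below; the function name <code>parse_ratio</code> and the error messages are assumptions made for illustration, not part of the script.

```python
def parse_ratio(text):
    """Turn a string such as "5:3" into a pair of positive integers."""
    parts = text.split(":")
    if len(parts) != 2:
        raise ValueError(f"Expected WIDTH:HEIGHT, got {text!r}")
    # int() raises ValueError by itself if a part is not a whole number.
    ratio_x, ratio_y = int(parts[0]), int(parts[1])
    if ratio_x <= 0 or ratio_y <= 0:
        raise ValueError("Both parts of the ratio must be positive")
    return ratio_x, ratio_y

print(parse_ratio("5:3"))
```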
The ratio is provided to the script with the <code>--ratio</code> flag. By default the ratio is 2:3. Some calculations need to be done so let’s initialise some variables.

<pre>possible_widths_list = []
possible_heights_list = []
w = ratio_x
h = ratio_y</pre>
To work out the size of the cropped page, I calculate a list of candidate measurements. Each pair of values is a width and a height in the chosen ratio, so together they describe the possible 2D areas of the cropped page. The values are used later by the scale feature. The following code checks the ratio against the dimensions of the page, and the <code>range</code> of each <code>for</code> loop limits the length of the lists.

<pre>if (math.floor(paper_width / ratio_y)) > (math.floor(paper_height / ratio_x)):
    # If the paper is landscape
    for dimension in range(math.floor(paper_width / ratio_x)):
        possible_widths_list += [w]
        possible_heights_list += [h]
        w += ratio_x
        h += ratio_y
else:
    for dimension in range(math.floor(paper_height / ratio_y)):
        possible_widths_list += [w]
        possible_heights_list += [h]
        w += ratio_x
        h += ratio_y</pre>
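For comparison, the largest entry of these lists can also be computed directly. The sketch below is an alternative I am suggesting, not what the script does: it bounds the number of steps by both the width and the height, so for some ratios it may stop earlier than the loop above. The function name <code>largest_fit</code> is an assumption.

```python
def largest_fit(paper_width, paper_height, ratio_x, ratio_y):
    """Return the number of ratio steps that fit, and the resulting size."""
    # A step only fits if both its width and its height stay on the paper.
    steps = min(paper_width // ratio_x, paper_height // ratio_y)
    return steps, steps * ratio_x, steps * ratio_y

# A3 landscape (420 mm x 297 mm) with a 5:3 ratio, as in the running example.
steps, width, height = largest_fit(420, 297, 5, 3)
print(steps, width, height)
```

For the running example this gives 84 steps and a largest page of 420 mm x 252 mm.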
<noinclude>
<span id="pandas-numpy-and-sklearn"></span>
==Pandas, Numpy and Sklearn==
</noinclude>
<includeonly>
====Pandas, Numpy and SciKit Learn====
</includeonly>
At the beginning of the script, I imported (parts of) these modules. They give Python access to additional mathematical functions. In particular, I’m going to use a pandas DataFrame, scikit-learn’s MinMaxScaler and NumPy’s interp function. The purpose is to let the user scale the size of the cropped page in the output. In short, the values in <code>possible_widths_list</code> and <code>possible_heights_list</code> are mapped onto a percentage scale. Because there can be more or fewer than 100 values in each list, the full span of the list, from its first value to its last, needs to be mapped onto 0% to 100%. To begin with, let’s create a DataFrame and a scaler. Some of the code which appears below was adapted from [https://codefellows.github.io/sea-python-401d5/lectures/rescaling_data.html this website].

<pre>df = pd.DataFrame({"widths": possible_widths_list, "heights": possible_heights_list})
scaler = MinMaxScaler()</pre>
<noinclude>
<span id="visualising-the-dataframe"></span>
===Visualising the dataframe===
</noinclude>
<includeonly>
=====Visualising the Dataframe=====
</includeonly>
The dataframe resembles a table of widths and heights spanning a range of values.

<pre>print(df)</pre>
<pre>    widths  heights
0        5        3
1       10        6
2       15        9
3       20       12
4       25       15
..     ...      ...
79     400      240
80     405      243
81     410      246
82     415      249
83     420      252

[84 rows x 2 columns]</pre>
<noinclude>
<span id="adding-scaled-values-to-the-dataframe"></span>
===Adding scaled values to the dataframe===
</noinclude>
<includeonly>
=====Adding scaled values to the dataframe=====
</includeonly>
This code assigns a percentage-based value to each possible width and height.

<pre>tmp_widths = df.widths - df.widths.min()
tmp_heights = df.heights - df.heights.min()
scaled_widths = tmp_widths / tmp_widths.max() * 100
scaled_heights = tmp_heights / tmp_heights.max() * 100

df["scaled_widths"] = scaled_widths
df["scaled_heights"] = scaled_heights

print(df)</pre>
<pre>    widths  heights  scaled_widths  scaled_heights
0        5        3       0.000000        0.000000
1       10        6       1.204819        1.204819
2       15        9       2.409639        2.409639
3       20       12       3.614458        3.614458
4       25       15       4.819277        4.819277
..     ...      ...            ...             ...
79     400      240      95.180723       95.180723
80     405      243      96.385542       96.385542
81     410      246      97.590361       97.590361
82     415      249      98.795181       98.795181
83     420      252     100.000000      100.000000

[84 rows x 4 columns]</pre>
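The rescaling used here is the usual min-max formula, <code>(value - min) / (max - min) * 100</code>. To show the arithmetic without pandas, here is a plain-Python sketch; the function name <code>to_percent</code> is an assumption, and the width values are rebuilt from the table above.

```python
def to_percent(values):
    """Rescale a list of numbers so the smallest is 0 and the largest 100."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    return [(value - lowest) / span * 100 for value in values]

widths = list(range(5, 425, 5))  # 5, 10, ..., 420 as in the table above
percent = to_percent(widths)

print(len(percent))
print(round(percent[1], 6))
print(percent[-1])
```

The second value is 5 / 415 * 100, which matches the 1.204819 shown in the second row of the table.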
<noinclude>
<span id="interpolating-the-values"></span>
===Interpolating the values===
</noinclude>
<includeonly>
=====Interpolating the values=====
</includeonly>
Next, the values are interpolated. To my understanding, this is like cross-referencing the values in one list against the values in another, or like indexing an array with floating-point indexes. <code>np.interp</code> fills in the values between the known points by linear interpolation, and <code>math.floor</code> then rounds the result down to a whole millimetre. The resulting values are therefore approximate, but consistently so.

<pre>scaled_paper_height = math.floor(np.interp(95.2, scaled_heights, possible_heights_list))
scaled_paper_width = math.floor(np.interp(95.2, scaled_widths, possible_widths_list))

print(scaled_paper_width)
print(scaled_paper_height)</pre>
<pre>400
240</pre>
Notice that the printed values match row 79 of the DataFrame, whose scaled value of about 95.18 is the closest one below 95.2. It’s best if the user can choose the scale to crop the paper to. So, the first argument to <code>np.interp</code> is replaced with the value from <code>args.scale[0]</code>.

<pre># Values passed on the command line arrive as strings, so convert to float first.
scale = float(args.scale[0])
scaled_paper_height = math.floor(np.interp(scale, scaled_heights, possible_heights_list))
scaled_paper_width = math.floor(np.interp(scale, scaled_widths, possible_widths_list))</pre>
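To make the interpolation step less opaque, the sketch below does the same kind of linear interpolation in plain Python. It is not NumPy’s implementation, only an illustration of the idea; the function name <code>interpolate</code> is an assumption, and it expects the x values to be in increasing order.

```python
import math

def interpolate(x, xs, ys):
    """Linear interpolation of x over increasing xs (illustration only)."""
    # Outside the known range, fall back to the first or last value.
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            # Fraction of the way from xs[i - 1] to xs[i].
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])

widths = list(range(5, 425, 5))                 # 5, 10, ..., 420
scaled = [(w - 5) / 415 * 100 for w in widths]  # 0 ... 100 percent

print(math.floor(interpolate(95.2, scaled, widths)))
```

A scale of 95.2 falls just above the scaled value of row 79, so the interpolated width is slightly more than 400 mm and is rounded down to 400.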
<noinclude>
<span id="writing-to-a-file"></span>
==Writing to a file==
</noinclude>
<includeonly>
====Writing to a file====
</includeonly>
The output of the script is code that the ConTeXt typesetting software understands. F-strings insert the values calculated by or provided to the script at key points in the ConTeXt code. First, the file is created. Then, a blank layout is defined and set up.

<pre>f = open(args.outfile[0], "w")</pre>
<pre>f.write("""\\definelayout[blank][
topspace=0mm,
backspace=0mm,
bottomspace=0mm,
width=fit,
height=fit,
header=0mm,
footer=0mm,
leftmargin=0mm,
rightmargin=0mm,
leftmargindistance=0mm,
rightmargindistance=0mm]
\\setuplayout[blank]""")</pre>

Then, having turned off page numbering, the values of <code>scaled_paper_width</code> and <code>scaled_paper_height</code> are passed to <code>\definepapersize</code> in an f-string.

<pre>f.write(f"""\\definepapersize[scaled][width={scaled_paper_width}mm, height={scaled_paper_height}mm]
\\setuppapersize[scaled]""")</pre>

The code takes landscape mode into account with an if statement, which appends the second argument of <code>\setuppapersize</code>.

<pre>
if "portrait" in args.orientation:
    f.write(f"[{args.papersize[0]}]")
else:
    f.write(f"[{args.papersize[0]}, landscape]")</pre>
Finally, the layout is set up, the frame is switched on and the text environment is invoked. Inside the text environment, a frame which fills the typesetting area is included to ensure there is content in the document.

<pre>
f.write("""\\setuplayout[location={middle,middle},marking=empty]
\\showframe
\\starttext
\\startframedtext[width=\\textwidth,height=\\textheight]



\\stopframedtext
\\stoptext
""")</pre>

<pre>
f.close()</pre>
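As a possible refinement, the strings could be built by a function and written inside a <code>with</code> block, which closes the file automatically. The sketch below covers only the paper size commands. The function name <code>papersize_lines</code>, the added line breaks and the temporary file are assumptions made so that the example runs on its own.

```python
import os
import tempfile

def papersize_lines(width_mm, height_mm, papersize, orientation="portrait"):
    """Return the two paper size commands as one string."""
    if orientation == "landscape":
        target = f"{papersize}, landscape"
    else:
        target = papersize
    return (
        f"\\definepapersize[scaled][width={width_mm}mm, height={height_mm}mm]\n"
        f"\\setuppapersize[scaled][{target}]\n"
    )

text = papersize_lines(400, 240, "A3", "landscape")

# Write to a temporary file so the sketch does not touch main.tex.
path = os.path.join(tempfile.mkdtemp(), "sketch.tex")
with open(path, "w") as handle:  # closed automatically, even on errors
    handle.write(text)

print(text)
```

Separating the string building from the file writing also makes the generated ConTeXt code easier to inspect before it is saved.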
<noinclude>
<span id="pdf-output"></span>
==PDF Output==
</noinclude>
<includeonly>
====PDF Output====
</includeonly>

ConTeXt can then be run on the output file, in this case <code>main.tex</code>, to produce a PDF.