3,793 bytes added ,  22:16, 4 February 2009
created the page
= The Problem =

We have a situation where hyphenation is an issue, due to a 2-column
layout where the columns are not very wide. We've done a lot of tweaking
of settings for hyphenation and interword spacing, and the result seems
pretty good. In particular, there are not many cases of consecutive
lines that end with hyphens, and not many cases where a hyphenation
occurs over a right-hand page break. We have been fixing the few cases
that exist manually, using <code>\hbox{...}</code> to prevent hyphenation at the
trouble spot.

But the hyphenation is by nature somewhat volatile, so whenever we
change something we would like to be able to easily recheck the hyphenation.
And our book is over 1200 pages, so it would be very helpful to have
tools to make the checking more efficient.

= Potential Solutions =

== PDF viewers ==

One tool we found was the "evince" PDF viewer in Linux, which highlights
all search results at once. So you can search for "-", and it will
highlight all hyphens, which makes it easier to scan the PDF visually
for hyphenation problems.

Still, this approach has its limitations: our layout domain experts
don't have Linux machines, and I haven't found a PDF viewer for Windows
that can highlight all search results at once.

(We are still looking into Okular, which is available for Windows at http://windows.kde.org)

== A ConTeXt solution ==
Another approach we wondered about was having TeX highlight the
hyphenations, e.g. changing the background color to yellow or red when
outputting a word that's dynamically broken/hyphenated. (Rather like we
have TeX output red grid lines to help with debugging layout.)
I think we would also want to highlight static hyphens that occur at the
end of a line, as in "Niger-
Congo," because they have a similar visual impact. Possibly using a
different color.

This would be an ideal solution, I think, but we don't know how to have
TeX detect when a word gets dynamically hyphenated. (I asked about this on the NTG list. The response was that it would not be difficult to implement in MkIV, but that it could not be done in MkII. And we were not free to move to MkIV.)

== Adobe Acrobat / Javascript ==

Another possibility is using JavaScript in Adobe
Acrobat Pro to automatically find and highlight end-of-line (and end-of-page) hyphens.
This is the approach with which we had the most success. The features and limitations are described below, and the JavaScript code is attached.

=== Features ===
* In Acrobat Pro, load a PDF and select "Highlight Hyphens" from the Tools menu to begin the highlighting. The first part of each word that is line-broken with a hyphen is highlighted.
* The JavaScript console window shows progress.
* The console reports the number of hyphens (actually, words line-broken with a hyphen) on each page.
* The resulting highlighted PDF can be saved including the highlights.
* The saved, highlighted PDF can be viewed with highlights using Adobe Reader (does not require Acrobat Pro).

=== Limitations ===
* Slow. A representative test showed 0.07 pages per second (14 seconds per page!). That would mean about 5 hours for our 1200-page book.
* The resulting PDF file grows by about 25%.
* Sometimes the highlighting function stops with an error ("Internal error" / "General error") after about 30 pages. We don't know why, but it might be avoided by processing only a limited number of pages at a time.
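
The "limited number of pages at a time" idea and the run-time estimate can be sketched as follows. This is only an illustration, not part of the attached scripts: the helper names <code>pageBatches</code> and <code>estimateHours</code> are made up here, and the batch size of 25 is a guess chosen to stay under the roughly 30 pages after which the error tends to appear.

```javascript
// Hypothetical helper (not in the attached scripts): split a document of
// numPages pages into consecutive batches of at most batchSize pages.
// Page numbers are 0-based, as in the Acrobat JavaScript API.
function pageBatches(numPages, batchSize) {
  var batches = [];
  for (var start = 0; start < numPages; start += batchSize) {
    var last = Math.min(start + batchSize, numPages) - 1;
    batches.push({ first: start, last: last });
  }
  return batches;
}

// Hypothetical helper: rough run time in hours for a given speed.
function estimateHours(numPages, secondsPerPage) {
  return (numPages * secondsPerPage) / 3600;
}
```

For a 1200-page book, <code>pageBatches(1200, 25)</code> gives 48 batches, and <code>estimateHours(1200, 14)</code> gives about 4.7 hours, which is where the "about 5 hours" figure comes from. Whether batching actually avoids the error is untested.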

=== The code ===
The two attached JavaScript files are placed in the Acrobat JavaScripts folder, e.g. <code>C:\Program Files\Adobe\Acrobat 9.0\Acrobat\Javascripts</code>, and then Acrobat is restarted.
<code>add-hyphen-menu.js</code> adds a menu item for "Highlight Hyphens..." on the Tools menu.
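
For readers without the attachment, a menu script of this kind might look roughly like the sketch below. It is not the attached file. It assumes Acrobat's <code>app.addMenuItem</code> and the <code>event.target</code> convention for menu events, and the names <code>registerHyphenMenu</code> and <code>highlightHyphens</code> are hypothetical.

```javascript
// Sketch of a folder-level menu script (NOT the attached add-hyphen-menu.js).
// "highlightHyphens" is a hypothetical name for the function that does the
// actual work; it is assumed to be defined in another folder-level script.
function registerHyphenMenu(appObj) {
  appObj.addMenuItem({
    cName: "HighlightHyphens",                   // internal item name
    cUser: "Highlight Hyphens...",               // label shown to the user
    cParent: "Tools",                            // menu that receives the item
    cExec: "highlightHyphens(event.target);",    // runs on the active document
    cEnable: "event.rc = (event.target != null);" // greyed out with no document
  });
}

// Register only inside Acrobat, where the global "app" object exists.
if (typeof app !== "undefined") {
  registerHyphenMenu(app);
}
```

Wrapping the call in a function keeps the script harmless outside Acrobat and makes it possible to try it against a stand-in object.
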
<code>findAndAnnot.js</code> defines the function that finds line-broken words and highlights the first "quad" of each.
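
The idea of highlighting the first quad of each line-broken word can be sketched as below. Again, this is not the attached file. It assumes that Acrobat reports a word split across two lines as a single word whose quads list has one entry per line fragment, which is consistent with the description above but which we have not verified for every case (end-of-page breaks in particular). The function name <code>highlightHyphens</code> and its optional page-range parameters are hypothetical.

```javascript
// Sketch only (NOT the attached findAndAnnot.js).
// doc is an Acrobat Doc object; firstPage/lastPage are optional 0-based
// page numbers (hypothetical parameters, defaulting to the whole document).
function highlightHyphens(doc, firstPage, lastPage) {
  if (firstPage === undefined) firstPage = 0;
  if (lastPage === undefined) lastPage = doc.numPages - 1;

  var report = [];
  for (var p = firstPage; p <= lastPage; p++) {
    var count = 0;
    var numWords = doc.getPageNumWords(p);
    for (var w = 0; w < numWords; w++) {
      var quads = doc.getPageNthWordQuads(p, w);
      // Assumption: more than one quad means the word continues on
      // another line, i.e. it was broken at the end of a line.
      if (quads && quads.length > 1) {
        doc.addAnnot({
          page: p,
          type: "Highlight",
          quads: [quads[0]] // first fragment only
        });
        count++;
      }
    }
    report.push({ page: p, count: count });
  }
  return report; // per-page counts; the caller can print them to the console
}
```

Returning the per-page counts rather than printing them leaves the caller free to write them to the Acrobat console, and a call such as <code>highlightHyphens(this, 0, 24)</code> would restrict the run to the first 25 pages.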