Fonts in LaTeX, Part Two: pdfTeX and OpenType

In part one of the tutorial, I commented that sometimes you would want to use pdfTeX and pdfLaTeX instead of XeTeX and XeLaTeX. One reason to consider using pdfTeX over XeTeX is that the latter does not yet support the same microtypographic features. When you are preparing slides, pdfTeX's microtypographic features probably will not have much of an impact on your output, but I've definitely found that while preparing articles and my dissertation, using pdfTeX's microtypographic features produces much nicer looking output with fewer bad breaks or hyphenations.

However, pdfTeX's architecture for handling fonts is much more like standard TeX's and is far more complicated than XeTeX's. One option is to use a tool to do all the work for you. For example, you could use the fontinst utility or my own tool, otftofd. The other option is to do it all by hand, which is what I will explain in this tutorial.

One of the first complications you'll encounter with pdfTeX is that the active font can refer to only 256 glyphs at a time. Therefore, if you need more than 256 different glyphs in a document, you will need to switch between multiple "fonts".

The first step in using a font in pdfTeX is picking an encoding. Since most OpenType fonts contain more than 256 glyphs, an encoding selects which of those glyphs fill the 256 slots that you can reference at a given time in pdfTeX.

I generally just use what is called the T1 or "Cork" encoding. However, if we want to replicate the example from the first part of the tutorial, we will need to make a custom encoding to access the Greek glyphs. So, first use kpsewhich to find where your system keeps cork.enc, and make a copy:

% kpsewhich cork.enc
/local/texlive/2007/texmf-dist/fonts/enc/dvips/base/cork.enc
% cp /local/texlive/2007/texmf-dist/fonts/enc/dvips/base/cork.enc ./custom.enc

Open custom.enc in your favorite editor, and go to the end. Assuming you are using the same version of TeX Live as I am, the last few lines will look something like:

/oslash /ugrave /uacute /ucircumflex /udieresis /yacute /thorn /germandbls
] def

You will want to edit it to look like:

/tau /epsilon /chi /ucircumflex /udieresis /yacute /thorn /germandbls
] def
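
If you would rather script the edit than open an editor, something like the following sketch should work. It assumes the last encoding line matches the one shown above exactly, and it runs against a small stand-in file in a temporary directory (the names workdir and edited_line are just for illustration) rather than your real custom.enc:

```shell
# Sketch: apply the same edit with sed instead of an editor.
# A two-line stand-in file is used here in place of the real copy of cork.enc.
workdir=$(mktemp -d)
cat > "$workdir/custom.enc" <<'EOF'
/oslash /ugrave /uacute /ucircumflex /udieresis /yacute /thorn /germandbls
] def
EOF

# Replace the three glyph names at the start of the last encoding line.
sed 's|^/oslash /ugrave /uacute |/tau /epsilon /chi |' \
    "$workdir/custom.enc" > "$workdir/custom.enc.new"
mv "$workdir/custom.enc.new" "$workdir/custom.enc"

# Show the edited line.
edited_line=$(head -n 1 "$workdir/custom.enc")
echo "$edited_line"
```

If your copy of cork.enc formats that line differently, the pattern will simply not match and the file will be left unchanged, so check the result before moving on.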

What we have done is change the encoding so that glyphs 0xf8, 0xf9, and 0xfa (in hexadecimal) now point to τ, ε, and χ. The general format of entries in the encoding file is / followed by a name. If the software doesn't understand a name that you think it should, you can always specify the glyph using its hexadecimal Unicode code point prefixed with /uni. For example, we could have changed the encoding as follows:

/uni03c4 /uni03b5 /uni03c7 /ucircumflex /udieresis /yacute /thorn /germandbls
] def

A complete list of glyph names can be obtained from Adobe's website. You can learn more about the encoding file format from the dvips documentation, though the eagle-eyed may have noticed that it is actually a subset of PostScript itself.
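
To double-check which slot a glyph name ends up in, you can count the names in the encoding vector. The helper below, slot_of, is a hypothetical function of my own and not part of any TeX distribution. It is only a sketch, and it assumes the brackets of the encoding vector are separated from the glyph names by whitespace. A tiny stand-in file is used so that the example is self-contained:

```shell
# Hypothetical helper: print the slot of a glyph name in a dvips-style .enc file.
# Assumes "[" and "]" are separated from the glyph names by whitespace.
slot_of() {
  # $1 = glyph name without the leading slash, $2 = encoding file
  sed 's/%.*//' "$2" | tr -s ' \t' '\n\n' | awk -v want="/$1" '
    BEGIN { n = 0 }
    $0 == "[" { inside = 1; next }   # start of the encoding vector
    $0 == "]" { inside = 0 }         # end of the encoding vector
    inside && /^\// {
      if ($0 == want) printf "0x%02X\n", n
      n++
    }'
}

# Tiny stand-in for a real encoding file.
sample_enc=$(mktemp)
cat > "$sample_enc" <<'EOF'
% tiny stand-in for a real encoding file
/SampleEncoding [
/grave /acute   % slots 0x00 and 0x01
/circumflex
] def
EOF

slot_of circumflex "$sample_enc"   # prints 0x02
```

Run against the edited custom.enc, `slot_of tau custom.enc` should report 0xF8 if the file has the layout described above.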

Next, we need to create a file to tell LaTeX about our new encoding, which we will call U for "user-defined". Create a file in the current directory called uenc.def and put the following in it:

\ProvidesFile{uenc.def}
\DeclareFontEncoding{U}{}{}

As it says, it is defining a new font encoding called "U".

Now that we have an encoding, we need to generate font metrics that pdfTeX can understand, and a mapping file to tell it how to map font names to encodings and actual font files. Additionally, pdfTeX (at least last I checked) cannot handle OpenType fonts that contain PostScript rather than TrueType font outlines. So we also need to convert our OpenType font, Pagella, to Type 1 format. Fortunately, Eddie Kohler's excellent tool otftotfm will do most of that for us. Again, it is included with TeX Live. We invoke it on the font we wish to use, with the encoding we have created, and redirect the output to a file called custom.map:

% otftotfm -e custom.enc texgyrepagella-regular.otf > custom.map
otftotfm: ./custom.enc:19: warning: 'space' has no encoding, ignoring ligature
otftotfm: ./custom.enc:19: warning: 'space' has no encoding, ignoring ligature
otftotfm: ./custom.enc:30: warning: 'space' has no encoding, ignoring '{}'
otftotfm: ./custom.enc:30: warning: 'space' has no encoding, ignoring '{}'
I had to round some heights by 13.0000000 units.
I had to round some depths by 3.0000000 units.
I had to round some heights by 13.0000000 units.
I had to round some depths by 3.0000000 units.

Don't be concerned about the warnings. The first few are just complaints because there is no "space" glyph, which is not used by TeX. The rounding warnings occur, I assume, because PostScript metrics differ very slightly from TeX's internal representation of size metrics. An otftotfm unit is about one thousandth of an em.

We now have several new files in the current directory:

a_qnnnfc.enc
custom.map
TeXGyrePagella-Regular--custom--base.tfm
TeXGyrePagella-Regular--custom.tfm
TeXGyrePagella-Regular--custom.vf
TeXGyrePagella-Regular.pfb

The pfb file is the PostScript Type 1 version of our original OpenType font. The file custom.map tells pdfTeX how to map a font name to files. The two tfm files provide the font metric information TeX needs to format text. The vf file is a "virtual font" file that, depending on the options you gave to otftotfm, may perform some operations on the basic glyphs. Finally, the file a_qnnnfc.enc is an encoding otftotfm generated based upon the encoding we supplied. Depending on the options, otftotfm may include some additional glyphs to deal with ligatures, and if a glyph in the encoding we specified doesn't exist in the font, it will replace its entry with /.notdef.

Next we want to take a peek inside of custom.map. Its contents will look something like the following:

TeXGyrePagella-Regular--custom--base TeXGyrePagella-Regular "AutoEnc_qnnnfca3qut7llkesqq3eddyzc ReEncodeFont" <[a_qnnnfc.enc

You can get away without understanding the structure of the map file, but we need to know the name LaTeX should use to refer to the font. In this case it is the somewhat lengthy TeXGyrePagella-Regular--custom--base. We could edit custom.map to give it a different name, but then we would need to make sure to rename the tfm files appropriately. So we'll just leave it alone.
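
If you want to pull that name out without reading the map file by eye, the first whitespace-separated field of each map line is the TeX font name and the second is the PostScript name. Here is a small sketch; the variable names are mine, and the map line is the one shown above, pasted in as a string so the example does not depend on your files:

```shell
# Sketch: split a map line into its TeX font name and PostScript font name.
map_line='TeXGyrePagella-Regular--custom--base TeXGyrePagella-Regular "AutoEnc_qnnnfca3qut7llkesqq3eddyzc ReEncodeFont" <[a_qnnnfc.enc'

tex_name=$(printf '%s\n' "$map_line" | awk '{ print $1 }')   # name TeX uses
ps_name=$(printf '%s\n' "$map_line" | awk '{ print $2 }')    # PostScript name

echo "TeX font name:   $tex_name"
echo "PostScript name: $ps_name"
echo "Metrics file:    $tex_name.tfm"
```

With a real map file you would run awk on custom.map directly instead of on a string.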

At this point we are ready to describe the font to LaTeX. To do this we'll create a file called UPagella.fd, where fd stands for "font definition". Assuming you are using TeX Live, you can learn more about the format of font definition files by running texdoc fntguide, which will bring up the LaTeX2ε font selection document. Put the following into UPagella.fd:

\ProvidesFile{UPagella.fd}
\DeclareFontFamily{U}{Pagella}{}
\DeclareFontShape{U}{Pagella}{m}{n}{ <-> TeXGyrePagella-Regular--custom--base }{}

\DeclareUnicodeCharacter{03C4}{\char"F8}
\DeclareUnicodeCharacter{03B5}{\char"F9}
\DeclareUnicodeCharacter{03C7}{\char"FA}

The second line declares a font family named Pagella for the font encoding U. The third line defines an available shape for the Pagella family: it has a medium weight (m) and a normal/upright shape (n), and for all sizes (<->) the font named TeXGyrePagella-Regular--custom--base should be used. The three \DeclareUnicodeCharacter lines map the Unicode characters τ, ε, and χ to their locations in the encoding we defined. Note that the hexadecimal numbers must all be in uppercase for LaTeX to parse them correctly.
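
Since the encoding file uses lowercase hexadecimal (uni03c4) while \DeclareUnicodeCharacter wants uppercase, it is easy to slip up when typing these by hand. As a rough sketch, the declarations can be generated from the /uni names instead. This assumes the glyphs occupy consecutive slots starting at 0xF8, as in our custom.enc, and the variable names are my own:

```shell
# Sketch: generate \DeclareUnicodeCharacter lines from /uni glyph names.
# Assumes the glyphs sit in consecutive slots starting at 0xF8.
slot=$((0xF8))
decls=$(
  for name in uni03c4 uni03b5 uni03c7; do
    # Strip the "uni" prefix and uppercase the hexadecimal digits.
    hex=$(printf '%s' "${name#uni}" | tr 'abcdef' 'ABCDEF')
    printf '\\DeclareUnicodeCharacter{%s}{\\char"%02X}\n' "$hex" "$slot"
    slot=$((slot + 1))
  done
)
printf '%s\n' "$decls"
```

The output can be pasted into UPagella.fd in place of the three hand-written lines.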

Now we are all set to revisit our original example. In test.tex enter:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[U]{fontenc}
\pdfmapfile{+custom.map}
\renewcommand{\rmdefault}{Pagella}

\begin{document}
Testing pdfLaTeX!

Greek: τεχ.
\end{document}

The second line here tells LaTeX to load the inputenc package and pass it the option utf8, which tells it to parse the remainder of the input as UTF-8 encoded text. The third line tells LaTeX to load the fontenc package and pass it the option U, which sets the default encoding to U. The fourth line is specific to pdfTeX and tells it to add the definitions in custom.map to its internal mapping. Finally, \renewcommand is used to change the default serif (Roman, rm) font to Pagella.

We can now go ahead and run pdflatex:

% pdflatex test.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %&-line parsing enabled.
entering extended mode
(./test.tex
...
...
(./test.aux) (./upagella.fd) [1] (./test.aux) ){a_qnnnfc.enc}<./TeXGyrePagella-
Regular.pfb>
Output written on test.pdf (1 page, 22850 bytes).
Transcript written on test.log.

Again, we now get a PDF with the desired output:

[Image: pdfLaTeX test]

That concludes the second part of the tutorial. The third, and probably final, part of the tutorial will cover what needs to change in the above process if you would like to use a TrueType font rather than an OpenType font containing PostScript outline data.

9 Comments »

  1. Jim said,

    July 12, 2008 @ 8:19 pm

    Thanks for these tutorials, but what would really be a boon to us poor TeX using graduate students would be a tutorial on how to add Chinese/Japanese/(Korean) fonts to TeX.

    Cheers!

  2. washburn said,

    July 12, 2008 @ 9:14 pm

    @Jim: Well, it depends on your needs.

    In my dissertation, I only used Chinese to render the names of a few of my colleagues, so it was not too much work to put together a pdfTeX configuration much in the way I described for those few glyphs.

    If you really need to typeset large numbers of Chinese/Japanese/Korean glyphs, using XeTeX is probably the least painful solution for now, and I’m not sure how much microtypography really matters for improving the typesetting of those languages anyway. If you’re using XeTeX, it is pretty much as simple as having an appropriate font installed and being able to easily enter the glyphs in UTF8.

    If that is still not clear enough, let me know and I can go into more detail.

  3. RiderLemur said,

    July 24, 2008 @ 7:24 am

    That’s some tutorial. So does anyone know of a computer typesetting system that supports Unicode fonts and *isn’t* based on TeX?

  4. washburn said,

    July 24, 2008 @ 8:30 am

    @RiderLemur: There may be some fairly experimental systems, like Platypus, but it might be better to ask yourself why you don’t want to use TeX, because there are really no other systems that even come close in terms of power and quality of typeset output. Knowing why you don’t want to use TeX would help guide my suggestions.

    If you don’t need to typeset mathematics and don’t need to write complicated macros, maybe you should consider InDesign.

    If the problem is that what I’ve written in this part of the tutorial seems excessively complicated, it may be better to look into XeLaTeX.

    If you still need microtypography support and/or find TeX’s macro language abysmal, I would recommend investigating LuaTeX. It doesn’t yet have macro packages that will completely automate the use of TrueType and OpenType fonts out-of-the-box, but now that a non-beta release is available, people will probably write such things fairly soon.

  5. Mirza Tayyab said,

    December 6, 2008 @ 10:55 pm

    I am having a problem while generating PDF. Some of the fonts are not embedded. I want to know that how can I neglect(not use) those fonts while generating PDF.

  6. washburn said,

    December 7, 2008 @ 3:05 am

    @Mirza: Can you be more specific? Are all the fonts that are actually used in your document embedded, or are there some that are missing that it is relying on the operating environment to supply? Does the document display correctly?

  7. Eddie Kohler said,

    September 7, 2010 @ 5:41 pm

    Hi, nice tutorial. There is one line I would change, however.

    \DeclareFontShape{U}{Pagella}{m}{n}{ <-> TeXGyrePagella-Regular--custom--base }{}

    I’d leave off the “--base”. Your font had some typographic features that are too complex for a single TeX font to support directly, so otftotfm generated a “virtual font,” TeXGyrePagella-Regular--custom, that refers to a “base font” containing most of the glyphs. If you refer to the “--base” font in LaTeX rather than the “virtual font,” you won’t get those typographic features.

  8. John Doe said,

    December 22, 2013 @ 5:50 pm

    Hi, great article. Just a few words about file cork.enc referenced at the beginning. It doesn’t exist on recent Debian systems (maybe it did but I wouldn’t know for sure). Instead there is a file ec.enc (located in /usr/share/texlive/texmf-dist/fonts/enc/dvips/base/ec.enc)

    The first few lines are:

    % @@psencodingfile@{
    % date = "24feb10",
    % filename = "ec.enc",
    % email = "tex-fonts@@tug.org",
    % docstring = "This is the EC (aka Cork aka T1) encoding vector
    % for 8-bit fonts to be used with TeX."
    % @}

  9. Reuben Thomas said,

    July 31, 2016 @ 3:38 pm

    Many thanks for this, I managed to get it to work for a Greek font with the LGR encoding.
