Fonts in LaTeX, Part Three: pdfTeX and TrueType

Update: The information in this post is out of date: otftotfm does presently have support for TrueType outlines. See my errata post for more information.

In the previous part of this tutorial, I explained how to put together the minimal infrastructure needed to use an OpenType font with pdfLaTeX. To generate the font metrics TeX needs to lay out text, I used the tool otftotfm. However, otftotfm only supports OpenType fonts that use PostScript font outlines, as opposed to TrueType font outlines. So in this part of the tutorial I will explain how to put together the necessary infrastructure for TrueType fonts. In preparation for that, we will first make a few changes to what we did earlier.

For those who would find it useful, I've put together a zip file containing all the files from the tutorials (except the fonts, which I don't want to deal with distributing).

Firstly, we are going to move the uses of \DeclareUnicodeCharacter out of UPagella.fd and into uenc.def:

\ProvidesFile{uenc.def}
% We are declaring an encoding named "U"
\DeclareFontEncoding{U}{}{}

% Technically these are not "allowed" in .def files,
% but this is really the logical place to put the
% declarations.

% τ (0x03C4) maps to 0xF8 in the encoding
\DeclareUnicodeCharacter{03C4}{\char"F8}
% ε (0x03B5) maps to 0xF9 in the encoding
\DeclareUnicodeCharacter{03B5}{\char"F9}
% χ (0x03C7) maps to 0xFA in the encoding
\DeclareUnicodeCharacter{03C7}{\char"FA}

As I mention in the comments, the documentation on font encoding definition files does not list \DeclareUnicodeCharacter as one of the allowed declarations in such a file, but it works, and it seems like a more logical place to configure it than the font definition file.
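A possible refinement is to give each glyph a named text command with \DeclareTextSymbol, which is one of the declarations that is documented for encoding definition files, and to point \DeclareUnicodeCharacter at that command instead of at a raw \char code. The sketch below assumes the same slots as above; the command names \customtau, \customepsilon and \customchi are hypothetical and chosen only for illustration.

```latex
% uenc.def (sketch): named text symbols instead of raw \char codes.
% The three \custom... command names are hypothetical.
\ProvidesFile{uenc.def}
\DeclareFontEncoding{U}{}{}

% Bind each command to its slot in the U encoding.
\DeclareTextSymbol{\customtau}{U}{"F8}
\DeclareTextSymbol{\customepsilon}{U}{"F9}
\DeclareTextSymbol{\customchi}{U}{"FA}

% Route the UTF-8 input characters to those commands.
\DeclareUnicodeCharacter{03C4}{\customtau}
\DeclareUnicodeCharacter{03B5}{\customepsilon}
\DeclareUnicodeCharacter{03C7}{\customchi}
```

With this arrangement, typing τ while some other font encoding is active should make LaTeX complain that the command is unavailable in that encoding, rather than silently printing whatever glyph happens to occupy slot 0xF8 there.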

Now that we have removed the uses of \DeclareUnicodeCharacter from UPagella.fd, it looks like:

\ProvidesFile{UPagella.fd}

% Declaring a font family called "Pagella" for the encoding "U"
\DeclareFontFamily{U}{Pagella}{}

% Declare that font family "Pagella", for encoding "U", has a shape
% with weight medium (m) and normal (n) slant (in other words, upright)
\DeclareFontShape{U}{Pagella}{m}{n}{
% For all sizes...
<->
% ... use the font named
TeXGyrePagella-Regular--custom--base
}{}

I am going to use Deja Vu Sans as the example TrueType font. Fortunately, if you followed everything from the second part of the tutorial, there is not much that needs to be done.

First, we need to generate metrics for Deja Vu Sans. As before, if you are using TeX Live, you'll have the necessary program:

% ttf2tfm DejaVuSans.ttf -q -T custom
ttf2tfm: WARNING: Cannot find character `compwordmark'
         specified in input encoding.
...
...
ttf2tfm: WARNING: Cannot find character `zdotaccent'
         specified in input encoding.
DejaVuSans   DejaVuSans.ttf Encoding=custom.enc

The program ttf2tfm is somewhat unusual in that it takes the filename argument first and then all the options. So we've passed it the TrueType font we want to generate metrics for, DejaVuSans.ttf, the option -q to tell it not to print quite so much information, and the option -T custom, which tells it to use the encoding defined in the file custom.enc we created in the previous part.

Unlike otftotfm, ttf2tfm does not generate an entry that we could use in our map file, custom.map, so we need to write one ourselves. Start with the map that otftotfm generated for TeX Gyre Pagella, and add the line:

DejaVuSans <custom.enc <DejaVuSans.ttf

This says to map the TeX font name DejaVuSans to the file DejaVuSans.ttf using the encoding custom.enc. To learn more about the format of map files, there is a section on them in the pdfTeX manual.
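For a single font, the same entry can also be supplied from the document itself with pdfTeX's \pdfmapline primitive, which takes one map-file line as its argument. This is a minimal sketch, assuming custom.enc and DejaVuSans.ttf are in a location where pdfTeX can find them (for example, the current directory):

```latex
% Preamble sketch: load the generated map file, then add the
% DejaVuSans entry as a single map line. The leading "+" appends
% to pdfTeX's font map instead of replacing it.
\pdfmapfile{+custom.map}
\pdfmapline{+DejaVuSans <custom.enc <DejaVuSans.ttf}
```

Keeping the entry in custom.map is probably tidier once several fonts are involved. The inline form is mostly convenient for quick experiments.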

Now we just need to create a font definition file for Deja Vu Sans. However, it is essentially the same as the one we created for TeX Gyre Pagella:

\ProvidesFile{UDejaVuSans.fd}

% Declaring a font family called "DejaVuSans" for the encoding "U"
\DeclareFontFamily{U}{DejaVuSans}{}

% Declare that font family "DejaVuSans", for encoding "U", has a shape
% with weight medium (m) and normal (n) slant (in other words, upright)
\DeclareFontShape{U}{DejaVuSans}{m}{n}{
% For all sizes...
<->
% ... use the font named
DejaVuSans
}{}

We have just replaced all occurrences of Pagella with DejaVuSans.
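Since only the upright medium shape is declared, a request for italic or bold sans serif text would make LaTeX fall back to its own substitution rules and emit font warnings. One way to handle this, sketched here with the standard ssub size function, is to substitute the one shape we do have. The series name bx is an assumption about which bold series your LaTeX format selects by default.

```latex
% Appended to UDejaVuSans.fd (sketch): silently substitute the
% upright medium font for shapes without generated metrics.
\DeclareFontShape{U}{DejaVuSans}{m}{it}{<-> ssub * DejaVuSans/m/n}{}
% "bx" is assumed to be the default bold series.
\DeclareFontShape{U}{DejaVuSans}{bx}{n}{<-> ssub * DejaVuSans/m/n}{}
```

Using sub instead of ssub would do the same substitution but with a warning, which may be preferable while you are still deciding which shapes to generate metrics for.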

Finally, we just need to update our example document to use Deja Vu Sans:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[U]{fontenc}
\pdfmapfile{+custom.map}
\renewcommand{\rmdefault}{Pagella}
\renewcommand{\sfdefault}{DejaVuSans}

\begin{document}
Testing pdfLaTeX!

Greek: τεχ.

\begin{sffamily}
Testing pdfLaTeX!

Greek: τεχ.
\end{sffamily}
\end{document}

Here we have used \renewcommand to set the default sans serif font, \sfdefault, to be DejaVuSans. In the body of the document, we've copied the text and surrounded it with the sffamily environment to have it typeset in sans serif.
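If you would rather have the whole document set in sans serif, with serif only for selected phrases, the roles can be reversed by redefining \familydefault. This is a minimal sketch that assumes the same uenc.def, font definition files, metrics and custom.map described above are present in the current directory:

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[U]{fontenc}
\pdfmapfile{+custom.map}
\renewcommand{\rmdefault}{Pagella}
\renewcommand{\sfdefault}{DejaVuSans}
% Make the sans serif family the default for body text.
\renewcommand{\familydefault}{\sfdefault}

\begin{document}
This paragraph is set in the sans serif default.
\textrm{This phrase switches back to the serif family.}

Greek: τεχ and \textrm{τεχ}.
\end{document}
```

For short runs, \textsf{...} and \textrm{...} are usually more convenient than wrapping text in an environment.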

Now we have everything we need to run pdflatex:

% pdflatex test-pdflatex.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %&-line parsing enabled.
...
...
(./test-pdflatex.aux) (./upagella.fd) (./udejavusans.fd) [1]
(./test-pdflatex.aux) ){custom.enc}{a_qnnnfc.enc}<./TeXGyrePage
lla-Regular.pfb>
Output written on test-pdflatex.pdf (1 page, 34857 bytes).
Transcript written on test-pdflatex.log.

And we have the desired output:

[Image: Testing pdfLaTeX with both OpenType and TrueType fonts]

And that's everything you need to get started with TrueType fonts and pdfLaTeX. Again, if you encounter any problems or notice any omissions, let me know. I'll do some investigation and there will possibly be a fourth part on using fontinst.


Fonts in LaTeX, an intermission

Part one of my tutorial attracted a considerable number of visitors, far more than any single entry in the past, partly because it was posted to reddit.

Looking at the comments on reddit, I figured that I would say that luatex does resolve pdfTeX's internal limitation of 256 glyphs that I mentioned in part two, and it should directly support OpenType fonts with PostScript outlines.

However, my understanding is that the authors of luatex do not intend to make using TrueType and OpenType fonts as simple as it is in XeTeX out of the box. Instead, luatex merely makes the machinery available for someone else to build upon. So someone will need to write a LaTeX package for luatex to put it all together, and as far as I know, no one has done this yet (let me know if I'm wrong!). Also, while the plan is for luatex to eventually be merged back into pdfTeX, I think it is an overstatement to say that it will happen "soon". The current luatex roadmap says that a "production" ready version will be available in August 2009. Given that, I doubt that the merge back to pdfTeX will happen any sooner than 2010. But yes, in the long term I think luatex will be a great thing.

It also sounds like I should probably write a fourth part to my tutorial on using fontinst. I've never used it myself, and when I first started working with OpenType fonts and LaTeX I wasn't aware of its existence, which is why I wrote otftofd. So it might take a bit longer to write, as I will have to learn it at the same time.


Fonts in LaTeX, Part Two: pdfTeX and OpenType

In part one of the tutorial, I commented that sometimes you would want to use pdfTeX and pdfLaTeX instead of XeTeX and XeLaTeX. One reason to consider using pdfTeX over XeTeX is that the latter does not yet support the same microtypographic features. When you are preparing slides, pdfTeX's microtypographic features probably will not have much of an impact on your output, but I've definitely found that while preparing articles and my dissertation, using pdfTeX's microtypographic features produces much nicer looking output with fewer bad breaks or hyphenations.

However, pdfTeX's architecture for handling fonts is much more like standard TeX and is far more complicated than XeTeX's. One option is to use a tool to do all the work for you. For example, you could use the fontinst utility or my own tool, otftofd. The other option is to do it all by hand, which is what I will explain in this tutorial.

One of the first complications you'll encounter with pdfTeX is that the font that is active at a given time can only refer to 256 glyphs at a time. Therefore if you need to use more than 256 different glyphs in a document, you will need to switch between multiple "fonts".

The first step in using a font in pdfTeX is picking an encoding. Since most OpenType fonts contain more than 256 glyphs, an encoding provides a mapping from those glyphs to the 256 that you can reference at a given time in pdfTeX.

For the most part I generally just use what is called the T1 or "Cork" encoding. However, if we want to replicate the example from the first part of the tutorial, we will need to make a custom encoding to access the Greek glyphs. So, first use kpsewhich to find where your system keeps cork.enc, and make a copy:

% kpsewhich cork.enc
/local/texlive/2007/texmf-dist/fonts/enc/dvips/base/cork.enc
% cp /local/texlive/2007/texmf-dist/fonts/enc/dvips/base/cork.enc ./custom.enc

Open custom.enc in your favorite editor, and go to the end. Assuming you are using the same version of TeX Live as me, the last few lines will look something like:

/oslash /ugrave /uacute /ucircumflex /udieresis /yacute /thorn /germandbls
] def

You will want to edit it to look like:

/tau /epsilon /chi /ucircumflex /udieresis /yacute /thorn /germandbls
] def

What we have done is change the encoding so that slots 0xf8, 0xf9, and 0xfa (in hexadecimal) now point to τ, ε, and χ. The general format of entries in the encoding file is / followed by a glyph name. If the software doesn't understand a name that you think it should, you can always specify the glyph by its Unicode code point in hexadecimal, prefixed with /uni. For example, we could have changed the encoding as follows:

/uni03C4 /uni03B5 /uni03C7 /ucircumflex /udieresis /yacute /thorn /germandbls
] def

A complete list of glyph names can be obtained from Adobe's website. You can learn more about the encoding file format from the dvips documentation, though the eagle-eyed may have noticed that it is actually a subset of PostScript itself.

Next, we need to create a file to tell LaTeX about our new encoding, which we will call U for "user-defined". Create a file in the current directory called uenc.def and put the following in it:

\ProvidesFile{uenc.def}
\DeclareFontEncoding{U}{}{}

As it says, it is defining a new font encoding called "U".

Now that we have an encoding, we need to generate font metrics that pdfTeX can understand, and a mapping file to tell it how to map font names to encodings and actual font files. Additionally, pdfTeX (at least last I checked) cannot handle OpenType fonts that contain PostScript rather than TrueType font outlines. So we also need to convert our OpenType font, Pagella, to Type 1 format. Fortunately, Eddie Kohler's excellent tool otftotfm will do most of that for us. Again, it is included with TeX Live. We invoke it on the font we wish to use, with the encoding we have created, and redirect the output to a file called custom.map:

% otftotfm -e custom.enc texgyrepagella-regular.otf > custom.map
otftotfm: ./custom.enc:19: warning: 'space' has no encoding, ignoring ligature
otftotfm: ./custom.enc:19: warning: 'space' has no encoding, ignoring ligature
otftotfm: ./custom.enc:30: warning: 'space' has no encoding, ignoring '{}'
otftotfm: ./custom.enc:30: warning: 'space' has no encoding, ignoring '{}'
I had to round some heights by 13.0000000 units.
I had to round some depths by 3.0000000 units.
I had to round some heights by 13.0000000 units.
I had to round some depths by 3.0000000 units.

Don't be concerned about the warnings. The first few are just complaints because there is no "space" glyph in the encoding, which is not used by TeX. The rounding warnings occur, I assume, because PostScript metrics differ very slightly from TeX's internal representation of size metrics. An otftotfm unit is about one thousandth of an em.

We now have several new files in the current directory:

a_qnnnfc.enc
custom.map
TeXGyrePagella-Regular--custom--base.tfm
TeXGyrePagella-Regular--custom.tfm
TeXGyrePagella-Regular--custom.vf
TeXGyrePagella-Regular.pfb

The pfb file is the PostScript Type 1 version of our original OpenType font. The file custom.map tells pdfTeX how to map a font name to files. The two tfm files provide the font metric information TeX needs to format text. The vf file is a "virtual font" file that, depending on the options you gave to otftotfm, may perform some operations on the basic glyphs. Finally, the file a_qnnnfc.enc is an encoding otftotfm generated based upon the encoding we supplied. Depending on the options, otftotfm may add glyphs to it to deal with ligatures, and if a glyph in the encoding we specified doesn't exist in the font, it will replace that entry with /.notdef.

Next we want to take a peek inside custom.map. Its contents will look something like the following:

TeXGyrePagella-Regular--custom--base TeXGyrePagella-Regular "AutoEnc_qnnnfca3qut7llkesqq3eddyzc ReEncodeFont" <[a_qnnnfc.enc

You can get away without understanding the structure of the map file, but we need to know the name LaTeX should use to refer to the font. In this case it is the somewhat lengthy TeXGyrePagella-Regular--custom--base. We could edit custom.map to give it a different name, but then we would need to make sure to rename the tfm files appropriately. So we'll just leave it alone.

At this point we are ready to describe the font to LaTeX. To do this we'll create a file called UPagella.fd, where fd stands for "font definition". Assuming you are using TeX Live, you can learn more about the format of font definition files by running texdoc fntguide, which will bring up the LaTeX2e font selection document. Put the following into UPagella.fd:

\ProvidesFile{UPagella.fd}
\DeclareFontFamily{U}{Pagella}{}
\DeclareFontShape{U}{Pagella}{m}{n}{ <-> TeXGyrePagella-Regular--custom--base }{}

\DeclareUnicodeCharacter{03C4}{\char"F8}
\DeclareUnicodeCharacter{03B5}{\char"F9}
\DeclareUnicodeCharacter{03C7}{\char"FA}

The second line declares a font family named Pagella for the font encoding U. The third line defines an available shape for the Pagella family: it has medium weight (m) and normal, upright shape (n), and for all sizes (<->) the font named TeXGyrePagella-Regular--custom--base should be used. The three \DeclareUnicodeCharacter lines map the Unicode characters τ, ε, and χ to their slots in the encoding we defined. Note that the hexadecimal numbers must all be in uppercase for LaTeX to parse them correctly.
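Further shapes are declared the same way, with one \DeclareFontShape line per combination of series and shape. As a sketch only: if otftotfm were also run on the italic version of the font with the same encoding, the family could gain an italic shape as shown here. The font name TeXGyrePagella-Italic--custom--base is a guess modelled on the regular font's name, so check the map line otftotfm actually prints before relying on it.

```latex
% Shape declarations for UPagella.fd (sketch).
\DeclareFontFamily{U}{Pagella}{}
\DeclareFontShape{U}{Pagella}{m}{n}{ <-> TeXGyrePagella-Regular--custom--base }{}
% Hypothetical font name: assumes otftotfm was run on the italic
% font and produced metrics under this name.
\DeclareFontShape{U}{Pagella}{m}{it}{ <-> TeXGyrePagella-Italic--custom--base }{}
```

The map line that otftotfm prints for the italic font would also need to be appended to custom.map so that pdfTeX can find the outlines.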

Now we are all set to revisit our original example. In test.tex enter:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[U]{fontenc}
\pdfmapfile{+custom.map}
\renewcommand{\rmdefault}{Pagella}

\begin{document}
Testing pdfLaTeX!

Greek: τεχ.
\end{document}

The second line here tells LaTeX to load the inputenc package and pass it the option utf8 to tell it to parse the remainder of the input as UTF-8 encoded text. The third line tells LaTeX to load the fontenc package and pass it the option U, telling it to set the default encoding to be U. The fourth line is specific to pdfTeX and tells it to add the definitions in custom.map to its internal mapping. Finally, \renewcommand is used to change the default serif (Roman, rm) font to be Pagella.

We can now go ahead and run pdflatex:

% pdflatex test.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %&-line parsing enabled.
entering extended mode
(./test.tex
...
...
(./test.aux) (./upagella.fd) [1] (./test.aux) ){a_qnnnfc.enc}<./TeXGyrePagella-
Regular.pfb>
Output written on test.pdf (1 page, 22850 bytes).
Transcript written on test.log.

Again, we now get a PDF with the desired output:
[Image: pdfLaTeX test]

That concludes the second part of the tutorial. The third, and probably final, part of the tutorial will cover what needs to change in the above process if you would like to use a TrueType font rather than an OpenType font containing PostScript outline data.
