

The Hackers Guide for console-setup
Chapter 2 - bdf2psf


The program bdf2psf converts BDF fonts to PSF format. It accepts source fonts with a font matrix of arbitrary size. If the matrix of the source font is 7 or 9 pixels wide, bdf2psf generates a font that is 8 pixels wide.


2.1 Synopsis

     bdf2psf [--fb] [--log LOG] BDF{+BDF} EQUIV{+EQUIV} SYMB{+[:]SYMB} SIZE PSF [SFM]

Description of the options:

--fb

Generate fonts for the framebuffer. The framebuffer and text mode differ in two important ways. First, in text mode every font must have a matrix 8 pixels wide and must contain either 256 or 512 glyphs. Second, in some text modes the video hardware displays 8-pixel-wide fonts as if they were 9 pixels wide. To achieve this, it copies the 8th column into the 9th column for the glyphs with codes from 0xC0 to 0xDF and from 0x1C0 to 0x1DF. bdf2psf is very careful when deciding where to place each glyph, and as a result the encoding of the generated font is more or less arbitrary.

--log LOG

Record any problems encountered during the conversion in the file LOG.

BDF{+BDF}

The source BDF font(s). When a symbol is defined in more than one of the specified fonts, the font listed first takes precedence.

EQUIV{+EQUIV}

A list of files defining an equivalence relation between the glyphs. See Equivalence files, Section 2.3.

SYMB{+[:]SYMB}

Generate a PSF font for the character set described in the file SYMB. If more than one character set is specified, the PSF font will support all of them. When there is not enough space for all character sets, those listed first take precedence. When the name of a character set is preceded by a colon, no warnings are issued for symbols that could not be placed in the font. See Character Sets, Section 2.2.

SIZE

The size of the PSF font. Usually 256 or 512 glyphs.

PSF

PSF is the name of the generated PSF font. If a file with this name already exists it will be overwritten.

SFM

Save in the file SFM the SFM of the generated font. This parameter is optional.
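The text-mode behaviour described under --fb can be illustrated with a short Python sketch. It is not taken from bdf2psf. The function names are hypothetical, a glyph row is modelled as an 8-bit integer whose most significant bit is the leftmost pixel, and the 9th column is assumed to be blank for glyphs outside the two code ranges.

```python
def has_column_copy(code):
    """Return True if the glyph code lies in one of the two ranges for which
    the hardware copies the 8th column into the 9th."""
    return 0xC0 <= code <= 0xDF or 0x1C0 <= code <= 0x1DF


def widen_row(row, code):
    """Expand one 8-pixel glyph row to 9 pixels.

    row  -- int in 0..255, bit 7 is the leftmost pixel (assumed layout)
    code -- position of the glyph in the font
    """
    # The 8th (rightmost) pixel is bit 0 of the row.
    ninth = (row & 1) if has_column_copy(code) else 0  # blank otherwise (assumption)
    return (row << 1) | ninth


if __name__ == "__main__":
    row = 0b00001111  # a row whose rightmost pixel is set
    print(format(widen_row(row, 0xC4), "09b"))  # code inside 0xC0..0xDF
    print(format(widen_row(row, 0x41), "09b"))  # code outside both ranges
```

This is why the position of a glyph matters in text mode: a symbol that needs to join its right-hand neighbour without a gap, such as a horizontal line-drawing symbol, only does so when it is placed at one of these codes.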


2.2 Character Sets

The encoding of the traditional console fonts follows the standard encodings of the different languages. For example, there are fonts for all variants of ISO 8859. This is redundant: ISO 8859-1, ISO 8859-9 and ISO 8859-15, for example, differ in only a few characters.

To determine a minimal set of character sets, a clustering algorithm was used. The source code of fontconfig contains lists of the characters that most languages require, one list per language. We started with one character set per language and used the clustering algorithm to merge the character sets into larger ones. The algorithm produced the following character sets:

Arabic (512 glyphs)

For Arabic, Kurdish in Iran, Pashto, Persian and Urdu.

Armenian

For Armenian.

CyrAsia

Suitable for some of the non-Slavic Cyrillic languages - Abkhaz, Avaric, Azerbaijani, Bashkir, Buryat, Chechen, Chuvash, Inupiaq (Eskimo), Kara-Kalpak, Kazakh, Kirgiz, Komi, Kumyk, Kurdish, Lezghian, Mari (Cheremis), Mongolian, Ossetic, Selkup (Ostyak-Samoyed), Tajik, Tatar, Turkmen, Tuvinian, Uzbek and Yakut.

CyrKoi

Covers entirely KOI8-R and KOI8-U. Suitable for Russian and Ukrainian.

CyrSlav

Covers entirely ISO-8859-5 and CP1251. Suitable for the Slavic Cyrillic languages - Belarusian, Bulgarian, Macedonian, Russian, Serbian and Ukrainian. For Serbian both the Cyrillic and the Latin alphabets are supported.

Ethiopian (512 glyphs)

For Amharic, Ethiopic (Geez), Tigre and Tigrinya.

Georgian

For Georgian.

Greek

For Greek.

Hebrew

For Hebrew and Yiddish.

Lao

For Lao.

Lat15

Covers entirely ISO-8859-1, ISO-8859-9 and ISO-8859-15. Suitable for the so-called Latin1 and Latin5 languages - Afar, Afrikaans, Albanian, Aragonese, Asturian, Aymara, Basque, Bislama, Breton, Catalan, Chamorro, Danish, Dutch, English, Estonian, Faroese, Fijian, Finnish, French, Frisian, Friulian, Galician, German, Hiri Motu, Icelandic, Ido, Indonesian, Interlingua, Interlingue, Italian, Low Saxon, Lule Sami, Luxembourgish, Malagasy, Manx Gaelic, Norwegian Bokmal, Norwegian Nynorsk, Occitan, Oromo or Galla, Portuguese, Rhaeto-Romance (Romansch), Scots Gaelic, Somali, South Sami, Spanish, Swahili, Swedish, Tswana, Turkish, Volapuk, Votic, Walloon, Xhosa, Yapese and Zulu.

Lat2

Covers entirely ISO-8859-2. The Euro sign and the Romanian letters with comma below are also supported. Suitable for the so-called Latin2 languages - Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Slovak, Slovenian and Sorbian (lower and upper).

Lat38

Covers entirely ISO-8859-3 and ISO-8859-14. Suitable for Chichewa, Esperanto, Irish, Maltese and Welsh.

Lat7

Covers entirely ISO-8859-13. Suitable for Lithuanian, Latvian, Maori and Marshallese.

Thai

For Thai.

Uni1 (512 glyphs)

Supports most of the Latin languages, the Slavic Cyrillic languages, Hebrew and, to a limited extent, Arabic.

Uni2 (512 glyphs)

Supports most of the Latin languages, the Slavic Cyrillic languages and Greek.

Uni3 (512 glyphs)

Supports most of the Latin and Cyrillic languages.

Vietnamese (512 glyphs)

For Vietnamese.

These character sets are described in files in the directory Fonts/fontsets. Each file lists the Unicode code points of the symbols in the character set, one per line. Comments starting with a sharp sign (#) are also allowed.
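As an illustration of this file format, here is a minimal Python sketch of a reader. It is not the parser used by bdf2psf. It assumes that code points are written in the U+XXXX notation used in the equivalence file example of Section 2.3, and that a comment may also follow a code point on the same line. The function name and the sample text are hypothetical.

```python
import re

# Assumed notation for a code point: "U+" followed by 4 to 6 hex digits.
_CODE_POINT = re.compile(r"U\+([0-9A-Fa-f]{4,6})")


def parse_character_set(text):
    """Return the code points listed in the text of a character set file,
    in the order in which they appear."""
    code_points = []
    for line in text.splitlines():
        # Drop everything from the sharp sign on, then surrounding blanks.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # comment-only or empty line
        match = _CODE_POINT.fullmatch(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        code_points.append(int(match.group(1), 16))
    return code_points


if __name__ == "__main__":
    sample = """\
# a made-up fragment, not copied from Fonts/fontsets
U+0041
U+00E9  # trailing comment (assumed to be allowed)
"""
    print([f"U+{cp:04X}" for cp in parse_character_set(sample)])
```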

There are two more special character sets, in the files required.set and useful.set. The first of them lists the symbols that every console font is required to support. There are two classes of required symbols: the ASCII symbols and the symbols from the so-called alternate character set (see section "Line Graphics" of terminfo(5)). Notice that, in order to limit itself to the cp437 character set, the Linux console driver approximates some of the symbols from the alternate character set. For example, it prints U+256A (BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE) instead of the not-equal sign. The file required.set lists the symbols actually used by the Linux console driver (i.e. U+256A instead of the not-equal sign).

In most cases the fonts have more space available than necessary. The spare codes are filled with symbols from the special character set useful.set. On the command line of bdf2psf, the name of useful.set is preceded by a colon, so no warnings are issued if there is no space in the font for some of these symbols.
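The following Python sketch shows how a command line following the synopsis in Section 2.1 could be assembled, with useful.set marked by a colon. It only builds the argument list and does not run bdf2psf. The helper name, the font and output file names, and the character set file name my-charset.set are hypothetical, and placing required.set first in the SYMB list is an assumption based on the rule that earlier character sets take precedence.

```python
def bdf2psf_command(bdfs, equivs, symbs, size, psf, sfm=None, fb=False, log=None):
    """Build the argument list for bdf2psf:

    bdf2psf [--fb] [--log LOG] BDF{+BDF} EQUIV{+EQUIV} SYMB{+[:]SYMB} SIZE PSF [SFM]
    """
    args = ["bdf2psf"]
    if fb:
        args.append("--fb")
    if log is not None:
        args += ["--log", log]
    # Lists of files are joined with "+"; earlier entries take precedence.
    args += ["+".join(bdfs), "+".join(equivs), "+".join(symbs), str(size), psf]
    if sfm is not None:
        args.append(sfm)
    return args


if __name__ == "__main__":
    cmd = bdf2psf_command(
        bdfs=["main.bdf", "fallback.bdf"],          # hypothetical source fonts
        equivs=["standard.equivalents"],
        symbs=["required.set", "my-charset.set",    # hypothetical character set file
               ":useful.set"],                      # colon: no warnings if it does not fit
        size=256,
        psf="out.psf",                              # hypothetical output name
    )
    print(" ".join(cmd))
```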


2.3 Equivalence files

The equivalence files define an equivalence relation between Unicode code points. The sharp sign is used for comments, and empty lines are ignored. Every other line should list two or more code points. Only one glyph will be allocated in the PSF font for all the code points on a line.

Example:

     U+2126 U+03A9
     # U+2126:   OHM SIGN
     # U+03A9:   GREEK CAPITAL LETTER OMEGA
     U+041D U+0048
     # U+041D:   CYRILLIC CAPITAL LETTER EN
     # U+0048:   LATIN CAPITAL LETTER H

This equivalence file says that U+2126 (the Ohm sign) and U+03A9 (Omega) look the same, so a single glyph is enough for both. Likewise, U+041D (Cyrillic En) and U+0048 (Latin H) look the same and share one glyph.
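The grouping that such a file describes can be sketched in Python as follows. This is an illustration only, not the code used by bdf2psf. It uses a small union-find structure so that code points related through different lines end up in the same class, which is what an equivalence relation implies. The function name is hypothetical.

```python
import re


def parse_equivalences(text):
    """Return a list of sets of code points. All code points in one set
    are meant to share a single glyph."""
    parent = {}

    def find(x):
        # Find the representative of x, shortening the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for line in text.splitlines():
        line = line.split("#", 1)[0]  # strip comments
        codes = [int(h, 16) for h in re.findall(r"U\+([0-9A-Fa-f]+)", line)]
        if not codes:
            continue  # empty or comment-only line
        if len(codes) < 2:
            raise ValueError("a line must list two or more code points")
        root = find(codes[0])
        for code in codes[1:]:
            parent[find(code)] = root  # merge the two classes

    classes = {}
    for code in parent:
        classes.setdefault(find(code), set()).add(code)
    return sorted(classes.values(), key=min)


if __name__ == "__main__":
    example = """
    U+2126 U+03A9
    # U+2126:   OHM SIGN
    U+041D U+0048
    """
    for group in parse_equivalences(example):
        print(sorted(f"U+{cp:04X}" for cp in group))
```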

Two equivalence files are used: standard.equivalents and arabic.equivalents. The first is used for all fonts. The second is used only for fonts with the character set Uni1. Its purpose is to reduce the number of glyphs needed for the Arabic letters, at the cost of font quality.



Anton Zinoviev anton@lml.bas.bg