3.4.4 Special characters


Text encoding

LilyPond uses the character repertoire defined by the Unicode Consortium and ISO/IEC 10646. This defines a unique name and code point for each character in the scripts used by virtually all modern languages and many others too. Unicode can be implemented using several different encodings. LilyPond uses the UTF-8 encoding (UTF stands for Unicode Transformation Format), which represents the ASCII characters, including the unaccented Latin letters, in one byte and represents all other characters using a variable-length format of two to four bytes.
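The variable-length behaviour of UTF-8 can be checked outside LilyPond. The following Python sketch is an illustration only: Python is not part of LilyPond, and the helper name utf8_length is made up for this example. It reports how many bytes UTF-8 needs for a few sample characters, one per encoded length.

```python
def utf8_length(char: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of one character."""
    return len(char.encode("utf-8"))


# Sample characters chosen for illustration:
# "a", e with acute accent, the euro sign, and the musical G clef symbol.
samples = ["a", "\u00e9", "\u20ac", "\U0001D11E"]

for char in samples:
    print(f"U+{ord(char):04X} needs {utf8_length(char)} byte(s)")
```

Only the first sample, an ASCII character, fits in one byte; the accented Latin letter already needs two.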

The actual appearance of the characters is determined by the glyphs defined in the particular fonts available – a font defines the mapping of a subset of the Unicode code points to glyphs. LilyPond uses the Pango library to lay out and render multilingual text.

LilyPond does not perform any input encoding conversions. This means that any text containing non-ASCII characters, be it a title, lyric text, or a musical instruction, must be encoded in UTF-8. The easiest way to enter such text is to use a Unicode-aware editor and save the file with UTF-8 encoding. Most popular modern editors support UTF-8, for example vim, Emacs, jEdit, and Gedit. All MS Windows systems later than NT use Unicode as their native character encoding, so even Notepad can edit and save a file in UTF-8 format. A more functional alternative for Windows is BabelPad.

If a LilyPond input file containing a non-ASCII character is not saved in UTF-8 format, the error message

FT_Get_Glyph_Name () error: invalid argument

will be generated.
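One way to check an input file before running LilyPond is to try decoding its bytes as UTF-8. The sketch below is in Python and is not part of LilyPond; the function name is_valid_utf8 is an assumption made for this illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same Portuguese word, with c-cedilla and a-tilde, in two encodings.
text = "can\u00e7\u00e3o"
print(is_valid_utf8(text.encode("utf-8")))    # True
print(is_valid_utf8(text.encode("latin-1")))  # False
```

To test a real file, pass its raw contents, for example the result of reading the file in binary mode, to the function.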

Here is an example showing Cyrillic, Hebrew and Portuguese text:

[image of music]


Unicode

To enter a single character for which the Unicode code point is known but which is not available in the editor being used, use either \char ##xhhhh or \char #dddd within a \markup block, where hhhh is the hexadecimal code for the character required and dddd is the corresponding decimal value. Leading zeroes may be omitted, but it is usual to specify all four characters in the hexadecimal representation. (Note that the UTF-8 encoding of the code point should not be used after \char, as UTF-8 encodings contain extra bits indicating the number of octets.) Unicode code charts and a character name index giving the code point in hexadecimal for any character can be found on the Unicode Consortium website, https://www.unicode.org/.

For example, \char ##x03BE and \char #958 would both enter the Unicode U+03BE character, which has the Unicode name “Greek Small Letter Xi”.
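A short Python sketch can verify how the two numeric forms relate, and why the UTF-8 bytes are not what \char expects. It is an illustration only: the helper char_markup is hypothetical and simply formats the text that would be typed into a \markup block.

```python
def char_markup(char: str, decimal: bool = False) -> str:
    """Format the \\char command for one character (hypothetical helper)."""
    code_point = ord(char)
    if decimal:
        return f"\\char #{code_point}"
    # Pad to four hexadecimal digits, as is usual in the manual's examples.
    return f"\\char ##x{code_point:04X}"


xi = "\u03be"  # Greek Small Letter Xi
print(char_markup(xi))                # \char ##x03BE
print(char_markup(xi, decimal=True))  # \char #958
# The UTF-8 encoding is a different number and must not follow \char.
print(xi.encode("utf-8").hex())       # cebe
```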

Any Unicode code point may be entered in this way. If all special characters are entered in this format, the input file contains only ASCII characters and it is not necessary to save it in UTF-8 format. Of course, a font containing all such encoded characters must be installed and available to LilyPond.

The following example shows Unicode hexadecimal values being entered in four places – in a text mark, as articulation text, in lyrics and as stand-alone text below the score:

\score {
  \relative {
    c''1
    \textMark \markup { \char ##x03A8 }
    c1_\markup { \tiny { \char ##x03B1 " to " \char ##x03C9 } }
  }
  \addlyrics { O \markup { \concat { Ph \char ##x0153 be! } } }
}
\markup { "Copyright 2008--2022" \char ##x00A9 }

[image of music]

To enter the copyright sign in the copyright notice use:

\header {
  copyright = \markup { \char ##x00A9 "2008" }
}

ASCII aliases

A list of ASCII aliases for special characters can be included:

\paper {
  #(include-special-characters)
}

\markup "&flqq; – &OE;uvre incomplète… &frqq;"

\score {
  \new Staff { \repeat unfold 9 a'4 }
  \addlyrics {
    This is al -- so wor -- kin'~in ly -- rics: –_&OE;…
  }
}

\markup \column {
  "The replacement can be disabled:"
  "– &OE; …"
  \override #'(replacement-alist . ()) "– &OE; …"
}

[image of music]

You can also make your own aliases, either globally:

\paper {
  #(add-text-replacements!
    '(("100" . "hundred")
      ("dpi" . "dots per inch")))
}
\markup "A 100 dpi."

[image of music]

or locally:

\markup \replace #'(("100" . "hundred")
                    ("dpi" . "dots per inch")) "A 100 dpi."

[image of music]

The replacement is not necessarily a string; it can be an arbitrary markup. On the syntax level, this requires using Scheme quasi-quoting syntax, with a backtick ‘`’ instead of a quote ‘'’ to write the alist.

\markup \replace
  #`(("2nd" . ,#{ \markup \concat { 2 \super nd } #})) "2nd time"

[image of music]

Aliases themselves are not further processed for replacements.

See also

Notation Reference: List of special characters.

Installed Files: ‘ly/text-replacements.ly’.


LilyPond — Notation Reference v2.23.82 (development-branch).