3.4.4 Special characters
Text encoding
Unicode
ASCII aliases
Text encoding
LilyPond uses the character repertoire defined by the Unicode Consortium and ISO/IEC 10646. This repertoire defines a unique name and code point for the characters used in virtually all modern languages and many others too. Unicode can be implemented using several different encodings. LilyPond uses the UTF-8 encoding (UTF stands for Unicode Transformation Format), which represents all ASCII characters, including the unaccented Latin letters, in one byte and represents other characters using a variable-length format of up to four bytes.
The actual appearance of the characters is determined by the glyphs defined in the particular fonts available: a font defines the mapping of a subset of the Unicode code points to glyphs. LilyPond uses the Pango library to lay out and render multilingual texts.
LilyPond does not perform any input encoding conversions. This means that any text, be it a title, lyric text, or a musical instruction, that contains non-ASCII characters must be encoded in UTF-8. The easiest way to enter such text is to use a Unicode-aware editor and save the file with UTF-8 encoding. Most popular modern editors support UTF-8, for example vim, Emacs, jEdit, and Gedit. All MS Windows systems later than NT use Unicode as their native character encoding, so even Notepad can edit and save a file in UTF-8 format. A more functional alternative for Windows is BabelPad.
If a LilyPond input file containing a non-ASCII character is not saved in UTF-8 format the error message
FT_Get_Glyph_Name () error: invalid argument
will be generated.
Here is an example showing Cyrillic, Hebrew and Portuguese text:
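The following is a minimal sketch of such an input file, not the manual's original example. The variable names, the sample words, and the \version number are assumptions chosen for illustration. The file must be saved as UTF-8, and a font containing Cyrillic and Hebrew glyphs must be available to LilyPond.

```lilypond
\version "2.24.0"  % assumed version; adjust to the installed release

% Cyrillic (Bulgarian sample words, chosen for illustration)
bulgarian = \lyricmode { До -- бро ут -- ро }

% Hebrew (one sample word repeated, chosen for illustration)
hebrew = \lyricmode { שלום שלום שלום שלום }

% Portuguese
portuguese = \lyricmode { can -- ção le -- gal }

% Four notes, so each lyric line above supplies four syllables.
\relative { c'2 d e f }
\addlyrics { \bulgarian }
\addlyrics { \hebrew }
\addlyrics { \portuguese }
```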
Unicode
To enter a single character for which the Unicode code point is known but which is not available in the editor being used, use either \char ##xhhhh or \char #dddd within a \markup block, where hhhh is the hexadecimal code for the character required and dddd is the corresponding decimal value. Leading zeroes may be omitted, but it is usual to specify all four characters in the hexadecimal representation. (Note that the UTF-8 encoding of the code point should not be used after \char, as UTF-8 encodings contain extra bits indicating the number of octets.) Unicode code charts and a character name index giving the code point in hexadecimal for any character can be found on the Unicode Consortium website, https://www.unicode.org/.
For example, \char ##x03BE and \char #958 would both enter the Unicode U+03BE character, which has the Unicode name “Greek Small Letter Xi”.
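As a minimal sketch, the two forms can be placed side by side in one \markup block; both should print the same glyph, provided the font in use contains it. The \version number is an assumption.

```lilypond
\version "2.24.0"  % assumed version; adjust to the installed release

\markup {
  % Hexadecimal form: code point U+03BE
  \char ##x03BE
  % Decimal form: 958 is the same code point as hexadecimal 03BE
  \char #958
}

% Do not write \char ##xCEBE here. CE BE are the UTF-8 bytes of
% U+03BE, and U+CEBE is an entirely different character.
```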
Any Unicode code point may be entered in this way, and if all special characters are entered in this format, it is not necessary to save the input file in UTF-8 format. Of course, a font containing all such encoded characters must be installed and available to LilyPond.
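A small sketch of an input file that contains only ASCII bytes because every special character is written with \char. The title text, the notes, and the \version number are assumptions made up for this illustration.

```lilypond
\version "2.24.0"  % assumed version; adjust to the installed release

\header {
  % U+00E9 is "Latin Small Letter E with Acute"
  title = \markup { \concat { "Pr" \char ##x00E9 "lude" } }
}

\relative {
  c'1
  % U+266D is "Music Flat Sign"; the font must provide this glyph
  d1^\markup { \concat { "B" \char ##x266D } }
}
```

Because no byte in such a file lies outside the ASCII range, the encoding chosen by the editor when saving does not matter, as long as it is ASCII-compatible.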
The following example shows Unicode hexadecimal values being entered in four places – in a text mark, as articulation text, in lyrics and as stand-alone text below the score:
\score {
  \relative {
    c''1 \textMark \markup { \char ##x03A8 }
    c1_\markup { \tiny { \char ##x03B1 " to " \char ##x03C9 } }
  }
  \addlyrics { O \markup { \concat { Ph \char ##x0153 be! } } }
}
\markup { "Copyright 2008--2022" \char ##x00A9 }
To enter the copyright sign in the copyright notice use:
\header { copyright = \markup { \char ##x00A9 "2008" } }
ASCII aliases
A list of ASCII aliases for special characters can be included:
\paper {
  #(include-special-characters)
}

\markup "&flqq; – &OE;uvre incomplète… &frqq;"

\score {
  \new Staff { \repeat unfold 9 a'4 }
  \addlyrics { This is al -- so wor -- kin'~in ly -- rics: –_&OE;… }
}

\markup \column {
  "The replacement can be disabled:"
  "– &OE; …"
  \override #'(replacement-alist . ()) "– &OE; …"
}
You can also make your own aliases, either globally:
\paper {
  #(add-text-replacements!
    '(("100" . "hundred")
      ("dpi" . "dots per inch")))
}
\markup "A 100 dpi."
or locally:
\markup \replace #'(("100" . "hundred") ("dpi" . "dots per inch")) "A 100 dpi."
A replacement need not be a string; it can be arbitrary markup. Syntactically, this requires Scheme quasi-quoting: write the alist with a backtick ‘`’ instead of a quote ‘'’, and unquote each markup value with a comma ‘,’.
\markup \replace #`(("2nd" . ,#{ \markup \concat { 2 \super nd } #})) "2nd time"
Aliases themselves are not further processed for replacements.
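A minimal sketch to try this rule out, using two hypothetical aliases made up for the purpose: "a" maps to "b", and "b" maps to "c". If replacement text were processed again, the first letter would end up as "c"; under the rule stated above, the "b" produced from "a" should be left alone. The \version number is an assumption.

```lilypond
\version "2.24.0"  % assumed version; adjust to the installed release

% Hypothetical aliases chosen so that one replacement text ("b")
% is also the key of another alias.
\markup \replace #'(("a" . "b") ("b" . "c")) "a b"
```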
See also
Notation Reference: List of special characters.
Installed Files: ‘ly/text-replacements.ly’.