Notation benchmarking
A flexible architecture is necessary for good formatting. Unfortunately, it is not sufficient. Only a careful emulation of printed matter will give a good result. In the introduction we suggested comparing program output with existing hand-engraved scores. It is exactly this technique that we use to perfect LilyPond output. In a way, this is a benchmarking technique: the performance of the program, in terms of quality, is measured in relation to a known quantity.
Here you see parts of a benchmark piece. At the top is the reference edition (Bärenreiter BA 320); at the bottom is the output from LilyPond 1.4:
Bärenreiter
The LilyPond output is certainly readable, and for many people it would be acceptable. However, close comparison with a hand-engraved score showed a lot of errors in the formatting details:
- Lots of symbols were unbalanced. In particular the trill sign was too large.
- Stems and beams were all wrong: the stems were too long, and the beams should have been slanted to cover staff lines exactly. The beams were also too light.
- The spacing was irregular: some measures were too tight, others too wide.
By addressing the relevant algorithms, settings, and font designs, we were able to improve the output. The output for LilyPond 1.8 is shown below. Although it is not a clone of the reference edition, this output is very close to publication quality.
Another example of benchmarking is our project for the 2.1 series, a Schubert song.