By Pavel Panchekha

Share under CC-BY-SA.

Was Lower Case Designed for Reading Speed?

In an interview with Devon Zeugel, Stu Card mentions that lower case had to be invented, and that it had the benefit of increasing reading speed:

The Carolingian Renaissance invented small letters. Those small letters put together with the capitals allowed words to have shapes. They could be then read in faster speed.

Now, I think the usability theory here is disputed,1 [1 See Kevin Larson via Susan Weinschenk.] and the history is not quite right2 [2 Stu seems to be referring to Carolingian minuscule, which was indeed a popular, widespread writing style with lower case letters, but other styles at the time like Insular and Uncial had lower case as well, though everything at the time is complicated and I've only read a Wikipedia article so don't quote me.] but the idea that writing style influences reading speed was fascinating to me.3 [3 Other people I mention this to are similarly intrigued. Seems everyone assumes style affects writing speed, not reading speed.]

The story Stu tells is that lower-case letters have shapes: f has an ascender, g has a descender, while x has neither.4 [4 Actually, Carolingian minuscule, from looking over some photos of documents, is pretty different from standard modern lower-case. For example, g both ascends and descends, and s, which is written as a "long s" ſ, descends.] Those three seem like the only available categories, so you can take a word like "letters" and transform it into its "shape": fxffxxx. Since you can recognize a shape quickly, and you can often guess the word from the shape, you can read more quickly.5 [5 In upper case, all letters have the f type, so you can only see the length from the shape.]
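To make the word-to-shape transformation concrete, here is a minimal sketch. The letter sets `ASCENDERS` and `DESCENDERS` are my own guess at modern lower case, not the classes used for the Carolingian experiment, and `word_shape` is a hypothetical helper name.

```python
# Assumed letter classes for modern lower case (illustrative only).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

def word_shape(word: str) -> str:
    """Map each letter to f (ascender), g (descender), or x (neither)."""
    shape = []
    for ch in word.lower():
        if ch in ASCENDERS:
            shape.append("f")
        elif ch in DESCENDERS:
            shape.append("g")
        else:
            shape.append("x")
    return "".join(shape)

print(word_shape("letters"))  # fxffxxx
```

Two words with the same shape, such as "letters" and "bullies", are indistinguishable by shape alone, which is why shape can only ever carry part of the information in a word.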

Grading Letter Shapes

We can quantify this. Suppose you have a set of words \(w\) and their frequencies \(p_w\). Then to determine what word you're looking at, you need \(s_\inf = -\sum_w p_w \log_2 p_w\) bits of information. If each word is placed in a shape-class \(f_w\), and each class \(f\) has frequency \(q_f = \sum_w p_w [f_w = f]\), then just knowing the shape class gives you \(-\sum_f q_f \log_2 q_f\) bits of information.
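Both quantities are ordinary Shannon entropies, one over words and one over shape classes. A small sketch of the computation follows; the toy word counts are made up for illustration, and `entropy_bits` and `class_entropy_bits` are hypothetical helper names. Grouping by `len` stands in for the upper-case case, where shape reveals only length.

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Shannon entropy, in bits, of a table of counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def class_entropy_bits(word_counts, classify):
    """Entropy of the classes obtained by applying classify to each word."""
    class_counts = Counter()
    for word, count in word_counts.items():
        class_counts[classify(word)] += count
    return entropy_bits(class_counts)

# Made-up toy corpus: word -> number of occurrences.
words = Counter({"et": 4, "in": 2, "est": 1, "non": 1})

print(entropy_bits(words))             # 1.75 bits to identify the word
print(class_entropy_bits(words, len))  # bits given by word length alone
```

Because every class is a union of words, the class entropy can never exceed the word entropy.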

Let's accept the potted history with Carolingian minuscule so that we're looking at Vulgar Latin words from Jerome's Bible. I couldn't find the full text of that online, but I did find the Clementine Bible,6 [6 From the Clementine Vulgate Project] which at least is in Latin, though a much later Latin.7 [7 This actually has quite an effect: in Carolingian times, j and v were uncommon, plus a variety of ligatures weren't used. Let me know if you have a full text of Jerome's Bible.] From that we can compute a dictionary and a frequency table; this gives a total of \(s_\inf = 11.08\) bits of information in each word, with \(s_0 = 3.30\) bits given just by lengths, that is, by the shape classes given by upper case text.

Someone had to design the lower-case letters,8 [8 That person was Alcuin of York.] and they may have been trying to optimize reading speed! We can use this minimum \(s_0\) and maximum \(s_\inf\) as a ruler to grade how well they did. A score of 0 means assigning every letter the same shape, and achieving \(s_0\) bits of information; assigning every letter a different shape, so that every word has its own shape, gives a score of 1 and \(s_\inf\) bits of information.
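The simplest way to read this ruler is as linear interpolation between \(s_0\) and \(s_\inf\). The exact formula isn't spelled out here, so treat the function below as an assumption; `shape_score` is a hypothetical name and the 7.19-bit input is a made-up value chosen to land at the midpoint.

```python
def shape_score(s_shape: float, s_min: float, s_max: float) -> float:
    """Linearly rescale s_shape so that s_min maps to 0 and s_max maps to 1."""
    if s_max <= s_min:
        raise ValueError("s_max must exceed s_min")
    return (s_shape - s_min) / (s_max - s_min)

S_0, S_INF = 3.30, 11.08  # bits, from the Clementine text

print(shape_score(S_0, S_0, S_INF))             # 0.0, lengths only
print(shape_score(S_INF, S_0, S_INF))           # 1.0, every word distinct
print(round(shape_score(7.19, S_0, S_INF), 3))  # 0.5, hypothetical midpoint
```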

Well, I took my best guess at the letter classes in Carolingian minuscule, and found that on this scale, Carolingian minuscule gets 64.4%. Not bad! Two-thirds of available word information is communicated by word shape! Does this mean Carolingian minuscule was designed toward this goal?

One way to test this is to see how well you would do without any design at all, that is, by assigning shapes at random. Let's say we assign one of three letter shapes randomly to each letter. That gives us a distribution of scores; where does Carolingian minuscule fall? Roughly at the first percentile.9 [9 Precisely, the 1.397th percentile, out of a sample of a few hundred random choices.] Not the 99th percentile! No, Carolingian minuscule is terrible, compared to random assignment. My random samples in general span scores from roughly 65% to roughly 85%.
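The random baseline can be sketched roughly as follows. This is not the script behind the numbers above: the toy word counts, the 300 trials, the fixed seed, and the `HAND` letter classes (a guess at modern lower case) are all assumptions, and the scoring uses linear rescaling between the length-only and full-word entropies.

```python
import math
import random
import string
from collections import Counter

def entropy_bits(counts):
    """Shannon entropy, in bits, of a table of counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def score(word_counts, letter_class):
    """Rescale shape entropy so length-only gives 0 and full words give 1."""
    shapes, lengths = Counter(), Counter()
    for word, n in word_counts.items():
        shapes["".join(letter_class[ch] for ch in word)] += n
        lengths[len(word)] += n
    s_0, s_inf = entropy_bits(lengths), entropy_bits(word_counts)
    return (entropy_bits(shapes) - s_0) / (s_inf - s_0)

def random_scores(word_counts, n_classes=3, trials=300, seed=0):
    """Score many assignments that give each letter a uniformly random class."""
    rng = random.Random(seed)
    classes = "fxg"[:n_classes]  # n_classes=2 leaves just "tall" and "short"
    return [
        score(word_counts, {ch: rng.choice(classes) for ch in string.ascii_lowercase})
        for _ in range(trials)
    ]

def percentile_of(value, samples):
    """Percentage of samples strictly below value."""
    return 100.0 * sum(s < value for s in samples) / len(samples)

# Made-up toy corpus and an assumed hand-designed assignment.
TOY_WORDS = Counter({"et": 4, "in": 2, "qui": 2, "est": 1, "non": 1, "ad": 1})
HAND = {ch: "x" for ch in string.ascii_lowercase}
HAND.update({ch: "f" for ch in "bdfhklt"})
HAND.update({ch: "g" for ch in "gjpqy"})

samples = random_scores(TOY_WORDS)
print(percentile_of(score(TOY_WORDS, HAND), samples))
```

On a real word list, `TOY_WORDS` would be replaced by the corpus frequency table and `HAND` by the script being graded; passing `n_classes=2` gives the tall-versus-short variant.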

Conclusion

So it sounds like reading speed, especially with regard to shape, was not a particular concern of Alcuin of York when designing a script, and in fact he somehow managed to optimize against it!

On the other hand, it sounds like just about any lower-case design with three classes works well, so this doesn't necessarily debunk the notion that lower case helps reading speed. I also did a trial with two letter classes ("tall" and "short"); Carolingian minuscule gets a score of 51.5%, which is at the 10.4th percentile of random samples spanning 47–62%. That's still not impressive, though maybe a little better and perhaps more representative of Alcuin's efforts.
