By Pavel Panchekha

Share under CC-BY-SA.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

How Browsers Lay Out Text into Paragraphs

Series

This is post 3 of the Let's Build a Web Browser series.

In the last post, we created a graphical window for our web browser to display web pages in, and programmed it to lay out a line of formatted text. Unfortunately, we didn't teach our browser to wrap text into lines, so you could only read the first line of any web page. Let's fix that.

Wrapping lines

After finishing the prior post and its assignments, you have a show function that looks something like this:

font = tkFont.Font(family="Times", size=TEXT_SIZE)
x, y = PADDING, PADDING

for tok in tokens:
    if isinstance(tok, Text):
        canvas.create_text(x, y, text=tok.text, font=font, anchor='nw')
        x += font.measure(tok.text)

You can see why this isn't wrapping lines of text: we never change y! So when should we change y? Well, any time drawing the text on the current line would make it go past the right edge of the screen. So incrementing y is something we have to do before drawing the text:

w = font.measure(tok.text)
if x + w > 800 - PADDING:
    x = PADDING
    y += 28

There are a lot of moving parts to this code. First, we measure the width of the text and store it in w. We'd normally draw the text at x, so its right end would be at x + w, and we check whether that's too big. The overall frame is 800 pixels wide, and we should have symmetric padding on the left and right, so “too big” means anything past 800 - PADDING pixels.

Now, if we need to move to the next line, we need to increment y and reset x.[1] I chose to increment y by 28. Try changing the value and seeing what happens!

In general, how much to increment y is going to depend on the size of the text. Here, we're using 16 pixel Times, so a 28 pixel gap might seem pretty big. But let's dig deeper. Remember font.metrics()? It tells us that this “16 pixel” font is actually 22 pixels tall.[2] So we need to increment y by at least 22 pixels for it to be impossible for text to overlap. But if you try changing 28 to 22, you'll see that the text is hard to read because the lines are too close together.[3] Instead, it is common to add “line spacing” or “leading”[4] between lines. 28 pixels here adds 6 pixels of leading, or 27% of the 22 pixel line, which is pretty normal.

Of course, we shouldn't have numerical constants like that in the codebase. Instead, we can compute the line height from the font's metrics, adding 30% leading:

y += round(font.metrics('linespace') * 1.3)
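To see the wrapping rule in isolation, here is a minimal sketch that runs without opening a window. FakeFont is a hypothetical stand-in for tkFont.Font (it assumes every character is 8 pixels wide and a line is 22 pixels tall), place_tokens is an illustrative name, and the PADDING value of 10 is an assumption; only the wrapping check and the line-height formula come from the code above.

```python
class FakeFont:
    """Hypothetical stand-in for tkFont.Font: 8px per character, 22px lines."""
    def measure(self, text):
        return 8 * len(text)

    def metrics(self, name):
        assert name == "linespace"
        return 22

WIDTH = 800    # frame width used in this post
PADDING = 10   # assumed value; use whatever your browser defines

def place_tokens(texts, font):
    """Return (x, y, text) for each run of text, wrapping at the right edge."""
    line_height = round(font.metrics("linespace") * 1.3)
    x, y = PADDING, PADDING
    placed = []
    for text in texts:
        w = font.measure(text)
        if x + w > WIDTH - PADDING:   # would cross the right padding
            x = PADDING               # reset x ...
            y += line_height          # ... and move down one line
        placed.append((x, y, text))
        x += w
    return placed

for item in place_tokens(["a" * 50, "b" * 50, "c" * 10], FakeFont()):
    print(item[0], item[1])
```

With these assumed metrics the second run does not fit after the first (410 + 400 is more than 790), so it starts a new line 29 pixels lower, and the third run follows it on that line.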

Word boundaries

Now, this works a little better, but right now, the browser can only wrap text at token boundaries. A token is a contiguous run of text, not including any tags, so it will often contain multiple words. So we need to first split a token into words, then draw each word:

for word in tok.text.split():
    w = font.measure(word)
    if x + w > 800 - PADDING:
        x = PADDING
        y += round(font.metrics('linespace') * 1.3)
    canvas.create_text(x, y, text=word, font=font, anchor='nw')
    x += w

This sort of works—but now we're missing spaces between words. We could change the last line to:

x += w + font.measure(" ")

This works better, but it assumes there is a space after every word, so for HTML code like "I'm so <i>excited</i>!", it'll put a space after excited, making the exclamation mark look weird. But we can't always drop the space after the last word, because then we'd lose the space after so. So we need to check whether the run of text starts or ends with whitespace:[5]

if tok.text[0].isspace():
    x += font.measure(" ")

words = tok.text.split()
for i, word in enumerate(words):
    w = font.measure(word)
    if x + w > 800 - PADDING:
        x = PADDING
        y += round(font.metrics('linespace') * 1.3)
    canvas.create_text(x, y, text=word, font=font, anchor='nw')
    x += w + (0 if i == len(words) - 1 else font.measure(" "))

if tok.text[-1].isspace():
    x += font.measure(" ")
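The reason for the extra checks is that split() throws away all information about whitespace at the edges of a string. A small sketch makes this visible; edge_spaces is a hypothetical helper name used only for illustration.

```python
def edge_spaces(text):
    """Return (starts_with_space, words, ends_with_space) for a text run."""
    return text[0].isspace(), text.split(), text[-1].isspace()

# The three text tokens in "I'm so <i>excited</i>!"
print(edge_spaces("I'm so "))   # trailing space must be kept
print(edge_spaces("excited"))   # no space on either side
print(edge_spaces("!"))         # so "!" sits right after "excited"
```

Note that split() with no arguments also collapses runs of spaces and newlines between words, which is the behavior HTML asks for inside a single run of text.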

This is very close, but handles odd input like "I'm <i> so </i> excited!" incorrectly: it draws two spaces in a row, while HTML dictates that two pieces of whitespace one after another should merge. We need extra state to track if we inserted a final space, and check it when we insert an initial space:

# At the top of the `show` function
terminal_space = True

# In the text token case
if tok.text[0].isspace() and not terminal_space:
    x += font.measure(" ")

words = tok.text.split()
for i, word in enumerate(words):
    w = font.measure(word)
    if x + w > 800 - PADDING:
        x = PADDING
        y += round(font.metrics('linespace') * 1.3)
    canvas.create_text(x, y, text=word, font=font, anchor='nw')
    x += w + (0 if i == len(words) - 1 else font.measure(" "))

terminal_space = tok.text[-1].isspace()
if terminal_space and words:
    x += font.measure(" ")

The state variable terminal_space is set at the end of every text token, and read at the beginning of the next text token. Note that the code no longer prints the terminal space if the text had no words in it. That's because in that case, the terminal space is also the initial space and so has already been printed. A final tweak: terminal_space should start True, because if the first thing in a line is a space, you shouldn't actually indent that line.
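To check the spacing rules without drawing anything, the same logic can be run against a fixed-width stand-in for the font. This is a sketch under stated assumptions: measure pretends every character is 8 pixels wide, layout_runs is a hypothetical name, tags are simply skipped, and line wrapping is left out so that only the x positions matter.

```python
CHAR_W = 8  # assumed fixed character width, standing in for font.measure

def measure(text):
    return CHAR_W * len(text)

def layout_runs(runs, start_x=0):
    """Place the words of each text run on one line, merging adjacent spaces."""
    x = start_x
    terminal_space = True          # a space at the start of a line is dropped
    placed = []
    for text in runs:
        if text[0].isspace() and not terminal_space:
            x += measure(" ")
        words = text.split()
        for i, word in enumerate(words):
            placed.append((x, word))
            x += measure(word)
            if i != len(words) - 1:
                x += measure(" ")
        terminal_space = text[-1].isspace()
        if terminal_space and words:
            x += measure(" ")
    return placed

# Text tokens of "I'm <i> so </i> excited!": every gap is one space wide.
print(layout_runs(["I'm ", " so ", " excited!"]))
# Text tokens of "I'm so <i>excited</i>!": no gap before the "!".
print(layout_runs(["I'm so ", "excited", "!"]))
```

In the first call each word starts exactly 8 pixels after the previous one ends, even though the source has two pieces of whitespace between them.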

I bet you've never thought this much about the spaces between words.

Separating paragraphs

The browser now lays out all the text on the page into a single giant block of text. Huge improvement: we can now actually read beyond the first line of text. But it's hard to follow what's going on without paragraph breaks.

In HTML, text is grouped into paragraphs by wrapping each paragraph with the <p> tag. Just like our browser looks for the <b> and <i> tags to change which font it uses, it needs to look for <p> tags to implement paragraphs.

if isinstance(tok, Tag):
    if tok.tag == "/p":
        # End the current line and leave an extra half line before the next paragraph
        terminal_space = True
        x = PADDING
        y += round(font.metrics('linespace') * 1.3 * 1.5)
    # elif other tags ...

The end of a paragraph is the end of a line of text, so we need to reset x and increment y. It is also a good time to set up for the next paragraph. Here I increment y by a further half line, to make a little gap between paragraphs, and also reset terminal_space.[6]
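As a worked example of the arithmetic, here is a small sketch that takes “a further half line” to mean multiplying the line height by 1.5. The helper name end_paragraph and the PADDING value of 10 are assumptions for illustration; the 22 pixel linespace is the figure quoted earlier for 16 pixel Times.

```python
PADDING = 10  # assumed value

def end_paragraph(y, linespace):
    """State after a </p> tag: left margin, one and a half lines further down."""
    x = PADDING
    y += round(linespace * 1.3 * 1.5)
    terminal_space = True   # the next paragraph starts without a leading space
    return x, y, terminal_space

line_height = round(22 * 1.3)               # 29
paragraph_advance = round(22 * 1.3 * 1.5)   # 43
print(paragraph_advance - line_height)      # extra gap between paragraphs: 14
print(end_paragraph(10, 22))
```

So with these metrics a paragraph break moves y down by 43 pixels, 14 more than an ordinary line break.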

Given how complicated wrapping lines was, separating paragraphs is very easy to implement!

Scrolling text

We started this post with the problem that you could only read one line of text in our browser. Now we can read several paragraphs! But if there's enough text, these paragraphs don't fit on the screen, and there's still content you can't read. Every browser solves this problem by allowing the user to scroll the page and look at different parts of it.

Scrolling introduces a layer of indirection between page coordinates (this text is 132 pixels from the top of the page) and screen coordinates (this text is 72 pixels from the top of the screen). Generally speaking, a browser lays out the page in terms of page coordinates (determining where everything on the page goes) and then renders the page in terms of screen coordinates.
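The numbers in that example imply the page has been scrolled down by 60 pixels, since 132 - 72 = 60. A minimal sketch of the conversion, using hypothetical helper names and calling the scroll offset scrolly:

```python
def to_screen_y(page_y, scrolly):
    """Convert a page coordinate to a screen coordinate."""
    return page_y - scrolly

def to_page_y(screen_y, scrolly):
    """Convert a screen coordinate back to a page coordinate."""
    return screen_y + scrolly

print(to_screen_y(132, 60))  # 72
print(to_page_y(72, 60))     # 132
```

Only the scroll offset connects the two coordinate systems, so positions computed once in page coordinates stay valid however far the user scrolls.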

Let's introduce the same split in our browser. Right now we have a show function that takes in a list of tokens and then creates a graphical window, computes the position of each bit of text, and then draws that text to the screen. Let's split it into a layout function that just computes the position of each bit of text, and a render function that creates the window and draws each bit of text on the screen. Only render needs to think about screen coordinates, while layout can operate on page coordinates alone.

What should the interface between these two functions be? Well, render only needs to know which text to place where, so what about having layout just return a list of tuples: the x and y coordinates of the text, the text to draw, and the font to use? Then render could loop through that list and draw each:

display_list = layout(tokens)
for x, y, word, font in display_list:
  canvas.create_text(x, y, text=word, font=font, anchor='nw')

I am calling this list that layout returns a display list, since it is a list of things to display; the term is standard. Creating that list is easy. Right now, we loop over each token, loop over every word in that token, and call canvas.create_text. Now instead of calling canvas.create_text, we append the word, its position, and its font to a list:

display_list = []
for tok in tokens:
  ...
  for word in words:
    ...
    display_list.append((x, y, word, font))
return display_list

Now if we want to scroll the whole page by, say, 100 pixels, we can change the create_text parameter from y to y - 100. More generally, let's add a scrolly state variable and subtract that from the y position when we render text:

scrolly = 0
for x, y, word, font in display_list:
  canvas.create_text(x, y - scrolly, text=word, font=font, anchor='nw')

If you change the value of scrolly the page will scroll up and down. So how do we change the value of scrolly?

Reacting to keyboard input

Most browsers scroll the page when you press the up and down keys, rotate the scroll wheel, or drag the scroll bar. Let's keep things simple and implement the first of those.

Tk allows you to bind a function to a key, so that the function is called whenever that key is pressed. For example, to call the scrolldown function when the "Down" key is pressed, we write:

window.bind("<Down>", scrolldown)

Note that I wrote scrolldown, not scrolldown(): I'm not calling the function, I'm just writing its name. Tk will call the function when the user presses the "Down" key. To implement scrolldown, we need to increment scrolly and then re-draw the canvas:

SCROLL_STEP = 100
scrolly = 0

def render():
    for x, y, word, font in display_list:
        canvas.create_text(x, y - scrolly, text=word, font=font, anchor='nw')

def scrolldown(e):
    nonlocal scrolly
    scrolly += SCROLL_STEP
    render()

render()

There are some pretty big changes here. First, I've moved the loop that draws all the text into a function, render. That function is called immediately when the page is first rendered (last line above). But it is also called when you scroll down, so that the page can be redrawn.[7]
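To see what nonlocal changes, here is a small sketch with no Tk involved. It assumes, as in the code above, that the handlers are defined inside an enclosing function; the handlers return scrolly only so the effect can be printed, and broken_scrolldown is a deliberately wrong variant added for comparison.

```python
def show():
    scrolly = 0

    def scrolldown(e):
        nonlocal scrolly      # rebind the variable that belongs to show()
        scrolly += 100
        return scrolly

    def broken_scrolldown(e):
        scrolly += 100        # no nonlocal: Python treats scrolly as a new local
        return scrolly

    return scrolldown, broken_scrolldown

scrolldown, broken_scrolldown = show()
print(scrolldown(None))   # 100
print(scrolldown(None))   # 200
try:
    broken_scrolldown(None)
except UnboundLocalError as err:
    print("error:", err)
```

Without the nonlocal line, the assignment makes scrolly local to the handler, so reading it before it has a value raises UnboundLocalError.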

If you try this out, you'll find that scrolling causes all the text to be drawn twice. That's because we didn't erase the old text when we started drawing the new text. To do that, we call canvas.delete at the start of render:

canvas.delete('all')

You can write a scrollup function that's just like scrolldown. However, there's a small twist: we don't want scrolly going negative, since you shouldn't be able to scroll "above" the page:

def scrollup(e):
    nonlocal scrolly
    scrolly -= SCROLL_STEP
    if scrolly < 0: scrolly = 0
    render()
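Putting the scrolling pieces together, here is one way the handlers might fit around a display list. This is a sketch rather than the final browser code: RecordingCanvas is a hypothetical stand-in for the Tk canvas that just remembers what was drawn, make_handlers is an illustrative name, and max(0, ...) is used in place of the if statement above, with the same effect.

```python
SCROLL_STEP = 100

class RecordingCanvas:
    """Hypothetical stand-in for a Tk canvas; it records drawn text."""
    def __init__(self):
        self.items = []

    def delete(self, tag):
        self.items.clear()          # only 'all' is modelled here

    def create_text(self, x, y, text, font, anchor):
        self.items.append((x, y, text))

def make_handlers(canvas, display_list):
    scrolly = 0

    def render():
        canvas.delete('all')        # erase the old text before redrawing
        for x, y, word, font in display_list:
            canvas.create_text(x, y - scrolly, text=word, font=font, anchor='nw')

    def scrolldown(e):
        nonlocal scrolly
        scrolly += SCROLL_STEP
        render()

    def scrollup(e):
        nonlocal scrolly
        scrolly = max(0, scrolly - SCROLL_STEP)   # never scroll above the page
        render()

    render()
    return scrolldown, scrollup

canvas = RecordingCanvas()
scrolldown, scrollup = make_handlers(canvas, [(10, 10, "Hi", None)])
scrolldown(None)
print(canvas.items)   # the word is now drawn 100 pixels higher
scrollup(None)
scrollup(None)        # the second call is clamped at the top of the page
print(canvas.items)
```

In the real browser the two returned functions would be the ones passed to window.bind for the "Down" and "Up" keys.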

Summary

The last post introduced a browser that could draw a single line of text. Now we've added a second dimension to our browser:

  • Text is laid out in multiple lines
  • Paragraphs are separated from one another
  • Spacing rules are obeyed
  • You can scroll up and down to see more text

The browser is now good enough to read an essay or a blog, even if its stylistic capabilities are a little limited.

Assignments

Take the browser implementation you have from the previous post and make the changes described in this post. Your browser should now be able to lay out multiple lines of text, broken into paragraphs, and scroll the page up and down.

At this point, your browser implementation should contain the following functions:

parse(url)
Takes in a string and returns the scheme, the host, the port, the path, the query, and the fragment (all strings except the port, which is an integer)
request(host, port, path)
Takes in a string host, a numeric port, and a string path, and returns the headers (as a dictionary) and the page contents (as a string)
lex(body)
Takes in a string and returns a list of tokens, which are either Text or Tag (both of which are wrappers around a string)
layout(tokens)
Takes in a list of tokens, and produces a list of rendering commands, which are an x position, a y position, a font, and a string of text to draw
show(tokens)
Creates a GUI and draws the page to it, using layout(tokens) and helper functions render(), scrollup(), and scrolldown()

Now make the following improvements to the code:

  • Right now, if you have a heading (<h1>) followed by a paragraph, the heading just becomes part of the first line of the paragraph. Put headings on their own line.
  • Add support for the <pre> tag. Unlike normal paragraphs, text inside <pre> tags shouldn't break lines, and whitespace like spaces and newlines should be preserved. You should also use the same font for text in <pre> tags as you use for text in <code> tags.
  • Add support for the <small> tag. Text surrounded by this tag should use 12 pixel Times (instead of 16 pixel Times). Make sure that when small text is mixed with normal-sized text, like in "A <small>little</small> thing", the small text looks like it lines up with the normal-sized text (the point on the text where the ascent and the descent meet should be at the same y position for both the small and the normal-sized text).
  • Add support for the <sup> tag. Text surrounded by this tag should use small text, but it should be placed higher, like a superscript. (Raising the text by half the ascent height of normal text should be sufficient.)
  • Add support for the <big> tag. Text surrounded by this tag should use 20 pixel Times (instead of 16 pixel Times). Make sure that when text sizes are mixed, like in "A <big>huge</big> deal", the big text looks like it lines up with normal-sized text. This is much harder than implementing <small>: make sure that multiple lines of <big> text don't overlap.

Footnotes:

1

In the olden days of typewriters, these would be separate operations: to move down the page you would feed in a new line, and then you'd return the carriage that printed letters to the left of the page. When ASCII was standardized, they added separate characters for these operations: CR and LF. On Windows, it is still standard to indicate new lines with a CR and then an LF, even though computers have nothing mechanical inside that necessitates separate operations.

2

This kind of misdirection is pretty common. The advertised pixel size describes the font's ascent, not its full size. The ascent for this font is actually 15 pixels; 16 pixels is how big the font “feels”. It's like clothing sizes.

3

Designers say the text is too “tight”.

4

So named because in metal type days, the little pieces of metal that were placed between lines were made of lead. Lead is a softer metal than what the actual letter pieces were made of, so it could compress a little to keep pressure on the other pieces. Pronounce it “led-ing” not “leed-ing”.

5

Note that tokens should never contain an empty string of text, so the [-1] access is valid.

6

You could probably do this second increment and terminal_space reset in the open tag <p> instead of the close tag </p> like here, but on many pages it would lead to pretty weird-looking output because many pages have elements other than <p> that go between paragraphs (like headings).

7

The nonlocal scrolly line is something Python-specific. It tells Python that when we increment scrolly, that's not some local variable that we're modifying, it's a variable defined in some larger scope.