By Pavel Panchekha

Share under CC-BY-SA.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

How Browsers Draw Text on the Screen

Series

This is post 2 of the Let's Build a Web Browser series.

Once a web browser has downloaded a web page, it has to show that web page to the user. Since we are not savages,[1] browsers use a graphical user interface to draw web pages in a mix of fonts, colors, and styles. How do they do that? In short, a browser talks to the operating system to create a window, walks through the page's HTML, and draws text in that window, changing styles as the HTML tags dictate.

Creating windows

On desktop and laptop computers, users run operating systems that provide desktop environments containing windows, icons, menus, and a pointer.[2] This desktop environment is provided by some component of the operating system, which handles important jobs like keeping track of the pointer and where the windows are, talking to applications to get window contents and tell them about clicks, and pushing pixels to the screen.

In order to draw anything on the screen, a program has to communicate with this operating system component. This communication usually involves:

  • Asking the OS to allocate space for a new window and track it
  • Keeping track of some kind of identifier for this window
  • Acting on messages from the OS about keyboard and mouse events
  • Redrawing the window contents periodically[3]

Since doing all of this by hand is a bit of a drag, it is usually wrapped up in libraries called graphical toolkits. Python comes with one built in, called tkinter.[4] Using it is quite simple:

import tkinter
window = tkinter.Tk()
tkinter.mainloop()

Here, after importing the library, we call tkinter.Tk() to communicate with the OS in order to create a window on the screen. The OS responds with an identifier for that window that we can use in future communication with the OS. That identifier is stored inside the Tk object that we assign to window.

Then, the final line starts the main loop. This is an important and general pattern for all graphical applications, from web browsers to video games. Internally, the main loop looks like this:[5]

while True:
    for evt in pendingEvents():
        handleEvent(evt)
    drawScreen()

Our simple window above does not need to do a lot of event handling (it ignores all events) and it does not need to do a lot of drawing (on my computer it is a uniform gray). But when graphical applications get more complex, having a main loop is a good way to make sure that all events are eventually handled and the screen is eventually updated, which is essential to a good user experience.
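
To make the loop above concrete without opening a window, here is a minimal sketch that simulates it with an ordinary queue. The names run_main_loop, pending, and max_frames are invented for this illustration and are not part of Tk; a real toolkit keeps looping until the window closes rather than for a fixed number of frames.

```python
from collections import deque

def run_main_loop(pending, handle_event, draw_screen, max_frames):
    """Toy main loop: each frame, handle every pending event, then redraw."""
    for _ in range(max_frames):      # hypothetical cap so the demo terminates
        while pending:
            handle_event(pending.popleft())
        draw_screen()

handled = []   # events seen so far, in order
frames = []    # how many events had been handled when each frame was drawn

events = deque(["click", "keypress"])
run_main_loop(
    events,
    handled.append,                       # "handling" an event just records it
    lambda: frames.append(len(handled)),  # "drawing" just records a frame
    max_frames=2,
)
```

Both events are handled before the first redraw, and the second frame is drawn even though nothing new happened, which is the behavior the main loop is meant to guarantee.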

Drawing to the window

A graphical application extends the handleEvent and drawScreen functions to draw interesting stuff on the screen and react when the user clicks on that stuff. Let's start by drawing some text on the screen.

We are going to draw text on the screen using a canvas,[6] a rectangular region of the window in which we can draw circles, lines, and text. Tk also has higher-level abstractions like buttons and dialog boxes. While these abstractions are useful for many applications, we won't be using them: web pages have a lot of control over how they look, control that Tk's higher-level abstractions don't provide. (This is why desktop applications look much more uniform than web pages do: desktop applications are generally written using the abstractions provided by an operating system's most common graphical toolkit, which limits their creative possibilities.)

To create a canvas in Tk, we insert the following code between the tkinter.Tk() call and the tkinter.mainloop() call:

canvas = tkinter.Canvas(window, width=800, height=600)
canvas.pack()

The first line creates a Canvas object inside the window we already created. We pass it some arguments that define its size; I chose 800×600 because that was a common old-timey monitor size.[7] The second line is something particular to Tk, which requires us to call pack on widgets like a canvas to position them inside their parent (the window).

Adding these two lines won't yet change how the window appears, since we haven't drawn anything to the canvas. To do that, you can call methods on the canvas whose names begin with create_:

canvas.create_rectangle(10, 20, 400, 300)
canvas.create_oval(100, 100, 150, 150)
canvas.create_text(200, 150, text="Hi!")

You ought to see a rectangle, starting near the top-left corner of the canvas and ending at its center, then a circle inside that rectangle, and then the text “Hi!” next to the circle.

Play with some of the arguments to those methods—which coordinate does each number refer to? Check that you got it right against online documentation. It is important to remember that coordinates in Tk, like (10, 20), refer first to X position from left to right and then to Y position from top to bottom. This means that lower on the screen has a larger Y value, the opposite of what you might be used to from math.
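
If you find yourself thinking in math-style coordinates, where Y grows upward, a small conversion helper can keep the flip in one place. This is only a sketch: math_to_canvas and CANVAS_HEIGHT are names made up here, and the height assumes the 600-pixel canvas created above.

```python
CANVAS_HEIGHT = 600  # assumed to match the height passed to tkinter.Canvas

def math_to_canvas(x, y, height=CANVAS_HEIGHT):
    """Convert a point whose Y grows upward into Tk's Y-grows-downward system."""
    return x, height - y

bottom_left = math_to_canvas(10, 0)    # bottom edge of the canvas
near_top = math_to_canvas(10, 580)     # 20 pixels below the top edge
```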

What is a font?

At a basic level, writing text to a canvas just requires calling the create_text function and giving it two coordinates. That works if you don't care much about the font, or the size, or the color, or the exact position of the text. When you do care about those things, you need to create and use font objects.

What is a font, exactly? In the olden days, to print text on paper, you grabbed little metal shapes and arranged them together, then covered them with ink and pressed them to a sheet of paper. The metal shapes came in boxes, one per letter, so you'd have a (large) box of e’s, a (small) box of x’s, and so on. The set of all of the boxes was called a font. Naturally, if you wanted to print larger text, you needed different (bigger) shapes, so those were a different font. If you had several fonts, of different sizes but the same general shape, that was called a typeface. Then, you might also have several different font faces that were related, and that was called a type (so that a “typeface” was one of the possible “faces” of the “type”).

This nomenclature was important because it reflected the fundamental constraints of working with little pieces of metal: there were lots of boxes, the boxes were in cases (hence lower- and uppercase letters), the cases were on shelves, they came in different types, and so on. In the shiny modern world, you can use the word font to refer to fonts, typefaces, or types, because the distinctions don't matter much.[8] So nowadays, a font contains several different weights (like “bold” and “normal”),[9] several different styles (like “italic” and “roman”, which is what not-italic is called),[10] and can be rendered at an arbitrary size.[11]

In Tk, you can create font objects, which correspond to what an old-timey designer would call a font: a type at a fixed size, style, and weight. For example:

import tkinter.font
TEXT_SIZE = 16
font_bi = tkinter.font.Font(family="Times", size=TEXT_SIZE, weight="bold", slant="italic")

Once you have a font, you can use it with create_text using the font keyword argument:

canvas.create_text(200, 100, text="Hi!", font=font_bi)

Laying out text horizontally

Text takes up space vertically and horizontally. In Tk, there are two functions that measure this space:

>>> font_bi.metrics()
{'ascent': 15, 'descent': 7, 'linespace': 22, 'fixed': 0}
>>> font_bi.measure("Hi!")
31

The metrics() call gives information about the vertical spacing of the text: the linespace is how tall the text is, which includes an ascent that goes “above the line” and a descent that goes “below the line”.[12] You end up caring about the ascent and descent if you have text of different sizes on the same line: you want them to line up “on the line”, not along their tops or bottoms.
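
As a rough sketch of what lining text up “on the line” involves, the helper below works out where the top of each piece of text should go so that their baselines coincide. The function name and the second set of metrics are made up for illustration; the dictionaries merely have the same shape as what metrics() returns.

```python
def top_y_for_baseline(baseline_y, metrics):
    """Return the y of the top of a text run whose baseline sits at baseline_y.

    metrics is a dict shaped like the result of Font.metrics().
    """
    return baseline_y - metrics["ascent"]

big = {"ascent": 15, "descent": 7, "linespace": 22, "fixed": 0}    # values shown above
small = {"ascent": 9, "descent": 3, "linespace": 12, "fixed": 0}   # made-up smaller font

baseline = 100
big_top = top_y_for_baseline(baseline, big)
small_top = top_y_for_baseline(baseline, small)
```

The smaller text starts lower down (a larger y) than the bigger text, so the two share a baseline instead of a top edge. These are top-edge positions, so they are meant to be used with the anchor option to create_text that is introduced later in this post.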

On the other hand, the measure() call tells you about the horizontal space the text takes up. This obviously depends on what text you're rendering, since different letters have different widths:[13]

>>> font_bi.measure("H")
17
>>> font_bi.measure("i")
6
>>> font_bi.measure("!")
8
>>> 17 + 6 + 8
31

You can use this information to lay text out on the page. For example, suppose you want to draw the text “Hello, world!” in two pieces, so that “world!” is italic. You can do:

font1 = tkinter.font.Font(family="Times", size=TEXT_SIZE)
font2 = tkinter.font.Font(family="Times", size=TEXT_SIZE, slant="italic")
x = 200
y = 200
canvas.create_text(x, y, text="Hello, ", font=font1)
x += font1.measure("Hello, ")
canvas.create_text(x, y, text="world!", font=font2)

This should work, giving you nicely aligned “Hello,” and “world!”, with the second italicized.

There is a hidden bug in this code, however, that happens not to occur for “Hello, world!”. Replace, for example, “world!” with “overlapping!”: you'll find that the two words overlap. That's because the coordinates x and y that you pass to create_text tell Tk where to put the center of the text. So, instead of incrementing x by the length of “Hello,”, you need to increment it by half the length of “Hello,” and half the length of “overlapping!”. It only worked for “Hello, world!” because those two halves happen to be the same length!
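
The center-based arithmetic described above can be written out as follows. The helper name and the pixel widths are invented for the example; real widths would come from measure().

```python
def next_center_x(prev_center_x, prev_width, next_width):
    """x of the center of the next run, placed flush right of the previous run."""
    return prev_center_x + prev_width / 2 + next_width / 2

# When both runs are equally wide, the naive "x += width" happens to agree...
same = next_center_x(200, 60, 60)     # naive answer: 200 + 60 = 260
# ...but with a wider second run, the naive answer is too small and the runs overlap.
wider = next_center_x(200, 60, 110)   # naive answer would still be 260
```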

Instead of doing this more complicated math, we can instruct Tk to treat the coordinate we gave as the top-left corner of the text using the anchor argument:

font1 = tkinter.font.Font(family="Times", size=TEXT_SIZE)
font2 = tkinter.font.Font(family="Times", size=TEXT_SIZE, slant="italic")
x = 200
y = 225
canvas.create_text(x, y, text="Hello, ", font=font1, anchor='nw')
x += font1.measure("Hello, ")
canvas.create_text(x, y, text="overlapping!", font=font2, anchor='nw')

The anchor argument here is set to nw, meaning the “northwest” or top left corner of the text.
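
The same pattern generalizes to any number of pieces. Here is a small sketch of a helper that draws a list of (text, font) pairs from left to right. The name draw_runs is made up, and FakeFont and FakeCanvas are stand-ins so that the logic can be checked without opening a window; in real use you would pass a tkinter.Canvas and tkinter.font.Font objects instead.

```python
def draw_runs(canvas, runs, x, y):
    """Draw (text, font) pairs left to right from (x, y); return the final x."""
    for text, font in runs:
        # 'nw' makes (x, y) the top-left corner of this piece of text
        canvas.create_text(x, y, text=text, font=font, anchor="nw")
        x += font.measure(text)
    return x

class FakeFont:
    """Stand-in font where every character has the same made-up width."""
    def __init__(self, char_width):
        self.char_width = char_width
    def measure(self, text):
        return self.char_width * len(text)

class FakeCanvas:
    """Stand-in canvas that records calls instead of drawing."""
    def __init__(self):
        self.calls = []
    def create_text(self, x, y, **options):
        self.calls.append((x, y, options["text"]))

canvas = FakeCanvas()
runs = [("Hello, ", FakeFont(8)), ("overlapping!", FakeFont(9))]
end_x = draw_runs(canvas, runs, 200, 225)
```

Because each piece starts exactly where the previous one ended, the relative lengths of the pieces no longer matter.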

Styling text

We can use this technique to turn the text-only browser we built last time into a graphical browser. Instead of calling print to print each letter, call create_text, as above, and then increment the x position:

font = tkinter.font.Font(family="Times", size=16)
x, y = 0, 0
in_angle = False
for c in body:
    if c == "<":
        in_angle = True
    elif c == ">":
        in_angle = False
    elif not in_angle:
        canvas.create_text(x, y, text=c, font=font, anchor='nw')
        x += font.measure(c)

Your code will look a little different if you've done the assignments from last time, because it will contain additional code to avoid printing text outside of the <body> tag. You'll need to adapt what you see above to apply the same changes to your code.

If you run this code, you'll probably feel that the text is uncomfortably close to the edge of the window. Let's fix that for now by adding some padding around the edges of the canvas:

PADDING = 8
x, y = PADDING, PADDING

Now that we are drawing the text on the page graphically, let's try to draw different parts of the text in different styles. As we discussed, this means using different font objects for different parts of the text. Let's have four different styles, corresponding to bold/normal and italic/roman choices:

fonts = { # (bold, italic) -> font
    (False, False): tkinter.font.Font(family="Times", size=TEXT_SIZE),
    (True, False): tkinter.font.Font(family="Times", size=TEXT_SIZE, weight="bold"),
    (False, True): tkinter.font.Font(family="Times", size=TEXT_SIZE, slant="italic"),
    (True, True): tkinter.font.Font(family="Times", size=TEXT_SIZE, weight="bold", slant="italic"),
}

This dictionary maps pairs of booleans to font objects; the first element of the pair tells you whether or not the font is bold and the second tells you whether or not it is italic. So we can have two variables, bold and italic, and use them to select the font to use when printing some text:

canvas.create_text(x, y, text=c, font=fonts[bold, italic], anchor='nw')
x += fonts[bold, italic].measure(c)
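
If typing out all four combinations feels repetitive, the table can also be generated. The sketch below is one possible way to do it; make_font_table is an invented name, and the font constructor is passed in as a parameter so the logic can be tried without a window. In the browser itself you would call make_font_table(tkinter.font.Font, "Times", TEXT_SIZE) after the tkinter.Tk() call, since Tk fonts need a window to exist first.

```python
import itertools

def make_font_table(make_font, family, size):
    """Build the (bold, italic) -> font table by calling make_font four times."""
    table = {}
    for bold, italic in itertools.product([False, True], repeat=2):
        table[bold, italic] = make_font(
            family=family,
            size=size,
            weight="bold" if bold else "normal",
            slant="italic" if italic else "roman",
        )
    return table

# Stand-in constructor that just returns its keyword arguments as a dict
demo_table = make_font_table(lambda **options: options, "Times", 16)
```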

We can update those variables every time we see a tag. Remember how you used the tag variable to update whether or not you were between <body> tags? Now we are going to do the same thing to update bold and italic:

elif c == ">":
    if tag == "i":
        italic = True
    elif tag == "/i":
        italic = False
    elif tag == "b":
        bold = True
    elif tag == "/b":
        bold = False
    # update in_body...
    in_angle = False

Note that this code correctly handles not only <b>bold</b> and <i>italic</i> text, but also <b><i>bold italic</i></b> text. But note also that it doesn't handle <b>accidentally <b>double</b> bolded</b> text.
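
Both behaviors can be checked with a stripped-down version of the loop that records styles instead of drawing. This is a sketch under some assumptions: styled_chars is a made-up name, the tag variable is assumed to be built up by appending characters while inside angle brackets, and the <body> tracking from last time is left out.

```python
def styled_chars(body):
    """Return (char, bold, italic) for every character outside of a tag."""
    out = []
    bold = italic = False
    in_angle = False
    tag = ""
    for c in body:
        if c == "<":
            in_angle = True
            tag = ""            # start collecting a new tag name
        elif c == ">":
            if tag == "i":
                italic = True
            elif tag == "/i":
                italic = False
            elif tag == "b":
                bold = True
            elif tag == "/b":
                bold = False
            in_angle = False
        elif in_angle:
            tag += c
        else:
            out.append((c, bold, italic))
    return out

nested = styled_chars("a<b>b<i>c</i></b>")
doubled = styled_chars("<b>x<b>y</b>z</b>")
```

In doubled, the “z” comes out as not bold: the first </b> switches bold off even though the outer <b> is still open.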

Summary

The last post built a simple, purely command-line browser. Now we've significantly upgraded it by introducing a rudimentary graphical user interface, which can:

  • Create a graphical window
  • Lay out a line of text
  • Understand basic HTML tags like <b> and <i>
  • Draw styled text in different fonts

You might say that the browser is still a little one-dimensional, but it forms the graphical core we will be building on.

Assignment

Take the browser implementation you have after completing the assignments from the previous post, and make the modifications described in this post.

You should be able to call your browser from the command line with a URL, and see the first line of text from that page in a graphical window. If that line contains any bold or italic text, you should see that displayed correctly.

At this point, the show function is getting pretty complicated, so let's separate it into two pieces:

lex(body)
Takes in a string containing HTML and returns a list of tokens. Every token is either a Text object (for a run of characters outside a tag) or a Tag object (for the contents of a tag). You'll need to write the Text and Tag classes.[14]
gui(tokens)
Takes in a list of tokens, creates a window and a canvas, and draws the tokens to the canvas.

It should be possible to string these functions together like so:

tokens = lex(body)
gui(tokens)
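
As a starting point, here is one possible shape for the token classes, using namedtuple. The field names text and tag are assumptions made for this sketch, and the token list is written out by hand rather than produced by a real lex function.

```python
from collections import namedtuple

Text = namedtuple("Text", ["text"])   # a run of characters outside any tag
Tag = namedtuple("Tag", ["tag"])      # the contents of a tag, without < and >

# What lexing "Hi <b>there</b>" might produce, written out by hand
tokens = [Text("Hi "), Tag("b"), Text("there"), Tag("/b")]

def visible_text(tokens):
    """Join the text tokens together, skipping tags."""
    return "".join(tok.text for tok in tokens if isinstance(tok, Text))

plain = visible_text(tokens)
```

Note that namedtuples compare like ordinary tuples, so Text("b") == Tag("b") is true; use isinstance, as above, to tell the two kinds of token apart.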

Finally, make some improvements to the code:

  • Look through the options you can pass to the Canvas constructor. Change the canvas to have a white background and give it a red border. (This will help you see where the borders of the page are.)
  • Add support for <code> tags. Text enclosed in those tags should use a different font like Courier New or SFMono (choose a font that looks like code for you).
  • Add support for drawing text in a tags in blue. You can change the color of text using the fill argument to create_text. Be careful: a tags, which represent links, usually have attributes, so you need to handle a tags with attributes correctly. Underline the text (in blue) using create_line.
  • Change the default font to be a code font when the content type is not text/html. Keep text/html's default font as Times.
  • Change the bold and italic state variables to numbers that you increment and decrement on open and close tags, so that <b><b>double</b> bold</b> renders with both “double” and “bold” bolded.
  • Change the gui function not to call create_text for whitespace characters like spaces.
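
For the tags-with-attributes item above, one simple approach (a sketch, not the only way) is to treat everything up to the first whitespace as the tag's name. The helper name tag_name is made up for this example.

```python
def tag_name(tag_contents):
    """Return the name of a tag, ignoring any attributes that follow it."""
    parts = tag_contents.split()
    # HTML tag names are case-insensitive, so normalize to lowercase
    return parts[0].lower() if parts else ""

link_open = tag_name('a href="http://example.org/"')
link_close = tag_name("/a")
```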

Footnotes:

1

For most of 2011, I used the command-line browser w3m as my main browser. It built character.

2

This is sometimes acronym'd into “WIMP environment”, possibly a snide dig from terminal diehards.

3

Applications have to redraw the window sixty times per second or so for interactions to feel fluid. On older systems, applications drew directly to the screen, and if they didn't update, whatever was there last would stay in place, which is why in error conditions you'd often have one window leave “trails” on another. Modern systems use a technique called compositing to avoid this (at the cost of using more memory), but even so, applications that change need to redraw the screen within a sixtieth of a second.

4

The library is called tk and it was first written for a different language called Tcl. Python contains an interface to it, hence the name.

5

This happens in the function Tk_UpdateObjCmd in tkCmds.c in the Tcl/Tk source code. That code is more complex to handle interrupts and errors.

6

You may be familiar with the HTML <canvas> element. Different environment (inside web pages, not browsers) but same idea: a 2D rectangle where you can draw shapes.

7

This size, called Super Video Graphics Array, was standardized in 1987, and probably seemed super then.

8

The term “font family” was invented to specifically refer to types, and now has also become confusing and blurry.

9

But sometimes other weights as well, like “light”, “semibold”, and “black”; some fonts also come in different widths, like “condensed”.

10

Sometimes there are other variations as well, such as a small-caps version; these are sometimes called options.

11

But usually the font looks especially good at certain sizes where hints tell the computer how to best resize the font to a particular pixel size.

12

The fixed parameter is actually a boolean and tells you whether all letters are the same width, so it doesn't really fit here.

13

Note that that summation at the end doesn't always work out so neatly, where the width of a word is the sum of the widths of its letters. That's because Tk always returns whole pixels, but internally might do some rounding, plus some fonts use something called kerning to shift letters a little bit when particular pairs of letters are next to one another.

14

If you're familiar with Python, you might want to use the namedtuple helper in the collections library, which makes it really easy to define these sorts of very simple classes.