Pavel Panchekha

Share under CC-BY-SA.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Let's Build a Web Browser

At ICFP this year, James and I were discussing the point of "systems" courses like OS, Databases, Distributed Systems, and Compilers. Among the many reasons to take these courses (a focus on performance, learning the low-level APIs, practice writing C, knowing your stack, writing better C/SQL/network code, and of course the importance of these systems in your ordinary computing experience), one reason stood out: the "mystery" students attach to these systems.

That air of mystery leaves students powerless over part of their computing environment. Even though no class project will rival Linux, Postgres, or LLVM, these courses replace magic smoke with concrete and relatable code, architecture, and abstractions. The courses are a success if Linux, Postgres, and LLVM look like a long series of improvements, additions, and optimizations added to a conceptually simple core.

But one commonly used platform has no associated systems course and keeps its air of mystery: the web browser. I know this from speaking to industry programmers, students, and faculty. So, let's build a web browser!

Goals

This series will walk through building a basic but complete web browser, including networking, GUI, parsing, CSS, and JavaScript. I'll be using Python 3 with as few dependencies as possible, but you can follow along in a language of your choice. You should be able to go through every part except the last with any language that provides a simple graphics library and the basic POSIX APIs.

At every step in the series, we will have a “working” web browser, and every step will tackle one of that browser's glaring problems. This way, there is always working and useful code, and you'll be able to see how to grow and improve a large piece of software.[1]
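To give a sense of how small a first "working" browser can be, here is a minimal sketch that fetches a page over plain HTTP and prints its text with the tags removed. It is an illustration only, not code from the series: the function names (`parse_url`, `build_request`, `strip_tags`, `load`) are hypothetical, and it assumes `http://` URLs, HTTP/1.0, and UTF-8 content. Its glaring problems (no HTTPS, no layout, no handling of entities or scripts) are exactly the kind of thing each later step would tackle.

```python
import socket


def parse_url(url):
    """Split an http:// URL into (host, port, path). Only plain HTTP is handled."""
    scheme, sep, rest = url.partition("://")
    if sep == "" or scheme != "http":
        raise ValueError("this sketch only understands http:// URLs")
    host, _, path = rest.partition("/")
    path = "/" + path
    port = 80  # default HTTP port unless the URL names another one
    if ":" in host:
        host, port_text = host.split(":", 1)
        port = int(port_text)
    return host, port, path


def build_request(host, path):
    """Build a minimal HTTP/1.0 GET request as bytes."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def strip_tags(body):
    """Return the text of an HTML string, dropping everything inside <...>."""
    text = []
    in_tag = False
    for ch in body:
        if ch == "<":
            in_tag = True
        elif ch == ">":
            in_tag = False
        elif not in_tag:
            text.append(ch)
    return "".join(text)


def load(url):
    """Fetch a URL over a raw socket and return the page text without tags."""
    host, port, path = parse_url(url)
    with socket.create_connection((host, port)) as conn:
        conn.sendall(build_request(host, path))
        response = b""
        # HTTP/1.0 servers close the connection when the response is complete.
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            response += chunk
    # Headers and body are separated by a blank line; headers are ignored here.
    _headers, _, body = response.partition(b"\r\n\r\n")
    return strip_tags(body.decode("utf8", errors="replace"))
```

Calling `print(load("http://example.org/"))` would fetch that page and print its text, given network access. Keeping URL parsing, request building, and tag stripping as separate functions is one possible design; it lets each piece be replaced independently as the browser grows.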

Since it is intended to be pedagogical, the browser will not attempt to be standards-conformant and will be quite restrictive in the HTML, CSS, and JavaScript it can handle. It also won't handle errors gracefully, be resilient against malicious inputs, or be fast—all important goals for a real browser, but ones impossible to meet given the time constraints of a course. However, the overall architecture will mostly match real browsers. Swapping out some components for real libraries could be the goal of a student project.

I hope to turn this series into a course. If you use these materials, let me know.

Posts

The posts in this series are meant to be read in order and describe how browsers:

Acknowledgements

Thank you to James R. Wilcox, with whom I came up with the idea for this course, and who helped develop the sequence of posts; to Max Willsey, who proofread each post; and to Zach Tatlock, who encouraged me to develop the course.

Footnotes:

[1] This idea is from James, who was inspired by Steve Z's compilers course.