Pavel Panchekha


Shared under CC-BY-SA.

What is the Complexity of Selector Matching?

CSS selectors are part of the CSS language, which is used for styling web pages. CSS stylesheets are made up of rules, where each rule has a selector that defines which elements on the page the rule applies to. When the browser loads a stylesheet, it needs to match those selectors against the elements on the page to determine which rules apply to which elements. Here's my question: what is the computational complexity of this matching step?


Here I'm considering the matching step in isolation from every other part of the browser. This means that its input is a parsed HTML tree and a list of parsed CSS selectors, and its output is, for each HTML element, the subset of selectors that match it. If we have \(n\) HTML elements and \(r\) CSS selectors, the size of the output is therefore \(O(n r)\) in the worst case. Meanwhile the input has size \(O(n + r)\).

You can definitely construct cases where the output is \(O(n r)\), for example by assigning each element many classes and matching all those classes with various rules. That means it's hard to do better than a worst-case runtime of \(O(n r)\). Now, that's not a proof that \(O(n r)\) is a lower bound. Typically, relatively few selectors match any given element, and it's possible that we should be able to handle those cases faster; but for now, let's just focus on whether \(O(n r)\) is achievable. Likewise, the step after selector matching is style computation, which produces a result of size \(O(n)\). So you might wonder whether, by combining selector matching and style computation, you could go faster, basically by avoiding building an \(O(n r)\) intermediate result. I won't think about this, though, because selector matching is already complicated enough. Finally, a browser actually spends a lot of time handling the incremental version of selector matching: re-matching selectors after the HTML tree (or the CSS stylesheet) is modified in a small way. I'm not going to cover that here (but a blog post about it is coming).

So the new question is: can we match all selectors against all nodes in \(O(n r)\) time? One way to think of this is as "amortized constant time" matching, in the sense that we have \(O(n r)\) outputs computed in \(O(n r)\) time, which is like \(O(1)\) time per output.


A naive implementation of selector matching is not optimally efficient, basically due to the descendant selector. A descendant selector is something like body p, which matches a <p> element that is a descendant (not necessarily a child) of a <body> element. To match a selector like this against an element, you first check whether that element is a <p> tag and, if it is, walk up the tree from that element, checking whether each ancestor is a <body> element. The issue with this naive implementation is that a tall, skinny tree can have depth \(O(n)\), and therefore this process can take \(O(n)\) time for a single node and a single rule. This leads to an overall runtime of \(O(n^2 r)\).
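To make that cost concrete, here's a minimal sketch of the naive approach in Python. The Node class and naive_match function are hypothetical names, just for illustration:

```python
class Node:
    """A bare-bones HTML element: a tag name and a parent pointer."""
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent

def naive_match(node, selector):
    """Match a descendant selector, given as a list like ["body", "p"],
    against one node by walking up the tree: O(depth) per node."""
    if node.tag != selector[-1]:
        return False
    rest = selector[:-1]
    ancestor = node.parent
    while rest and ancestor is not None:
        if ancestor.tag == rest[-1]:
            rest = rest[:-1]  # this ancestor matches the next entry leftward
        ancestor = ancestor.parent
    return not rest

# A tall, skinny tree: every call walks the entire chain of <div>s.
root = Node("body")
deep = root
for _ in range(1000):
    deep = Node("div", deep)
leaf = Node("p", deep)
print(naive_match(leaf, ["body", "p"]))  # True, after ~1000 ancestor steps
```

Matching every node of such a tree against even one rule costs \(O(n)\) per node, which is where the \(O(n^2 r)\) worst case comes from.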

Actual browsers do not have this behavior, however. I tested Google Chrome against a page consisting of \(n\) nested <div> elements, with the bottom-most <div> containing \(n\) <p> elements. Using the Chrome developer tools, I got the following timings:

N        body p    body p:nth-child(even)
5        0.3       0.2
50       0.6       0.7
500      14.4      13.0
5000     141.1     124.2
50000    1405      1784.6

Note that the timings increase linearly, both for body p and body p:nth-child(even). Now, I'm not sure what Chrome is doing that causes a linear increase. It could be using a clever algorithm, like the one I describe below. But it could also be using some kind of clever cache, which is why I tried both the normal and the even variation: I was hoping that if there was a cache of some kind, the even version would be too complicated to cache. But that didn't happen. Worth investigating.

So is there an algorithm that is guaranteed to work? Actually, yes. It requires a powerful insight: instead of thinking one node at a time, we must think about the whole tree at once. When you are looking at a single node, your only choice is to start at the right-hand side of the selector and walk up the tree and left along the selector at the same time. But if you think about the whole tree at once, you can instead walk down the tree and right along the selector.

Here's a more rigorous description. Consider a fixed selector a b c ... f. Start at the top of the HTML tree and set your current state to 0, meaning you're currently looking for the first entry in the selector. At each node, check the current node against the entry you're looking for in the selector. If it matches, recurse into each child with the state incremented; otherwise, recurse into each child with the state unchanged. Instead of incrementing the state past the last entry in the selector, mark the current node as matching the selector. This does a single top-down traversal of the HTML tree, so it necessarily runs in \(O(n)\) time for one selector, or \(O(n r)\) time across all \(r\) selectors.
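Here's a sketch of that traversal in Python, assuming selectors are plain chains of tag names joined by descendant combinators (the Node class and function names are made up for illustration):

```python
class Node:
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

def match_descendant_selector(root, selector):
    """One top-down pass; the state counts how many selector entries
    have matched along the path from the root. O(n) for one selector."""
    matches = []
    def walk(node, state):
        if node.tag == selector[state]:
            if state + 1 == len(selector):
                matches.append(node)  # matched the last entry: don't advance
            else:
                state += 1  # advance; children look for the next entry
        for child in node.children:
            walk(child, state)
    walk(root, 0)
    return matches

tree = Node("body", [Node("div", [Node("p"), Node("p")]), Node("p")])
print([n.tag for n in match_descendant_selector(tree, ["body", "p"])])
# → ['p', 'p', 'p']
```

Greedily advancing the state is safe for descendant combinators, because along any root-to-node path the longest matched prefix of the selector is exactly what the greedy state tracks.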

Alternatively, you can think of this algorithm as a little state machine, which travels down the tree from parent to child and transitions according to the following table (here for the selector a b c d f):

State   Meaning       a    b    c    d    f
0       *             1    0    0    0    0
1       a *           1    2    1    1    1
2       a b *         2    2    3    2    2
3       a b c *       3    3    3    4    3
4       a b c d *     4    4    4    4    (4)

Here the bottom-right-most state transition, (4), means that children stay in state 4 but that the current node is marked as matching.

A similar challenge, and a similar solution, apply to the ~ selector, which is like the descendant selector but matches arbitrary later siblings instead of arbitrary descendants.
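For ~ the analogous trick is a single left-to-right pass over each list of siblings, carrying a bit of state, instead of scanning all earlier siblings for every node. A toy sketch for tag-only selectors (hypothetical names):

```python
class Node:
    def __init__(self, tag):
        self.tag = tag

def match_general_sibling(siblings, left_tag, right_tag):
    """Match 'left ~ right' in one pass: once some earlier sibling had
    left_tag, every later sibling with right_tag matches."""
    matches = []
    seen_left = False
    for sib in siblings:
        if seen_left and sib.tag == right_tag:
            matches.append(sib)
        if sib.tag == left_tag:
            seen_left = True  # checked second: a node isn't its own earlier sibling
    return matches
```

Run over each node's child list during the same top-down traversal, this stays within the \(O(n)\) budget per selector.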


The :nth-child selector I used above is also tricky to handle. The argument of :nth-child is a kind of pattern that applies to the element's index within its parent. So :nth-child(even) selects the even-numbered children of a parent, while :nth-child(7n+3) selects children whose index is three modulo seven. Doing this efficiently means that, when we recurse into a child, we must carry along the index at which that child appears. Otherwise, we'd need to search the parent for the child, which can take \(O(n)\) time.
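Here's a sketch of both pieces: checking an an+b pattern against a 1-based index in \(O(1)\), and carrying the index down with the recursion (names are hypothetical; tag checks omitted for brevity):

```python
class Node:
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

def nth_child_matches(index, a, b):
    """True if the 1-based sibling index equals a*n + b for some n >= 0.
    :nth-child(even) is a=2, b=0; :nth-child(7n+3) is a=7, b=3."""
    if a == 0:
        return index == b
    n, remainder = divmod(index - b, a)
    return remainder == 0 and n >= 0

def collect_nth_children(root, a, b):
    """Carry each child's index down with the recursion, so we never
    have to search the parent's child list."""
    matches = []
    def walk(node, index):
        if nth_child_matches(index, a, b):
            matches.append(node)
        for i, child in enumerate(node.children, start=1):
            walk(child, i)
    walk(root, 1)
    return matches
```

Because divmod uses floor division, the same check also handles negative steps like :nth-child(-n+3).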


Another tricky kind of selector is the :has selector. A selector like a:has(b) is similar to the selector a b, except that it matches the a element instead of the b element. This introduces two problems. First, how do we do this efficiently? The best option seems to be to match every element against a b, but remember a reference to the a node in the state that we propagate down. Then, once we match the b node, instead of marking that node as matching, we follow the pointer back to the a node and mark it as matching.

Second, we might mark the a node as matching multiple times in a row. That might not be a problem for the selector a:has(b) on its own, but if we have a combination of ancestor and descendant selectors like a:has(b) c, this can lead to us matching the same selector/element pair repeatedly. Instead, we need to work in phases: first match each element against a b; then mark all the a elements that correspond to matches; then match a:marked c. Now we'll need as many traversals of the HTML tree as we have :has selectors, which might still come to \(O(n r)\) if we define \(r\) the right way, but it's already tricky.
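As a sketch of phase one for the simplest case, a:has(b): here I use a single bottom-up pass instead of propagating a pointer down, which sidesteps the question of nested a elements by computing the marks on the way back up (all names hypothetical):

```python
class Node:
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

def mark_has(root, outer_tag, inner_tag):
    """Phase one of matching outer:has(inner): return every outer_tag
    element whose subtree contains an inner_tag descendant. One O(n)
    post-order pass."""
    marked = []
    def walk(node):
        # Does any strict descendant of this node match inner_tag?
        found_below = False
        for child in node.children:
            found_below |= walk(child)
        if node.tag == outer_tag and found_below:
            marked.append(node)
        # Report whether this subtree (including node itself) has inner_tag.
        return found_below or node.tag == inner_tag
    walk(root)
    return marked
```

Phase two then runs the ordinary top-down matcher, treating "marked a" as a single selector entry when matching something like a:has(b) c.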

The CSS selector :host-context is also kind of like the ancestor selector: the selector :host-context(a b) c basically means the selector a b c.

A unified algorithm

So for a variety of tricky selectors, there are techniques to do the matching in \(O(n r)\) time. However, it's not totally clear to me whether all these tricks can be combined into a single unified selector matching engine that handles all CSS selectors in \(O(n r)\) time.

Moreover, the "state machine" algorithm seems attractive beyond being the basis of optimal selector matching. Specifically, it's common to have lots of similar selectors, like .sidebar section h1 and .sidebar section p. In the "state machine" model, we can compile all the selectors into a state machine and in the process we might deduplicate some states between the various selectors. Plus, the state machine could possibly be JIT-compiled into efficient machine code, which would make it even faster.
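Here's a toy sketch of that deduplication, assuming tag-only descendant selectors: the selectors are compiled into a trie, so a shared prefix becomes a single shared state, and the traversal carries a set of active trie nodes (all names hypothetical):

```python
class Node:
    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []

END = object()  # trie key marking "a complete selector ends here"

def compile_trie(selectors):
    """Merge selectors like ("body", "div", "p") into one trie of
    shared prefixes; each trie dict is one state of the combined machine."""
    trie = {}
    for sel in selectors:
        state = trie
        for entry in sel:
            state = state.setdefault(entry, {})
        state[END] = sel
    return trie

def match_all(root, selectors):
    """One top-down pass for all selectors at once. The traversal state
    is the set of trie nodes reachable along the path from the root."""
    trie = compile_trie(selectors)
    results = []  # (element, selector) pairs
    def walk(node, states):
        new_states = list(states)
        seen = {id(s) for s in states}
        for state in states:
            nxt = state.get(node.tag)
            if nxt is None:
                continue
            if END in nxt:
                results.append((node, nxt[END]))
            if id(nxt) not in seen:  # keep the active set duplicate-free
                seen.add(id(nxt))
                new_states.append(nxt)
        for child in node.children:
            walk(child, new_states)
    walk(root, [trie])
    return results
```

With selectors like .sidebar section h1 and .sidebar section p, the shared .sidebar section prefix would be advanced once per element rather than once per selector.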

Next steps

One next step is for me to dig into existing selector matching engines. In Chromium, the code is probably in the StyleResolver class; in Mozilla it's in Stylo; in WebKit the CSSJIT is intriguing. In each case the main question is whether the existing code already guarantees \(O(n r)\) matching, or whether it just has a sequence of caches that makes common cases fast. Another next step is to try to implement a unified algorithm; I have a test implementation that handles descendant selectors, and it would be nice to extend it with ancestor selectors, :nth-child, and so on.