Injection is a Display-level Problem

A perceptive HN comment recently called code injection the "NULL terminated string of the 2010s". Indeed. Sites from Facebook to Gmail have at various times been vulnerable to XSS, and SQL injection is by now a well-known vulnerability. In this brave new world of dynamically generated everything, malicious content might be injected not only into HTML and SQL, but also into JavaScript, CSS, URLs, and more. Unfortunately, many of the wrong solutions to this problem are popular.

Popular Wrong Approaches

Distressingly popular but incorrect approaches include:

Let's prevent people from using < and > in their names, emails, or comments.

This is the wrong solution because 1) some people consider the less-than sign part of the correct spelling of their name, and won't take kindly to your claims otherwise; 2) valid emails may contain less-than and greater-than signs; 3) a comment might be trying to write mathematics with less-than and greater-than signs; 4) HTML requires escaping more characters anyway; and 5) it ignores the greater problem of context-dependence that I'll discuss below.

See also: The Daily WTF on SQL Injection

Let's escape user input the moment we get it from the user.

This is wrong because your database is thus polluted with ugly HTML-escaped strings, which you will inevitably display incorrectly, double-escape, or use in the wrong context. As above, it also fails to account for context.

See also: http://boourns.cjb.net/images.php?view=356

Let's escape user input the moment we get it out of the database!

So will all of your helper functions have one version to deal with escaped strings and one to deal with unescaped strings? And one to deal with JS-escaped vs. SQL-escaped vs. HTML-escaped strings? Oh, and if you're doing this, you still likely won't be clever enough to deal with nested contexts…
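
To make the double-escaping failure concrete, here is a minimal Python sketch using the standard library's html.escape. The sample name and the variable names are made up for illustration; the point is only what happens when a value escaped on the way into the database meets a template that (correctly) escapes on the way out.

```python
import html

# A perfectly reasonable display name a user might type in.
name = "Tom & Jerry <3"

# Wrong approach: escape at input time and store the result.
stored = html.escape(name)

# Later, a template escapes on output, as it should, unaware that
# the stored value was already escaped once.
rendered = html.escape(stored)

print(stored)    # Tom &amp; Jerry &lt;3
print(rendered)  # Tom &amp;amp; Jerry &amp;lt;3
```

The browser would show the second string as the literal text "Tom &amp; Jerry &lt;3", and the first string is equally wrong anywhere that isn't HTML, such as a plain-text email.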

The Problem: Contexts

The whole idea behind injection attacks is that something was data, but it's being interpreted, thus becoming code. You almost never want anything the user did to end up as code; the trick is that most ways of writing code have a way of escaping text so that it's interpreted as plain old data.

The problem is that most things you do involve many, many domain-specific languages which are interpreted in different ways with different escaping rules.

The other problem, one that far too few people think about, is that some text ends up being interpreted on multiple levels.

So any solution that's based on uniformly escaping all user data at a certain point will fail simply because you only know how to escape something when you're about to display it.

Injection is a display-level problem.

Here's an example — consider the following template in some stereotypical templating language.

<html>
  <head>
    <script>

      function vote(user) {
        $.post("/vote", {"user": user}, function () {
          alert("Thanks for voting");
        });
      }
    </script>
  </head>
  <body>
    <h1>$username's Profile</h1>
  </body>
</html>

This is a pretty normal-looking page, one with plenty of opportunities for injection attacks. But the solution cannot be as simple as HTML-escaping the username. Consider how the template insertions will be interpreted.

In the page title, the username will be interpreted as HTML text, and thus needs to be escaped by replacing <, >, and & with &lt;, &gt;, and &amp;.
In the <script> block, the username is within a JavaScript single-quoted string inside a <script> block. Thus, we need to escape the single-quote, newline, carriage return, and U+2028/U+2029 (as well as most control characters), due to the JavaScript context, and the text </script, due to the <script> block context.
In the comments URL, the username is inside a URL inside an HTML double-quoted attribute, so we need to URL-escape the username followed by escaping any ampersands and double quotes.
In the JavaScript URL, we are within a JavaScript single-quoted string context inside a URL context inside an HTML attribute.

Note that </script is safe in the HTML attribute context, in the JavaScript string context, and in the URL context, but not in the HTML context, or the <script> tag context. Double quotes are safe in most of the contexts, but not in HTML attributes. Percent signs and slashes are safe most places, but not in URLs. And so on.
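
Here is a small Python sketch of that mismatch. The html_text_esc helper and the "var user = '...'" script line are hypothetical, invented for this illustration; the helper does exactly the HTML-text escaping described above and nothing more.

```python
def html_text_esc(s: str) -> str:
    """Escape for the HTML text context only: &, <, and >."""
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

# A username chosen to break out of a JavaScript single-quoted string.
username = "'; alert(1); //"

# HTML-text escaping leaves it untouched: it contains no &, <, or >.
escaped = html_text_esc(username)

# Dropping the "escaped" value into a JS string inside a <script> block
# (a hypothetical line, not taken from the template above).
script_line = "var user = '" + escaped + "';"

print(script_line)  # var user = ''; alert(1); //';
```

The value was escaped, just for the wrong context, so the quote closes the string and the rest runs as code.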

Warning

Actually, there are probably more characters you want to escape, all thanks to how screwed up browsers are; consult your neighborhood XSS expert for more. For example, I'd suggest escaping the equals, plus, and minus signs in HTML; but it all depends on your specific application.

The point is, any modern web application has the same data presented in a variety of often nested contexts, each of which has its own idiosyncratic escaping rules.

This is why you can't deal with dangerous input through validation rules, or by escaping before you put it into the database – because at that point, you don't yet know how that data will be used and what contexts it'll be placed in. Even if you do know, it'd be shortsighted to claim that the code will never be extended or modified.

Injection is a display-level problem. Solve it at the display level.

How to Do it Right

Escaping, as it happens, is a display-level solution. Now, if you're doing your displaying by concatenating strings, the best you can do probably involves writing functions html_esc, js_esc, attr_esc, url_esc, and so on. Don't make the names so long that you'll dislike writing them, and name them consistently enough so that you're always sure whether or not you've escaped text. Then only escape text just before concatenating it into your final content; that way, you can always be sure that everything's escaped.
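
As a rough sketch of what those four functions might look like in Python, assuming the escaping rules listed earlier in this post: the bodies below are my own illustration, not a vetted library, and a real application would likely need to escape more characters (see the warning above). Escaping every "<" in js_esc is one assumed way to keep the text </script from ever appearing.

```python
from urllib.parse import quote

def html_esc(s: str) -> str:
    """HTML text context: escape &, <, and >."""
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

def attr_esc(s: str) -> str:
    """HTML double-quoted attribute context: escape & and double quotes."""
    return s.replace("&", "&amp;").replace('"', "&quot;")

def url_esc(s: str) -> str:
    """URL context: percent-encode everything outside the unreserved set."""
    return quote(s, safe="")

def js_esc(s: str) -> str:
    """JavaScript single-quoted string inside a <script> block."""
    out = []
    for ch in s:
        if ch == "\\":
            out.append("\\\\")
        elif ch == "'":
            out.append("\\'")
        elif ch == "<":
            out.append("\\x3C")  # so "</script" can never appear
        elif ord(ch) < 0x20 or ch in "\u2028\u2029":
            out.append("\\u%04x" % ord(ch))  # newlines, control chars, U+2028/9
        else:
            out.append(ch)
    return "".join(out)

# Escape only at the moment of concatenation, innermost context first.
username = "O'Neil"
comments_href = "/u/" + attr_esc(url_esc(username)) + "/comments"
vote_href = "javascript:vote('" + attr_esc(url_esc(js_esc(username))) + "')"
heading = "<h1>" + html_esc(username) + "'s Profile</h1>"
```

Note the nesting order in vote_href: the username is escaped for the JavaScript string first, then for the URL, then for the attribute, which mirrors how the browser peels the layers off in reverse.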

If you have a language with a type system or with runtime typing, you can do better by tagging strings once they're properly escaped and not displaying untagged strings. Just make sure your tags work properly with nested contexts!
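
One way this tagging could look in Python, as a minimal sketch: the Escaped class, literal_html, and emit_html are hypothetical names of my own, and the single "html" tag would need to grow into something richer (say, a stack of contexts) to handle nesting.

```python
import html
from dataclasses import dataclass

@dataclass(frozen=True)
class Escaped:
    """A string tagged with the context it has been made safe for."""
    text: str
    context: str

def html_esc(s: str) -> Escaped:
    """Escape untrusted text for the HTML text context and tag it."""
    return Escaped(html.escape(s, quote=False), "html")

def literal_html(s: str) -> Escaped:
    """Tag trusted markup written by the programmer, never by a user."""
    return Escaped(s, "html")

def emit_html(*parts) -> str:
    """Concatenate parts, refusing anything not tagged for HTML."""
    out = []
    for part in parts:
        if not (isinstance(part, Escaped) and part.context == "html"):
            raise TypeError("refusing to display untagged value: %r" % (part,))
        out.append(part.text)
    return "".join(out)

username = "<script>evil()</script>"
page = emit_html(literal_html("<h1>"), html_esc(username), literal_html("</h1>"))
print(page)  # <h1>&lt;script&gt;evil()&lt;/script&gt;</h1>
```

Passing the raw username straight to emit_html raises a TypeError instead of silently producing an injectable page, which is the whole point of the tag.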

If you are using a template library, you can enforce the tagging at the template level, and you also probably get a nicer syntax for escaping, so that maybe you can write

<a href="/u/${username|esc=url,attr}/comments">
  ${username|esc=html}'s comments
</a>
<a href="javascript:vote('${username|esc=js,url,attr}')">
  Vote for ${username|esc=html}
</a>

For example, Cheetah (a Python templating engine) allows you to specify your own "filter" on all interpolated values.
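
The esc= syntax above is made up for this post and is not Cheetah's actual filter syntax. Still, a toy renderer for it is short enough to sketch in Python. Everything here (ESCAPERS, PLACEHOLDER, render) is hypothetical, and only the html, url, and attr filters are included to keep it brief.

```python
import html
import re
from urllib.parse import quote

# Hypothetical registry mapping filter names to escaping functions.
ESCAPERS = {
    "html": lambda s: html.escape(s, quote=False),
    "url": lambda s: quote(s, safe=""),
    "attr": lambda s: s.replace("&", "&amp;").replace('"', "&quot;"),
}

# Matches ${name|esc=filter1,filter2,...}
PLACEHOLDER = re.compile(r"\$\{(\w+)\|esc=([\w,]+)\}")

def render(template: str, values: dict) -> str:
    """Substitute each placeholder, applying its filters left to right."""
    def substitute(match):
        text = values[match.group(1)]
        for name in match.group(2).split(","):
            text = ESCAPERS[name](text)  # unknown filter names raise KeyError
        return text
    return PLACEHOLDER.sub(substitute, template)

template = (
    '<a href="/u/${username|esc=url,attr}/comments">'
    "${username|esc=html}'s comments</a>"
)
result = render(template, {"username": "Tom & <Jerry>"})
print(result)
# <a href="/u/Tom%20%26%20%3CJerry%3E/comments">Tom &amp; &lt;Jerry&gt;'s comments</a>
```

Because the filters run left to right, listing them innermost context first (url before attr) gives the nesting the browser expects.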

Finally, if you have a very smart template system, you might even be able to infer what contexts you're in automatically. But this is a pretty hard problem, requiring your template system to know, for example, that the href attribute is a URL and that <script> tags delimit JavaScript in a <script> context. I can't actually think of an example of a template engine this smart, but I'm sure they exist.

In a Few Words

If you take away only two things from this, then take away these: injection is a display-level problem, and it should be solved in a display-level way, with context-specific and nested-context-aware escaping functions.

By Pavel Panchekha

22 August 2011