Go Safe HTML

2020-06-30 - Rob

Disclaimer: this is not an official Google post or communication; it is just me commenting on something that is now available.

The Google infosec team just released the Go “safehtml” package. If you are interested in making your application resilient to server-side XSS you might want to adopt it instead of “html/template”. Migration to safehtml should be quite straightforward as it is just a hardened fork of the original html/template standard package. If you don’t have major flaws in your app it should not be too complicated to convert it to use the safe version.

This is what Google uses internally to protect products from XSS.

If you just want to use it without reading through the explanation you can jump to the checklist.

Issues with “html/template”

“html/template” has no concept of “tainting” and does not keep track of how dangerous types like template.HTML are constructed. There is just a line in the documentation stating:

Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.

Not only does this fail to explain why and how to use the type safely, but the type is also very commonly used. That makes it a very dangerous pitfall, together with all the other types that carry the same line in their documentation (seven in total).
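To make the pitfall concrete, here is a minimal sketch that uses only the standard library. The helper names (render, renderString, renderTrusted) are made up for this illustration. The same input is escaped when passed as a plain string, but is emitted verbatim once it has been converted to template.HTML, and nothing records where the value came from.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// page is a tiny template with a single placeholder in HTML text context.
var page = template.Must(template.New("page").Parse("<p>{{.}}</p>"))

// render executes the template with the given value and returns the output.
func render(data interface{}) string {
	var buf bytes.Buffer
	if err := page.Execute(&buf, data); err != nil {
		return "error: " + err.Error()
	}
	return buf.String()
}

// renderString passes the input as a plain string, so it gets escaped.
func renderString(s string) string { return render(s) }

// renderTrusted converts the input to template.HTML first. The conversion
// compiles for any string, trusted or not, and switches escaping off.
func renderTrusted(s string) string { return render(template.HTML(s)) }

func main() {
	userInput := "<script>alert(1)</script>" // imagine this came from a request

	fmt.Println(renderString(userInput))  // escaped by the template
	fmt.Println(renderTrusted(userInput)) // included verbatim: this is the XSS
}
```

The compiler cannot tell the two calls apart in terms of safety, which is exactly the gap that safe-by-construction types are meant to close.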

Moreover, “html/template” has some standing issues that cannot be properly fixed without breaking backward compatibility. The tradeoff between breakage potential and security benefit there is not clear, so if you want to opt in to stronger security, you should probably abandon “html/template”.

Note that I am maintaining “html/template” (I’m empijei here), so I am telling you this with a bit of background of being bitten by that package. If I could magically migrate all users to the safe version I would.

The structure

Safehtml is composed of several packages. Depending on your build system or your company toolchain there are some constraints that you should enforce. Your company’s security team or security-aware folks should be setting this up for everyone.

safehtml

This is the root package and it is just providing types that are safe by construction. Here is how it works in a nutshell:

This guarantees that every single instance of the HTML type is known to be safe. The Script type behaves similarly but instead of having templates it can only be built from constants or data. To express the concept of a “compile time constant” it has constructors that take unexported string types, so the only way to call them is with a string literal (I find this to be a very neat trick).

All other types in this package follow a similar pattern.
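The “compile time constant” trick mentioned above can be sketched with the standard library alone. The type and function names below are local stand-ins written for this example, modelled on the pattern rather than copied from safehtml. Because the parameter type is unexported, callers in other packages cannot name it, so they cannot convert a runtime string to it; an untyped string constant, however, converts implicitly.

```go
package main

import "fmt"

// stringConstant is unexported. Code outside this package cannot write
// stringConstant(someVariable), so it can only pass untyped constants.
type stringConstant string

// Script is a stand-in for a safe-by-construction type. Its field is
// unexported, so only constructors in this package can create values.
type Script struct {
	s string
}

// ScriptFromConstant accepts only values of the unexported type, which in
// practice means string literals and untyped string constants.
func ScriptFromConstant(src stringConstant) Script {
	return Script{s: string(src)}
}

// String returns the wrapped script source.
func (s Script) String() string { return s.s }

func main() {
	// Compiles: the literal is an untyped constant and converts implicitly.
	ok := ScriptFromConstant("console.log('hello')")
	fmt.Println(ok)

	// Does not compile, even in this package, without an explicit conversion:
	//
	//	var fromUser string = readRequest()
	//	ScriptFromConstant(fromUser) // cannot use fromUser (type string) as stringConstant
	//
	// Outside this package the explicit conversion cannot be written at all,
	// because the type name is not visible.
}
```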

safehtml/template

This is your real “html/template” replacement and it is the package everyone should be using. If “legacyconversions” and “uncheckedconversions” (read below) are not used and all your HTML responses are generated by this package you have the guarantee there is not going to be any server-side XSS in your products.

We are working on tools that ensure this last condition is true, but it will take a bit of time. Stay tuned for updates.

safehtml/legacyconversions

This package should only be used to transition to the safe API. It blesses any arbitrary string to be a safe type so that the transition to safehtml can be very quick and all new code will be safe. Once the migration has happened, the use of this package should be prevented. As the name states, this is just for legacy code: no new code should be using it, and all usages of this package should gradually be refactored to use safe constructors instead.

safehtml/testconversions

This package should only be used in test targets and only when necessary. You should set up linters that enforce this.

safehtml/uncheckedconversions

This is the most nuanced matter. Sometimes the safehtml API is too inconvenient or even impossible to use. Sometimes you have to use unsafe patterns because you want to do something that cannot be proven to be safe (e.g. take some HTML that you trust from a database and serve it to the client).

For these very rare situations you can use this package. Importing it should be restricted to a hand-picked set of dependants and every new import should require some security-aware folks to review it.

Make sure that the usage is safe and will stay safe, as uncheckedconversions do not increase safety. They are just there to inform the compiler that you have reviewed the code and want it to be trusted. Follow these guidelines:

  • use only if strictly necessary (e.g. if using safehtml/template requires a bit more work but does the job, do the extra work)
  • document why the usage is safe for future reviewers and maintainers
  • narrow down the context by reducing the dependency of the unchecked conversion on enclosing function arguments, struct fields that can be arbitrarily modified, and so on

Usages of this package are your single point of failure, so make sure you follow these guidelines. (This sentence assumes you will eventually get rid of legacy conversions.)

One example of a correct use of this package would be for the output of a sanitizer. If you need to have some HTML that your users provide to be embedded in a response (e.g. because you render markdown or you have a webmail) you will sanitize that HTML. Once it is sanitized (if your sanitizer is correctly implemented) it should be okay to use an unchecked conversion to promote it to the HTML type.
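Here is a rough sketch of what a narrowly scoped, documented conversion could look like for the sanitizer case. It uses only the standard library: the HTML type and uncheckedHTML are local stand-ins for the safe type and for an unchecked conversion, and sanitizeToHTML is a toy sanitizer invented for this example (it allows only the b element). A real sanitizer is far more involved.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// HTML is a local stand-in for a safe-by-construction HTML type.
type HTML struct{ s string }

// String returns the wrapped markup.
func (h HTML) String() string { return h.s }

// uncheckedHTML is a stand-in for an unchecked conversion: it trusts its
// argument blindly, so every call site needs its own justification.
func uncheckedHTML(s string) HTML { return HTML{s: s} }

// allowBold restores <b> and </b> after the whole input has been escaped.
var allowBold = strings.NewReplacer("&lt;b&gt;", "<b>", "&lt;/b&gt;", "</b>")

// sanitizeToHTML is a toy sanitizer, NOT a production one. It escapes the
// entire input and then re-allows only the <b> element.
func sanitizeToHTML(untrusted string) HTML {
	clean := allowBold.Replace(html.EscapeString(untrusted))

	// Why this conversion is considered safe: clean is built entirely inside
	// this function, every "<" from the input was escaped first, and the only
	// markup restored afterwards is the fixed strings "<b>" and "</b>".
	// The conversion does not depend on struct fields or on values that a
	// caller could modify after sanitization.
	return uncheckedHTML(clean)
}

func main() {
	fmt.Println(sanitizeToHTML("<b>hi</b><script>alert(1)</script>"))
}
```

The design choice worth noting is that the conversion lives inside the sanitizer, right next to the value it blesses, instead of being called by whoever happens to receive the sanitizer's string output.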

safehtml/raw

Importing this package should be prevented. Anything outside of the “safehtml/” directory tree should not have visibility of this package.

safehtml/safehtmlutil

Yes, I know, not a good name. Consider that this package, like the previous one, should also not be imported outside of safehtml and it was just created to reduce code duplication and avoid cyclic dependencies. I agree this could have been named or structured differently, but since you’re never going to interact with this package it should not bother you too much.

How to do the refactor

Printf and nested templates

One example of code that you might have is

// Unsafe: %q quotes myLink in Go syntax, it does not HTML-escape it.
theLink := template.HTML(fmt.Sprintf("<a href=%q>the link</a>", myLink))
myTemplate.Execute(httpResponseWriter, theLink)

To refactor this you have multiple options: you either build the string with another template (note that here “template” is “safehtml/template”)

myLinkTpl := template.Must(template.New("myUrl").Parse("<a href={{.}}>the link</a>"))
theLink, err := myLinkTpl.ExecuteToHTML(myLink)
// handle err
myTemplate.Execute(httpResponseWriter, theLink)

or, for more complex cases, you can use nested templates:

const outer = `<h1> This is a title </h1> {{ template "inner" .URL }}`
const inner = `<a href="{{.}}">the link</a>`
t := template.Must(template.New("outer").Parse(outer))
t = template.Must(t.New("inner").Parse(inner))
t.ExecuteTemplate(os.Stdout, "outer", map[string]string{"URL": myLink})

Constants

If you have an HTML const in your code, you can just use it as a template and execute it to HTML. This checks, among other things, that all tags are balanced, and returns an instance of the HTML type.

So this

var myHtml template.HTML = `<h1> This is a title </h1>`

becomes this

myHtml := template.MustParseAndExecuteToHTML(`<h1> This is a title </h1>`)

The checklist

  1. Block access to some packages:
     • Prevent packages outside of the “safehtml” directory from importing “raw”, “uncheckedconversions” and “safehtmlutil”.
     • Only allow test builds to import the “testconversions” package.
  2. Migrate away from “html/template” and replace it with “safehtml/template”:
     • For every breakage or every issue, use a “legacyconversions” call. Some manual refactoring might be needed, but migration should be fairly straightforward.
     • RUN ALL YOUR INTEGRATION AND E2E TESTS. This is important, so important I used SHIFT and not CAPS to type it.
     • Lock down the list of legacy conversions: from this moment on, new imports of the “legacyconversions” package are forbidden.
     • Ban use of “html/template”, so that all new code is safe.
  3. Refactor legacy conversions to use safe patterns:
     • Wherever possible, construct HTML in a safe way and remove the legacy conversions.
     • Where it is not possible, use unchecked conversions. Every new import of the “uncheckedconversions” package should be reviewed.
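How you enforce the import restrictions depends on your build system. As one possible starting point, here is a small sketch of an import check built on the standard go/parser package. Everything specific in it is an assumption made for illustration: the safehtmlRoot import path, the violations function, and the idea of a per-file allowlist for reviewed unchecked conversions. It also treats every “legacyconversions” import as a violation, which only matches the state after the migration; during the transition you would want an allowlist for the existing usages as well.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// safehtmlRoot is an assumed import path; adjust it to wherever safehtml
// lives in your tree.
const safehtmlRoot = "github.com/google/safehtml"

// violations parses only the import section of src and returns the import
// paths that break the policy. uncheckedAllowed is a hypothetical allowlist
// of files whose unchecked conversions have been reviewed.
func violations(filename, src string, uncheckedAllowed map[string]bool) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	isTest := strings.HasSuffix(filename, "_test.go")

	var bad []string
	for _, imp := range f.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		switch path {
		case "html/template", safehtmlRoot + "/legacyconversions":
			bad = append(bad, path) // not allowed in new code
		case safehtmlRoot + "/testconversions":
			if !isTest {
				bad = append(bad, path) // test-only package
			}
		case safehtmlRoot + "/uncheckedconversions":
			if !uncheckedAllowed[filename] {
				bad = append(bad, path) // needs a security review first
			}
		}
	}
	return bad, nil
}

func main() {
	src := "package p\n\nimport (\n\t\"fmt\"\n\t\"html/template\"\n)\n"
	bad, err := violations("handler.go", src, nil)
	fmt.Println(bad, err)
}
```

A check like this could run in CI or as a pre-submit hook; build systems with native visibility rules can express the same policy without custom code.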

Conclusions

If you want to make sure you don’t have server-side XSS in your Go code, this is probably the best way to do so. If you have any questions or need more refactoring examples, please let me know. You can contact me on Twitter (direct messages are open) or via email.