Protecting Your Users Against Cross-site Scripting

Cross-site scripting (XSS) is one of the most common methods hackers use to attack websites. XSS vulnerabilities permit a malicious user to execute arbitrary chunks of JavaScript when other users visit your site.

XSS is the most common publicly reported security vulnerability, and part of every hacker’s toolkit.

Risks

Prevalence: Common
Exploitability: Easy
Impact: Harmful

What could a determined hacker do when exploiting an XSS vulnerability?

XSS allows arbitrary execution of JavaScript code, so the damage that can be done by an attacker depends on the sensitivity of the data being handled by your site. Some of the things hackers have done by exploiting XSS:

  • Spreading worms on social media sites. Facebook, Twitter and YouTube have all been successfully attacked in this way.
  • Session hijacking. Malicious JavaScript may be able to send the session ID to a remote site under the hacker’s control, allowing the hacker to impersonate that user by hijacking a session in progress.
  • Identity theft. If the user enters confidential information such as credit card numbers into a compromised website, these details can be stolen using malicious JavaScript.
  • Denial of service attacks and website vandalism.
  • Theft of sensitive data, like passwords.
  • Financial fraud on banking sites.

Protection

To protect against stored XSS attacks, make sure any dynamic content coming from the data store cannot be used to inject JavaScript on a page.

Escape Dynamic Content

Web pages are made up of HTML, usually described in template files, with dynamic content woven in when the page is rendered. Stored XSS attacks make use of the improper treatment of dynamic content coming from a backend data store. The attacker abuses an editable field by inserting some JavaScript code, which is evaluated in the browser when another user visits that page.

Unless your site is a content-management system, it is rare that you want your users to author raw HTML. Instead, you should escape all dynamic content coming from a data store, so the browser knows it is to be treated as the contents of HTML tags, as opposed to raw HTML.

Escaping dynamic content generally consists of replacing significant characters with the HTML entity encoding:

< &#60;
> &#62;
& &#38;
" &#34;
' &#39;

Most modern frameworks will escape dynamic content by default – see the code samples below for details.

Escaping editable content in this way means it will never be treated as executable code by the browser. This closes the door on most XSS attacks.
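As a minimal sketch of the escaping step described above, the following uses html.escape from the Python standard library. The render_comment function and the "comment" class name are hypothetical, introduced only for illustration. Note that html.escape emits named entities such as &lt; for most characters, which browsers treat the same way as the numeric forms listed above.

```python
from html import escape


def render_comment(comment: str) -> str:
    """Wrap an untrusted comment in a paragraph, escaping it first.

    render_comment is a hypothetical helper for illustration only.
    quote=True also escapes double and single quotes, which matters
    if the value is ever placed inside an HTML attribute.
    """
    return '<p class="comment">' + escape(comment, quote=True) + "</p>"


if __name__ == "__main__":
    # The injected script tag comes out as inert text, not as markup.
    print(render_comment('<script>alert("hi")</script>'))
```

In practice a template engine normally does this for you; the sketch only shows what that automatic escaping amounts to.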

Allowlist Values

If a particular dynamic data item can only take a handful of valid values, the best practice is to restrict the values in the data store, and have your rendering logic only permit known good values. For instance, instead of asking a user to type in their country of residence, have them select from a drop-down list.
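One way to apply this on the server is sketched below, assuming countries are stored as short codes. The ALLOWED_COUNTRIES mapping and the country_label function are hypothetical names chosen for this example. The page only ever renders a label taken from the allowlist, never the submitted string itself.

```python
# Hypothetical allowlist: the only values the application will accept.
ALLOWED_COUNTRIES = {
    "CA": "Canada",
    "MX": "Mexico",
    "US": "United States",
}


def country_label(code: str) -> str:
    """Return the display label for a known country code.

    Anything outside the allowlist is rejected, so attacker-supplied
    text never reaches the data store or the rendered page.
    """
    try:
        return ALLOWED_COUNTRIES[code]
    except KeyError:
        raise ValueError("unsupported country code") from None


if __name__ == "__main__":
    print(country_label("CA"))
```

The same check belongs on both the write path (before storing) and the read path (before rendering), since a drop-down list alone does not stop a hand-crafted request.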

Implement a Content-Security Policy

Browsers support Content-Security Policies that allow the author of a web-page to control where JavaScript (and other resources) can be loaded and executed from. XSS attacks rely on the attacker being able to run malicious scripts on a user’s web page - either by injecting inline <script> tags somewhere within the <html> tag of a page, or by tricking the browser into loading the JavaScript from a malicious third-party domain.

By setting a content security policy in the response header, you can tell the browser to never execute inline JavaScript, and to lock down which domains can host JavaScript for a page:

Content-Security-Policy: script-src 'self' https://apis.google.com

By listing the URIs from which scripts can be loaded, you are implicitly stating that inline JavaScript is not allowed.
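How the header gets attached depends on your server stack. As one possible sketch, the following wraps any Python WSGI application so that every response carries the policy shown above. The with_csp and hello_app names are assumptions made for this example, not part of any framework.

```python
# Policy string taken from the example header above.
DEFAULT_POLICY = "script-src 'self' https://apis.google.com"


def with_csp(app, policy=DEFAULT_POLICY):
    """Wrap a WSGI app so each response includes a CSP header.

    with_csp is a hypothetical helper; many frameworks and web
    servers offer their own way to set response headers.
    """
    def wrapped(environ, start_response):
        def csp_start_response(status, headers, exc_info=None):
            # Drop any existing policy header, then add ours.
            headers = [
                (name, value) for name, value in headers
                if name.lower() != "content-security-policy"
            ]
            headers.append(("Content-Security-Policy", policy))
            return start_response(status, headers, exc_info)
        return app(environ, csp_start_response)
    return wrapped


def hello_app(environ, start_response):
    """Tiny demo application used only to exercise the wrapper."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>Hello</p>"]


application = with_csp(hello_app)
```

Setting the header in one central place like this makes it harder for an individual page to be served without the policy.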

The content security policy can also be set in a <meta> tag in the <head> element of the page:

<meta http-equiv="Content-Security-Policy" 
      content="script-src 'self' https://apis.google.com">

This approach will protect your users very effectively! However, it may take a considerable amount of discipline to make your site ready for such a header. Inline script tags are considered bad practice in modern web development - mixing content and code makes web applications difficult to maintain - but they are common in older, legacy sites.

To migrate away from inline scripts incrementally, consider making use of CSP violation reports. If you send the policy in a Content-Security-Policy-Report-Only header and include a report-uri directive, the browser will notify you of any policy violations rather than preventing inline JavaScript from executing:

Content-Security-Policy-Report-Only: script-src 'self'; report-uri http://example.com/csr-reports

This will give you reassurance that there are no lingering inline scripts, before you ban them outright.
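Browsers deliver these reports as an HTTP POST whose JSON body has a top-level "csp-report" object. The sketch below shows one way the receiving endpoint might pull out the fields most useful for hunting down inline scripts. The summarize_csp_report function and its output keys are hypothetical, and how the request body reaches it depends on your web framework.

```python
import json


def summarize_csp_report(body: bytes) -> dict:
    """Extract the key fields from a CSP violation report body.

    summarize_csp_report is a hypothetical helper. Missing fields
    come back as None rather than raising, since report contents
    vary between browsers.
    """
    payload = json.loads(body.decode("utf-8"))
    report = payload.get("csp-report", {})
    return {
        "page": report.get("document-uri"),
        "blocked": report.get("blocked-uri"),
        "directive": report.get("violated-directive"),
    }


if __name__ == "__main__":
    sample = json.dumps({
        "csp-report": {
            "document-uri": "http://example.com/signup",
            "blocked-uri": "inline",
            "violated-directive": "script-src 'self'",
        }
    }).encode("utf-8")
    print(summarize_csp_report(sample))
```

Logging the page and the violated directive for each report gives you a list of templates that still contain inline scripts.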

Sanitize HTML

Some sites have a legitimate need to store and render raw HTML. If your site stores and renders rich content, you need to use an HTML sanitization library to ensure malicious users cannot inject scripts in their HTML submissions.
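To illustrate the idea behind such libraries, here is a deliberately small allowlist sanitizer built on html.parser from the Python standard library. It is a teaching sketch only: the AllowlistSanitizer class, the sanitize function and the choice of allowed tags are all assumptions made for this example, and a production site should rely on a maintained sanitization library instead. The output is rebuilt from scratch, keeping only known-safe tags, dropping every attribute except http(s) links, and escaping all text.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist of harmless formatting tags.
ALLOWED_TAGS = {"a", "b", "em", "i", "li", "ol", "p", "strong", "ul"}


def _is_safe_link(href: str) -> bool:
    """Accept only absolute http(s) URLs; reject javascript: and others."""
    try:
        return urlparse(href).scheme in ("http", "https")
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    """Rebuilds HTML from parsed events, keeping only allowlisted tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # unknown tags (script, img, iframe, ...) are dropped
        href = dict(attrs).get("href") or ""
        if tag == "a" and _is_safe_link(href):
            self.parts.append('<a href="%s">' % escape(href, quote=True))
        else:
            # All other attributes, including event handlers, are discarded.
            self.parts.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        # Text, including the body of a dropped script tag, is escaped.
        self.parts.append(escape(data))


def sanitize(untrusted_html: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(untrusted_html)
    parser.close()
    return "".join(parser.parts)
```

Rebuilding the markup from an allowlist, rather than trying to strip out known-bad patterns, is the design choice that matters here: anything the sanitizer does not recognize is left out by default.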

Code Samples

Preventing XSS vulnerabilities requires using the right code libraries, and performing thorough code reviews. Below are some examples of what to look out for when checking your code.

Django

Templates in Django escape HTML by default, so anything that looks like the following is generally safe:


{{ contents }}

You can override escaping by using the | safe filter. There are often good reasons to do this, but you will need to conduct code reviews on anything that uses this filter:


{{ contents | safe }}

Note that HTML-escaping can also be turned on or off with the {% autoescape %} tag.

Flask

Flask templates escape HTML by default, so code that looks like the following is generally safe:


<ul id="navigation">
  {% for item in navigation %}
    <li><a href="{{ item.href }}">{{ item.caption }}</a></li>
  {% endfor %}
</ul>
    

However, escaping can be turned off by using the safe filter:


<ul id="navigation">
  {% for item in navigation %}
    <li><a href="{{ item.href }}">{{ item.caption | safe }}</a></li>
  {% endfor %}
</ul>

Or by enclosing everything in an autoescape false block:

{% autoescape false %}
  <ul id="navigation">
    {% for item in navigation %}
      <li><a href="{{ item.href }}">{{ item.caption }}</a></li>
    {% endfor %}
  </ul>
{% endautoescape %}

Be sure to conduct code reviews on any templates that use these features!
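A simple script can help reviewers find those spots. The sketch below flags lines in a Django or Jinja-style template that use the safe filter or switch auto-escaping off. The find_risky_lines function and the patterns are hypothetical and intentionally rough; they will miss unusual formatting and are meant as a starting point for review, not a substitute for it.

```python
import re

# Hypothetical patterns for constructs that bypass auto-escaping.
RISKY_PATTERNS = {
    "safe filter": re.compile(r"\{\{[^}]*\|\s*safe\b[^}]*\}\}"),
    "autoescape off": re.compile(r"\{%-?\s*autoescape\s+(false|off)\s*-?%\}"),
}


def find_risky_lines(template_text: str) -> list:
    """Return (line_number, label) pairs for lines needing review."""
    findings = []
    for number, line in enumerate(template_text.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


if __name__ == "__main__":
    template = (
        "<li>{{ item.caption | safe }}</li>\n"
        "{% autoescape false %}\n"
        "<p>{{ item.caption }}</p>\n"
    )
    for number, label in find_risky_lines(template):
        print("line %d: %s" % (number, label))
```

Running a check like this over the template directory before each release keeps the list of escaping exceptions short and visible.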

Rails

Rails templates escape HTML by default, so anything that looks like the following is generally safe:


<%= contents %>

You can override escaping by using the raw function or the <%== operator. There are often good reasons to do this, but you will need to conduct code reviews on anything that uses them:


<%= raw contents %>

<%== contents %>

Java Server Pages

Use the c:out tag to safely escape HTML:


<c:out value="${contents}" />

The following ways of writing to a template do not escape HTML, so you should use them with care:


<%= contents %>

${contents}

<%
   out.println(contents);
%>

Consider using a third-party library if you need to escape HTML manually.

ASP.NET

Use either of the following functions to safely escape HTML (the <%: form was introduced in ASP.NET 4.0):


<%= HttpUtility.HtmlEncode(contents) %>

<%: contents %>

The following way of writing to a template does not escape HTML automatically, so you should use it with care:


<%= contents %>

Use HttpUtility.HtmlEncode(...) if you need to escape HTML manually.

Mustache.js

Tags in double mustaches automatically escape HTML:


{{ contents }}

Tags in triple mustaches do not escape HTML, and should be used with care:


{{{ contents }}}

Dust.js

Key tags automatically escape HTML:


{ contents }

However, escaping can be disabled with the |s operator, so use this with care:


{ contents | s }

Nunjucks

If auto-escaping is turned on in the environment, Nunjucks will automatically escape tags for safe output:


{{ contents }}

Content marked with the safe filter will not be escaped – use this function with care:


{{ contents | safe }}

Auto-escaping can be disabled for a template, in which case tags need to be escaped manually:


{{ contents | escape }}

PHP

The echo command does not escape HTML by default, which means that any code like the following, which pulls data directly out of the HTTP request, is vulnerable to XSS attacks:


<?php
  echo $_POST["comment"];
?>

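Be sure to use the htmlspecialchars function to safely escape parameters. The strip_tags function removes tags but does not escape quotes, so on its own it is not enough when a value ends up inside an HTML attribute:


<?php
  // ENT_QUOTES escapes both double and single quotes.
  echo htmlspecialchars($_POST["comment"], ENT_QUOTES, 'UTF-8');
?>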

Angular

In Angular, any dynamic content written out in double curly brackets is escaped automatically, so the following is safe:

  <div>{{dynamicContent}}</div>

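Be wary of any code that binds dynamic content to the innerHTML property, since that content is interpreted as HTML rather than escaped. Angular sanitizes such values by default, but that protection is lost if the value has been marked as trusted (for example with DomSanitizer.bypassSecurityTrustHtml):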

  <div [innerHTML]="dynamicContent"></div>
  <div innerHTML="{{dynamicContent}}"></div>

React

In React, any dynamic content written out in curly brackets is escaped automatically, so the following is safe:

render() {
  return <div>{dynamicContent}</div>
}

React allows you to write out raw HTML by binding content to the dangerouslySetInnerHTML property, which is named to remind you of the security risk! Watch out for any code that looks like the following:

render() {
  return <div dangerouslySetInnerHTML={{ __html: dynamicContent }} />
}

Other Considerations

HTTP-only Cookies

A session-hijacking attack can use malicious JavaScript to steal the cookie containing the user’s session ID. There is rarely a good reason to read or manipulate cookies in client-side JavaScript, so consider marking cookies as HTTP-only, meaning that cookies will be received, stored, and sent by the browser, but cannot be modified or read by JavaScript.
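Most frameworks expose a setting for this. As a framework-neutral sketch, the following builds a Set-Cookie header value with the HttpOnly attribute using http.cookies from the Python standard library. The session_cookie_header function and the "session_id" cookie name are assumptions made for this example.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build the value of a Set-Cookie header for a session cookie.

    session_cookie_header is a hypothetical helper. HttpOnly hides
    the cookie from JavaScript; Secure restricts it to HTTPS.
    """
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    cookie["session_id"]["secure"] = True
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


if __name__ == "__main__":
    print("Set-Cookie: " + session_cookie_header("abc123"))
```

With the HttpOnly attribute set, document.cookie in the browser no longer exposes the session ID, so injected script has nothing to send to the attacker.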
