Cross-site scripting (XSS) is one of the most common methods hackers use to attack websites. XSS vulnerabilities permit a malicious user to execute arbitrary chunks of JavaScript when other users visit your site.
XSS is the most common publicly reported security vulnerability, and part of every hacker’s toolkit.
Risks
What could a determined hacker do when exploiting an XSS vulnerability?
XSS allows arbitrary execution of JavaScript code, so the damage that can be done by an attacker depends on the sensitivity of the data being handled by your site. Some of the things hackers have done by exploiting XSS:
- Spreading worms on social media sites. Facebook, Twitter and YouTube have all been successfully attacked in this way.
- Session hijacking. Malicious JavaScript may be able to send the session ID to a remote site under the hacker’s control, allowing the hacker to impersonate that user by hijacking a session in progress.
- Identity theft. If the user enters confidential information such as credit card numbers into a compromised website, these details can be stolen using malicious JavaScript.
- Denial of service attacks and website vandalism.
- Theft of sensitive data, like passwords.
- Financial fraud on banking sites.
Protection
To protect against stored XSS attacks, make sure any dynamic content coming from the data store cannot be used to inject JavaScript on a page.
Escape Dynamic Content
Web pages are made up of HTML, usually described in template files, with dynamic content woven in when the page is rendered. Stored XSS attacks make use of the improper treatment of dynamic content coming from a backend data store. The attacker abuses an editable field by inserting some JavaScript code, which is evaluated in the browser when another user visits that page.
Unless your site is a content-management system, it is rare that you want your users to author raw HTML. Instead, you should escape all dynamic content coming from a data store, so the browser knows it is to be treated as the contents of HTML tags, as opposed to raw HTML.
Escaping dynamic content generally consists of replacing significant characters with the HTML entity encoding:
Character | Entity encoding |
---|---|
< | &lt; |
> | &gt; |
& | &amp; |
" | &quot; |
' | &#39; |
Most modern frameworks will escape dynamic content by default – see the code samples below for details.
Escaping editable content in this way means it will never be treated as executable code by the browser. This closes the door on most XSS attacks.
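As a minimal sketch of the replacement described above, Python's standard library provides html.escape, which performs this entity encoding. The render_comment helper and the surrounding <p> markup are hypothetical names chosen for illustration; in practice your template engine would normally do this for you.

```python
import html


def render_comment(comment: str) -> str:
    """Return an HTML fragment with the user-supplied comment escaped."""
    # quote=True also encodes " and ', which matters inside attribute values.
    return f"<p>{html.escape(comment, quote=True)}</p>"


print(render_comment("<script>alert('xss')</script>"))
# <p>&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</p>
```

Note that html.escape writes the apostrophe as &#x27;, the hexadecimal form of the same entity.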
Allowlist Values
If a particular dynamic data item can only take a handful of valid values, the best practice is to restrict the values in the data store, and have your rendering logic only permit known good values. For instance, instead of asking a user to type in their country of residence, have them select from a drop-down list.
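The rendering side of an allowlist could be sketched as follows. The ALLOWED_COUNTRIES mapping and the country_label function are hypothetical and only illustrate the idea: the stored value is used as a lookup key, and only text you control is ever written to the page.

```python
# Hypothetical allowlist: stored country codes mapped to display labels.
ALLOWED_COUNTRIES = {
    "CA": "Canada",
    "MX": "Mexico",
    "US": "United States",
}


def country_label(code: str) -> str:
    """Return the display label for a known country code, or raise ValueError."""
    try:
        return ALLOWED_COUNTRIES[code]
    except KeyError:
        # Deliberately do not echo the rejected value back to the caller.
        raise ValueError("unknown country code") from None
```

Because the output always comes from the mapping, a malicious value in the data store fails the lookup instead of reaching the browser.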
Implement a Content-Security Policy
Browsers support Content-Security Policies that allow the author of a web page to control where JavaScript (and other resources) can be loaded and executed from. XSS attacks rely on the attacker being able to run malicious scripts on a user’s web page, either by injecting inline <script> tags somewhere within the <html> tag of a page, or by tricking the browser into loading the JavaScript from a malicious third-party domain.
By setting a content security policy in the response header, you can tell the browser to never execute inline JavaScript, and to lock down which domains can host JavaScript for a page:
Content-Security-Policy: script-src 'self' https://apis.google.com
By listing the URIs from which scripts can be loaded, you are implicitly stating that inline JavaScript is not allowed.
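One way to attach such a header to every response is a small piece of WSGI middleware. The sketch below uses only the WSGI calling convention; with_csp, hello_app and CSP_POLICY are hypothetical names, and most frameworks offer their own hook for setting response headers.

```python
# The policy string is the one shown above.
CSP_POLICY = "script-src 'self' https://apis.google.com"


def with_csp(app, policy=CSP_POLICY):
    """Wrap a WSGI app so every response carries a Content-Security-Policy header."""

    def wrapped(environ, start_response):
        def start_with_csp(status, headers, exc_info=None):
            # Remove any CSP header set upstream so only one policy is sent.
            headers = [
                (name, value)
                for name, value in headers
                if name.lower() != "content-security-policy"
            ]
            headers.append(("Content-Security-Policy", policy))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_csp)

    return wrapped


def hello_app(environ, start_response):
    """Tiny example application used to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<h1>Hello</h1>"]


app = with_csp(hello_app)
```

Setting the header in one place, rather than in each view, makes it harder for a new page to ship without the policy.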
The content security policy can also be set in a <meta> tag in the <head> element of the page:
<meta http-equiv="Content-Security-Policy"
content="script-src 'self' https://apis.google.com">
This approach protects your users very effectively. However, it may take considerable discipline to make your site ready for such a header. Inline script tags are considered bad practice in modern web development (mixing content and code makes web applications difficult to maintain), but they are common in older, legacy sites.
To migrate away from inline scripts incrementally, consider making use of CSP Violation Reports. If you send the policy in a Content-Security-Policy-Report-Only header and add a report-uri directive, the browser will notify you of any policy violations rather than preventing inline JavaScript from executing:
Content-Security-Policy-Report-Only: script-src 'self'; report-uri https://example.com/csr-reports
This will give you reassurance that there are no lingering inline scripts, before you ban them outright.
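Browsers deliver these reports as a JSON document in an HTTP POST to the report-uri endpoint, with the details under a "csp-report" key. A minimal sketch of the receiving side might look like this; summarize_csp_report is a hypothetical helper, and the sample payload is made up for illustration.

```python
import json


def summarize_csp_report(body: bytes) -> dict:
    """Extract the most useful fields from a CSP violation report body."""
    payload = json.loads(body.decode("utf-8"))
    report = payload.get("csp-report", {})
    return {
        "page": report.get("document-uri"),
        "directive": report.get("violated-directive"),
        "blocked": report.get("blocked-uri"),
    }


# Made-up example of what a browser might post for an inline script.
sample_body = json.dumps(
    {
        "csp-report": {
            "document-uri": "https://example.com/legacy-page",
            "violated-directive": "script-src 'self'",
            "blocked-uri": "inline",
        }
    }
).encode("utf-8")

print(summarize_csp_report(sample_body))
```

Logging the page and the violated directive for each report gives you a worklist of inline scripts still to be moved into separate files.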
Sanitize HTML
Some sites have a legitimate need to store and render raw HTML. If your site stores and renders rich content, you need to use an HTML sanitization library to ensure malicious users cannot inject scripts in their HTML submissions.
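To illustrate what such a library does, here is a deliberately small allowlist sanitizer built on Python's standard html.parser module. It is a teaching sketch under stated assumptions, not a replacement for a maintained sanitization library: the AllowlistSanitizer class, the sanitize_html function and the choice of allowed tags are all hypothetical, and every attribute is dropped.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: simple formatting tags only, never any attributes.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "ul", "ol", "li"}
# Elements whose text content is discarded along with the tag itself.
DROP_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and escape all text content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip_depth += 1
        elif tag in ALLOWED_TAGS and not self._skip_depth:
            # Attributes (onclick, href, style, ...) are never copied over.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in ALLOWED_TAGS and not self._skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data))


def sanitize_html(raw: str) -> str:
    """Return raw with everything outside the allowlist removed or escaped."""
    parser = AllowlistSanitizer()
    parser.feed(raw)
    parser.close()
    return "".join(parser.out)


print(sanitize_html('<p onclick="evil()">Hi <b>there</b><script>alert(1)</script></p>'))
# <p>Hi <b>there</b></p>
```

The design choice worth noting is that the output is rebuilt from scratch rather than edited in place, so anything the parser does not explicitly recognize as allowed never reaches the page.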
Code Samples
Preventing XSS vulnerabilities requires using the right code libraries, and performing thorough code reviews. Below are some examples of what to look out for when checking your code.
Python
Django
Templates in Django escape HTML by default, so anything that looks like the following is generally safe:
{{ contents }}
You can override escaping by using the | safe filter. There are often good reasons to do this, but you will need to conduct code reviews on anything that uses this filter:
{{ contents | safe }}
Note that HTML-escaping can also be turned on or off with the {% autoescape %} tag.
Flask
Flask templates escape HTML by default, so code that looks like the following is generally safe:
<ul id="navigation">
{% for item in navigation %}
<li><a href="{{ item.href }}">{{ item.caption }}</a></li>
{% endfor %}
</ul>
However, escaping can be turned off by using the safe filter:
<ul id="navigation">
{% for item in navigation %}
<li><a href="{{ item.href }}">{{ item.caption | safe }}</a></li>
{% endfor %}
</ul>
Or by enclosing everything in an autoescape false block:
{% autoescape false %}
<ul id="navigation">
{% for item in navigation %}
<li><a href="{{ item.href }}">{{ item.caption }}</a></li>
{% endfor %}
</ul>
{% endautoescape %}
Be sure to conduct code reviews on any templates that use these features!
Ruby
Rails
Rails templates escape HTML by default, so anything that looks like the following is generally safe:
<%= contents %>
You can override escaping by using the raw function or the <%== operator. There are often good reasons to do this, but you will need to conduct code reviews on anything that uses them:
<%= raw contents %>
<%== contents %>
Java
Java Server Pages
Use the c:out tag to safely escape HTML:
<c:out value="${contents}"/>
The following ways of writing to a template do not escape HTML, so you should use them with care:
<%= contents %>
${contents}
<%
out.println(contents);
%>
Consider using a third-party library if you need to escape HTML manually.
C#
ASP.NET
Use either of the following forms to safely escape HTML (the <%: form was introduced in ASP.NET 4.0):
<%= HttpUtility.HtmlEncode(contents) %>
<%: contents %>
The following way of writing to a template does not escape HTML automatically, so you should use it with care:
<%= contents %>
Use HttpUtility.HtmlEncode(...) if you need to escape HTML manually.
Node
Mustache.js
Tags in double mustaches automatically escape HTML:
{{ contents }}
Tags in triple mustaches do not escape HTML, and should be used with care:
{{{ contents }}}
Dust.js
Key tags automatically escape HTML:
{ contents }
However, escaping can be disabled with the |s operator, so use this with care:
{ contents | s }
Nunjucks
If auto-escaping is turned on in the environment, Nunjucks will automatically escape tags for safe output:
{{ contents }}
Content marked with the safe filter will not be escaped, so use this filter with care:
{{ contents | safe }}
Auto-escaping can be disabled for a template, in which case tags need to be escaped manually:
{{ contents | escape }}
PHP
The echo command does not escape HTML by default, which means that any code like the following, which pulls data directly out of the HTTP request, is vulnerable to XSS attacks:
<?php
echo $_POST["comment"];
?>
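Be sure to use the htmlspecialchars function to safely escape parameters before writing them out. The strip_tags function removes tags rather than escaping them and leaves quote characters untouched, so it should not be relied on alone:
<?php
echo htmlspecialchars($_POST["comment"], ENT_QUOTES, 'UTF-8');
?>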
AngularJS
In Angular any dynamic content written out in curly brackets will automatically be escaped, so the following is safe:
<div>{{dynamicContent}}</div>
Be wary of any code that binds dynamic content to the innerHTML attribute, since that will not be escaped automatically:
<div [innerHTML]="dynamicContent"></div>
<div innerHTML="{{dynamicContent}}"></div>
React
In React any dynamic content written out in curly brackets will automatically be escaped, so the following is safe:
render() {
return <div>{dynamicContent}</div>
}
React allows you to write out raw HTML by binding content to the dangerouslySetInnerHTML property, which is named to remind you of the security risk! Watch out for any code that looks like the following:
render() {
return <div dangerouslySetInnerHTML={{ __html: dynamicContent }} />
}
Other Considerations
HTTP-only Cookies
Our example hack shows how a session-hijacking attack can use malicious JavaScript to steal the cookie containing the user’s session ID. There is rarely a good reason to read or manipulate cookies in client-side JavaScript, so consider marking cookies as HTTP-only, meaning that cookies will be received, stored, and sent by the browser, but cannot be modified or read by JavaScript.
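As a rough sketch of how a server might mark a session cookie as HTTP-only, Python's standard http.cookies module can build the Set-Cookie value. The session_cookie_header function and the cookie name session_id are hypothetical; web frameworks usually expose the same flags through their own cookie or session settings.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie header value for an HTTP-only session cookie."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    # HttpOnly hides the cookie from document.cookie in client-side JavaScript.
    cookie["session_id"]["httponly"] = True
    # Secure restricts the cookie to HTTPS connections.
    cookie["session_id"]["secure"] = True
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


print(session_cookie_header("abc123"))
```

The Secure flag is not required for HTTP-only behaviour, but the two are commonly set together on session cookies.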