Remote Code Execution

A Remote Code Execution (RCE) vulnerability allows an attacker to execute arbitrary code on a vulnerable system. On a web server, RCE vulnerabilities typically arise when untrusted input from the HTTP request is evaluated as code.

Risks

Prevalence: Occasional
Exploitability: Easy
Impact: Devastating

An attacker who is able to supply code in an HTTP request and execute it on your server will be able to delete files, read sensitive data, install malware, and perform other malicious acts.

Dynamic Code Execution

Most programming languages allow you to execute a string held in memory as code. Below are some illustrations of dynamic execution in various programming languages.

Python code can be evaluated dynamically using the built-in eval(...) and exec(...) functions. The former expects a string representing a single valid Python expression, while the latter can execute multiple statements - making it able to create new module, class, and function definitions. By default, both functions have access to the global and local state at the point of invocation.
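
The difference between the two functions can be seen in a minimal sketch. The variable names and the payload string below are illustrative assumptions, and the payload is deliberately harmless: it only reads the current working directory.

```python
counter = 10

# eval() accepts one expression and returns its value. With no explicit
# namespace it can read the variables visible at the call site.
result = eval("counter * 2 + 1")

# exec() runs whole statements, so a string can define new functions.
# Passing a dict keeps those definitions out of the caller's namespace.
namespace = {}
exec("def double(x):\n    return x * 2\n", namespace)
doubled = namespace["double"](21)

# Stand-in for attacker-supplied input: a harmless payload that still shows a
# string reaching outside the expression to import a module and call into it.
untrusted = "__import__('os').getcwd()"
cwd = eval(untrusted)

print(result, doubled, cwd)
```

Swapping getcwd() for a call that deletes files or spawns a shell is all an attacker needs to do once a string like this reaches eval(...).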

In JavaScript you can dynamically evaluate code using the eval(...) function. For instance, the following JavaScript code will print “WARNING” to the console:

  eval("console.log('WARNING')")

Input strings can also be dynamically evaluated as code in the Node.js runtime using the Function(...) constructor or the Script(...) constructor from the vm module. There are various subtle ways of doing this, as illustrated by the following code:

const arbitraryObject = {},
      a = 'constructor',
      b = 'constructor',
      s = 'console.log("Hacked!")';

// This will print the phrase 'Hacked!' to the console.
arbitraryObject[a][b](s)()

These two statements illustrate an RCE vulnerability. Properties can be looked up on a JavaScript object by name, and this includes functions. The illustrated exploit gets the constructor function for an arbitrary object, looks up the constructor function for that constructor - which is actually Function(...) - passes it a command string, then executes that string.

Although this example is contrived, it illustrates how any code that has the form:

arbitraryObject[parameterA][parameterB](parameterC)()

…enables a remote execution attack if all the parameters can be controlled by an attacker.

Ruby code can be evaluated dynamically using the eval(...) method of the Kernel module, which is included in the Object class. This means it is available on all objects. Note that a string passed to eval(...) can redefine existing methods, classes, and modules in the codebase.

Dynamic evaluation of code in Java can be achieved in a number of ways. The Java scripting API (defined in the javax.script package) allows you to evaluate code in a number of different scripting languages via the ScriptEngineManager class. There are, for instance, JVM-compatible versions of Python (called Jython) and Ruby (called JRuby) that can be loaded as jars and invoked via the script manager.

The Java compiler can also be invoked programmatically via the javax.tools.JavaCompiler interface. This approach is often used by build tools, which convert files on disk to runnable code that can be loaded into the JVM.

Finally, there are dedicated libraries that allow dynamic evaluation of code. The Apache Bean Scripting Framework supports a range of scripting languages, and dynamic languages like Groovy can be added to the Java runtime by simply adding a jar to the classpath.

C# code can be evaluated dynamically using the Code Document Object Model (CodeDOM) module, defined in the System.CodeDom namespace. CodeDOM allows for templated code generation and dynamic compilation within the .NET runtime.


Safely Executing Dynamic Code

With all of these approaches, it is vital that you only pass trusted strings to the code execution API. Passing untrusted content will allow an attacker to execute arbitrary code on your web server - meaning they can steal sensitive information, read files from disk, or install malware.

To secure your code, you should ensure any strings passed to a code execution API are taken from a trusted source, or validated against an allow-list. Here’s how to do that in Java, for example:

protected void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException
{
  String code = request.getParameter("script");

  // Since we are taking the code to execute from an untrusted source (the HTTP
  // request) we *must* validate against an allowed list of scripts. Here,
  // allowedCommands is assumed to be a Set<String> field of permitted scripts.
  if (code == null || !allowedCommands.contains(code))
    throw new IOException("Not permitted");

  ScriptEngineManager manager = new ScriptEngineManager();
  ScriptEngine        engine  = manager.getEngineByName("JavaScript");

  try {
    // Evaluate the validated string (not the raw request).
    engine.eval(code);
  } catch (ScriptException e) {
    throw new IOException("Script failed", e);
  }

  response.setStatus(200);
}
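
The same allow-list idea can be sketched in Python. In this variant the request carries only the name of a script, and the source text comes from a table the server controls. The names ALLOWED_SCRIPTS and run_allowed_script, and the two example scripts, are hypothetical and chosen purely for illustration.

```python
# Hypothetical table of scripts the server is willing to run. The request
# supplies only a key; the source text never comes from the client.
ALLOWED_SCRIPTS = {
    "report_total": "sum(values)",
    "report_max": "max(values)",
}

def run_allowed_script(name, values):
    """Run a pre-approved script by name, rejecting anything else."""
    try:
        source = ALLOWED_SCRIPTS[name]
    except KeyError:
        raise PermissionError(f"Script {name!r} is not permitted") from None

    # Expose only the names the approved scripts need. The allow-list is the
    # real protection here; the trimmed namespace just limits mistakes.
    allowed_globals = {"__builtins__": {}, "sum": sum, "max": max}
    return eval(source, allowed_globals, {"values": values})

print(run_allowed_script("report_total", [1, 2, 3]))
```

Because an unknown name never reaches eval(...), a request such as run_allowed_script("__import__('os').system('id')", []) fails with PermissionError instead of executing anything.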

Domain Specific Languages

Dynamic code evaluation is often used where there is a need to develop a Domain Specific Language (DSL) for expert users. In such a scenario, it is preferable to execute DSL statements in a sandbox with limited permissions. (For instance, you probably want to limit disk and network access if you are simply evaluating mathematical statements.)

One way to achieve this is by formally defining the grammar of the DSL, and using a tool to generate a parser that can evaluate scripts at runtime. Since the grammar specification defines which functions are executable, you have full control over what the DSL can do.
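
To make the idea concrete, here is a minimal sketch of a grammar-driven evaluator for arithmetic. It is a hand-written recursive-descent parser rather than the output of a parser generator, and the grammar, the Parser class, and the tokenize and evaluate functions are all assumptions made for illustration. The point is that only constructs named in the grammar can ever run.

```python
import re

# Grammar (the only constructs the DSL can express):
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '-' factor | '(' expr ')'
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|(.))")

def tokenize(text):
    tokens = []
    for number, symbol in TOKEN.findall(text.strip()):
        if number:
            tokens.append(("num", float(number)))
        elif symbol in "+-*/()":
            tokens.append(("op", symbol))
        else:
            raise SyntaxError(f"Unexpected character {symbol!r}")
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.take()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.take()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        kind, value = self.take()
        if kind == "num":
            return value
        if (kind, value) == ("op", "-"):
            return -self.factor()
        if (kind, value) == ("op", "("):
            inner = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("Expected ')'")
            return inner
        raise SyntaxError(f"Unexpected token {value!r}")

def evaluate(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.pos != len(parser.tokens):
        raise SyntaxError("Unexpected trailing input")
    return result

print(evaluate("2 + 3 * 4"))
```

Since the tokenizer accepts only digits, arithmetic operators, and parentheses, an input such as "__import__('os')" is rejected with a SyntaxError before any evaluation takes place.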

Here are some approaches to creating DSLs in various languages:

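In Python, the Abstract Syntax Trees (ast) module parses source text into a syntax tree without executing it, so you can walk the tree and evaluate only the node types you choose to support. Below is an example of how to use it: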

import ast, operator, math

def safe_eval(expression):
  """Evaluate a simple mathematical function by parsing an expression into an
  abstract syntax tree and safely evaluating each node in turn."""

  def find_math_function(name):
    # Only public, callable members of the math module may be invoked.
    function = None if name.startswith("_") else getattr(math, name, None)
    if not callable(function):
      raise SyntaxError(f"Unknown function {name}()")

    return function

  binary_operations = {
    ast.Add:  operator.add,
    ast.Sub:  operator.sub,
    ast.Mult: operator.mul,
    ast.Div:  operator.truediv,
    ast.Mod:  operator.mod,
    ast.Pow:  operator.pow,
  }

  unary_operations = {
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
  }

  syntax_tree = ast.parse(expression, mode='eval')

  def _eval(node):
    if isinstance(node, ast.Expression):
      return _eval(node.body)
    elif isinstance(node, ast.Constant):
      # ast.Constant covers numbers and strings; the older ast.Num and ast.Str
      # node types are deprecated and missing from recent Python releases.
      return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in binary_operations:
      # Always recurse, so operands that are not permitted nodes are rejected.
      return binary_operations[type(node.op)](_eval(node.left), _eval(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in unary_operations:
      return unary_operations[type(node.op)](_eval(node.operand))
    elif (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
          and not node.keywords):
      # Only plain calls such as sqrt(16) are allowed, never obj.method().
      args = [_eval(x) for x in node.args]
      return find_math_function(node.func.id)(*args)
    else:
      raise SyntaxError(f"Bad syntax, {type(node)}")

  return _eval(syntax_tree)

safe_eval("1 + 1")       # Will return 2
safe_eval("(100*10)+6")  # Will return 1006
safe_eval("sqrt(16)+1")  # Will return 5.0
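
When the input only needs to be a plain literal (numbers, strings, lists, dicts and similar), the standard library's ast.literal_eval(...) already provides this kind of restricted evaluation. The sketch below wraps it in a hypothetical parse_literal helper; the helper name and the sample settings string are assumptions for illustration.

```python
import ast

def parse_literal(text):
    """Parse a string containing only a Python literal, rejecting anything else."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError) as error:
        raise ValueError(f"Not a plain literal: {text!r}") from error

settings = parse_literal("{'retries': 3, 'hosts': ['a', 'b']}")
print(settings["retries"])

try:
    parse_literal("__import__('os').getcwd()")
except ValueError as error:
    print("Rejected:", error)
```

Function calls, attribute access, and names are not literals, so the second call is rejected without any code being run.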

In Java, the Java Compiler Compiler (JavaCC) and ANTLR projects allow developers to generate parsers that can evaluate scripts at runtime. As with any grammar-driven approach, the grammar specification defines which functions are executable, so you have full control over what the DSL can and can’t do.

If you find yourself writing a DSL in .NET, consider using one of the following solutions:

  • Microsoft’s DSL Tools, which come bundled with Visual Studio and allow the visual creation of DSLs.
  • A scripting language like Lua that can be dynamically loaded and executed within the .NET runtime.
  • ANTLR, a parser generator that can be used to generate and execute sandboxed languages.

Further Reading