
How to Evaluate an SCA with Reachability: Benchmarking Hard-to-analyze Language Features

Learn how to effectively assess the accuracy, and consequently the trustworthiness, of a reachability analysis.


Written by Martin Torp, CPO and Co-founder


When searching for an SCA with reachability, one of the most important aspects to consider is the accuracy of the reachability analysis. In this post, we provide an overview of how to evaluate this accuracy. Although the discussion focuses on static reachability, most of the concepts apply to dynamic analyses as well. We recommend choosing a static reachability provider due to foundational limitations in dynamic analysis (a topic we cover in a separate post).

This post also primarily focuses on the most precise form of reachability, known as function-level reachability, which involves identifying calls to functions affected by vulnerabilities. There are also less precise, more coarse-grained forms of reachability, such as module-level reachability, which only check if modules affected by vulnerabilities are loaded. It is also a good idea to test the accuracy of these other coarse-grained analyses using similar principles as described below.

Two quantities determine the accuracy of an analysis: the false positive rate and the false negative rate.

False Positives

The false positive rate indicates how many vulnerabilities classified as reachable are actually unreachable. This rate is typically less concerning for two reasons. First, even an analysis with a relatively high false positive rate is still much better than having no reachability analysis at all. Second, comparing false positive rates between tools is straightforward: simply compare the fraction of vulnerabilities each tool marks as unreachable.

False Negatives

The false negative rate, which indicates how many vulnerabilities classified as unreachable are actually reachable, is more concerning. Ignoring a vulnerability deemed unreachable when it is, in fact, reachable and potentially exploitable poses a significant risk. Unlike the false positive rate, the false negative rate is challenging to assess. It requires manually determining how packages containing unreachable vulnerabilities are used and checking whether the vulnerable part of each package is indeed unused, an impractical task for large projects with many vulnerabilities.

Evaluating the False Negative Rate

To approximate the false negative rate of an analysis, we recommend testing whether the analysis can correctly identify reachable vulnerabilities hidden behind hard-to-analyze language features. Essentially, you stress-test the analysis by feeding it example inputs where you would expect an analysis to potentially misclassify reachable vulnerabilities as unreachable.

If the analysis performs well on a few examples, it is likely that it will also perform well in general.

Let’s consider an example of a reachable vulnerability that may prove challenging for a reachability analysis to identify.

A Tricky Vulnerability

Consider the following JavaScript module, which exports a class A that is affected by the vulnerability CVE-2022-0155.

 1  const httpFollow = require("follow-redirects").http;
 2  const urlModule = require("url");
 3
 4  class A {
 5    constructor() {}
 6
 7    request(method, url, options) {
 8      options = { ...options, ...urlModule.parse(url), method };
 9      const req = httpFollow.request(options);
10      req.end();
11    }
12  }
13
14  ["delete", "get", "head"].forEach(function each (method) {
15    A.prototype[method] = function requestWrapper (url, options) {
16      this.request(method, url, options);
17    };
18  });
19
20  module.exports.A = A;

A.js

The vulnerability lies in the use of follow-redirects on line 9. The follow-redirects package sends HTTP requests and automatically follows redirects when the server responds with a 3XX HTTP response code. However, prior to version 1.14.7, follow-redirects suffered from a vulnerability where Cookie headers included in the initial request would also be included in the redirected request. If an attacker could control a redirect from https://example.com to https://attackers-webserver.com, they would be able to steal sensitive session data and gain full access to the user’s account on https://example.com.
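To illustrate the vulnerable behavior, here is a minimal sketch. The hostnames and cookie value are illustrative, and the snippet assumes a follow-redirects version prior to 1.14.7:

const httpFollow = require("follow-redirects").http;

// The initial request carries a sensitive Cookie header.
const req = httpFollow.request({
  host: "example.com",
  path: "/account",
  headers: { cookie: "session=some-secret-value" },
});
req.end();

// If example.com responds with "302 Location: http://attackers-webserver.com/",
// vulnerable versions re-send the request, Cookie header included, to the
// attacker's host, leaking the session token.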

The Reachability Challenge

For a reachability analysis to determine if an application is affected by the vulnerability in follow-redirects, it needs to determine whether the application utilizes the request function from follow-redirects. There are two important aspects to remember about this challenge:

  • It is not enough to only look for usages of follow-redirects in the application code. The dependency code must be analyzed as well. For example, this particular vulnerability was exploitable through the package axios, which depends on follow-redirects. You can find a PoC of the exploit here.
  • It is not sufficient to grep for calls to follow-redirects’ request function in dependencies. Doing so would likely identify many usages of request that are not reachable from the application and thus not exploitable, leading to a high false positive rate (see the sketch after this list).
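To make the second point concrete, here is a sketch of a hypothetical dependency (the package layout and the legacyFetch function are invented for illustration) that a text search would flag even though the vulnerable call may never be reachable:

const httpFollow = require("follow-redirects").http;

// A grep for calls to `request` flags the line below. But if no application
// code ever invokes legacyFetch, directly or transitively, the vulnerable
// code is unreachable and the finding is a false positive.
function legacyFetch(url) {
  const req = httpFollow.request(url);
  req.end();
}

module.exports = { legacyFetch };

helpers.js (a hypothetical dependency)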

With that in mind, let’s consider the challenge an analysis faces in identifying the reachability of this particular vulnerability. The code in A.js may seem obscure, but it illustrates a common pattern in JavaScript. The A module is based on Axios, a popular npm package for executing HTTP requests. Although Axios’ code is more complex, this simplified example still highlights the challenge (see the code from Axios here).

On lines 14 to 18 in A.js, the methods delete, get, and head are defined by assigning function properties to A.prototype, which is equivalent to the definitions in A-inline-definitions.js below:

 1  class A {
 2    ...
 3
 4    delete(url, options) {
 5      return this.request('delete', url, options);
 6    }
 7
 8    get(url, options) {
 9      return this.request('get', url, options);
10    }
11
12    head(url, options) {
13      return this.request('head', url, options);
14    }
15  }

A-inline-definitions.js

However, because the bodies of these methods are identical except for the method argument passed to request, it is possible to avoid code duplication by iterating over the list of method names and assigning the functions to A.prototype, as on lines 14 to 18 in A.js.

Let’s consider a small program that uses the A class (here we imagine that get is defined as on lines 14 to 18 in A.js, i.e., via the forEach loop):

1  const { A } = require("A");
2  const a = new A();
3  a.get("https://www.coana.tech", {
4    headers: { cookie: "session=some-secret-value" },
5  });

client.js

A static analysis can easily determine that A on line 1 in client.js is the class defined on lines 4 to 12 in A.js. It is also straightforward to map the call on line 2 in client.js to the constructor defined on line 5 in A.js. The challenge arises when the analysis needs to determine where the get function, called on line 3 in client.js, is defined. The called function is of course requestWrapper, defined on lines 15 to 17 in A.js, but that is non-trivial for a static analysis to determine: the analysis needs to infer that the parameter named method holds the value 'get' in one of the assignments occurring on line 15. However, when static analyses analyze functions, they usually combine arguments from multiple calls through an abstraction process. Instead of analyzing the body of the each callback three times, with method set to 'delete', 'get', and 'head', a static analysis is likely to analyze the body just once with method set to an abstract value, for example a value representing any possible string.

Abstractions are necessary to ensure static analyses do not run indefinitely. In the example above, an analysis without a well-designed abstraction is likely to conclude either that it does not know which function is called on line 3 in client.js, or that the call could target any function present on A.prototype. In the former case, the analysis will incorrectly mark the vulnerability as unreachable (a false negative). In the latter case, the analysis may correctly mark the vulnerability as reachable, but such a design is likely to lead to a large number of false positives and potential scalability problems for larger programs.
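To make the trade-off concrete, here is a toy sketch (illustrative only, not Coana’s implementation; all names are invented) of how a string abstraction interacts with the dynamic property write on line 15 of A.js:

// Abstract value representing "could be any string".
const AnyString = Symbol("AnyString");

// A call-insensitive analysis joins the arguments from all three calls to the
// `each` callback into a single abstract value.
function join(values) {
  return values.every((v) => v === values[0]) ? values[0] : AnyString;
}

const methodParam = join(["delete", "get", "head"]); // => AnyString

// With an abstract key, resolving A.prototype[method] forces a choice: give up
// (risking false negatives) or assume the write may target any property
// (risking false positives and scalability problems).
function resolveProperty(knownProps, key) {
  return key === AnyString ? knownProps : [key];
}

resolveProperty(["delete", "get", "head"], methodParam); // => all three names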

Creating a Benchmark Suite

The example above can be used to test the false negative rate of a reachability analysis, as it includes a dynamic property write, which is challenging for static analysis to model accurately.

However, to investigate the false negative rate of an analysis thoroughly, it is beneficial to use several examples for each of the programming languages you use. You can pick examples from your own codebase, but to design better inputs, think about which features of your language of choice may prove challenging for a static analysis. As a general rule, features that are hard for humans to understand are probably also challenging for a static analysis. For example, in Java, the reflection API is a good source of hard-to-analyze inputs.
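For JavaScript, a benchmark input might hide a reachable call behind a dynamically computed property name, similar in spirit to A.js. This example is hypothetical; the handler map and names are invented:

const httpFollow = require("follow-redirects").http;

const handlers = {};
const name = ["fe", "tch"].join(""); // computed key defeats a naive text search
handlers[name] = (options) => httpFollow.request(options).end();

function dispatch(action, options) {
  return handlers[action](options); // dynamic property read
}

// The vulnerable request function is reachable, but only via the dynamic
// property write and read above.
dispatch("fetch", { host: "example.com", headers: { cookie: "session=x" } });

An analysis that misclassifies this call as unreachable will likely also produce false negatives on real-world code that uses similar patterns.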

It is important to note that while there are open-source benchmark suites available, vendors may optimize their tools to perform well on these benchmarks, potentially providing an inaccurate representation of how the analysis handles real-world programs.

How Coana Designs Reachability Analyses

Coana’s reachability analysis for JavaScript can correctly classify the vulnerability in follow-redirects as reachable, even when it is exposed through the more complex axios package, rather than the simplified A package discussed earlier. To achieve this, Coana incorporates ideas from the latest academic research in static analysis. For example, see the paper Reducing Static Analysis Unsoundness with Approximate Interpretation.

In general, Coana builds on 15 years of world-leading academic research on static program analysis, and the reachability engines used by Coana combine several techniques and ideas to achieve unprecedented accuracy:

  • Solid Classic Analysis Algorithms: Efficient subset-based points-to and control-flow analyses track the flow of functions and objects within the analyzed program, including its dependencies.
  • Accurate Models: Detailed models of language features and standard library functions ensure the analyses handle complex programs.
  • Fine-Tuned Heuristics and Algorithms: Carefully tuned heuristics ensure accurate results for large real-world programs that use advanced features such as dynamic property writes and reflection.

You can learn more about Coana’s approach to reachability analysis here and see how Coana compares to other static reachability providers here.

Conclusion

The aim of this post has been to demonstrate the challenges that reachability analyses face when reasoning about complex but realistic programs and to outline what is necessary to evaluate the accuracy of a reachability analysis.

When choosing a reachability analysis provider, we recommend testing the accuracy (both the false positive and the false negative rate) of the analysis on benchmarks that use non-trivial language features. Many vendors claim some form of reachability support, but our experience shows that most analyses have serious shortcomings regarding the false negative rate, leading to vulnerabilities being incorrectly classified as unreachable.

It is also worth considering that there are different granularities of reachability. The most fine-grained is function-level reachability, offered by Coana and the other vendors we consider here. Other vendors offer coarser, less precise forms of reachability that, for example, only consider whether the affected package is loaded at runtime.

The example from follow-redirects above, exposed through the npm package axios, also illustrates why it is insufficient to only examine direct dependencies.

While the follow-redirects example is based on a JavaScript vulnerability, it is important to emphasize that all programming languages have features that are equally challenging for static analyses, for instance, reflection in Java. Unlike many other reachability providers, Coana’s reachability analysis for Java is capable of handling many uses of reflection. That is because Coana builds a dedicated static analysis for each programming language, which makes the analyses capable of reasoning about hard-to-analyze language features.

Want to learn more?

Schedule Time With a Co-Founder