SAST is just crazy bad at XSS

XSS is one of the more serious vulnerability classes in appsec, and it's pretty prevalent. It's also one of those things that's super hard to find accurately via static analysis. And vendors don't want to miss it — so, typically, any data that reaches a response, no matter how tenuous the connection, gets reported as XSS.
Sis, we have to show users the data they gave us. This "guilty until proven innocent" approach really doesn't scale, but thankfully this is an area where AI can patch up the rough edges.
Why is it so difficult?
Your data flow engine (if you're lucky enough to be using an interfile, type-aware, big-box engine) is pretty generic. It takes 50+ programmers to make an engine just good enough to reliably answer, in predictable time, questions as basic as "did data from API X get to API Y?" The problem is that a question that coarse-grained misses a ton of context about how XSS works — context that can't be captured by an engine with that simple a lens on the world. The engine is missing knowledge about the data itself, the frameworks in use, the response, the context the data is written into, and finally the consumer of the response. It's a perfect storm of confusion.
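To make that concrete, here's a toy sketch of the coarse reachability question such an engine answers — purely illustrative (no vendor's engine works on a hand-built map like this), with variable names invented for this example:

```java
import java.util.*;

public class ToyTaintEngine {
    // Answers the one coarse question a generic data-flow engine is built for:
    // "does data from this source ever reach this sink?" — nothing more.
    static boolean reaches(Map<String, List<String>> flows, String source, String sink) {
        Deque<String> work = new ArrayDeque<>(List.of(source));
        Set<String> seen = new HashSet<>();
        while (!work.isEmpty()) {
            String cur = work.pop();
            if (cur.equals(sink)) return true;
            // Only expand nodes we haven't visited yet
            if (seen.add(cur)) work.addAll(flows.getOrDefault(cur, List.of()));
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical flow graph for a controller that copies a request
        // parameter into a DTO and returns it in the response:
        Map<String, List<String>> flows = Map.of(
            "request.displayName", List.of("profile.displayName"),
            "profile.displayName", List.of("ResponseEntity.ok"));
        // The engine says "yes, it reaches" — and knows nothing about
        // serialization, escaping, or Content-Type. That "yes" becomes
        // an XSS finding.
        System.out.println(reaches(flows, "request.displayName", "ResponseEntity.ok"));
    }
}
```

The point isn't the graph traversal — it's that reachability alone is the entire answer the engine can give, and everything XSS-specific lives outside it.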
Example 1 out of 10 million
Here’s an anonymized XSS we correctly classified as a false positive:
@RestController
@RequestMapping("/api/profile")
public class ProfileController {

    private final UserService userService;

    public ProfileController(UserService userService) {
        this.userService = userService;
    }

    // POST /api/profile?displayName=<script>alert(1)</script>
    @PostMapping
    public ResponseEntity<UserProfile> createProfile(@RequestParam String displayName) {
        UserProfile profile = new UserProfile();
        profile.setDisplayName(displayName);
        userService.createProfile(profile);
        // Jackson implicitly serializes this POJO as JSON and sets the Content-Type
        // ⬇️ SAST flags this line as "untrusted data flows to response"
        return ResponseEntity.ok(profile);
    }
}

/** Simple DTO. */
class UserProfile {
    private String displayName;

    public String getDisplayName() { return displayName; }

    public void setDisplayName(String displayName) { this.displayName = displayName; }
}
Yes, if we follow the taint in a simple straight line, it flows into the UserProfile object and then conceptually reaches the HTTP response via the concrete call to ok(profile). But there's a big difference between "conceptually reaches" and "I need to go bother developers and cry foul."
This is a false positive for two reasons.
One, the data is serialized by Jackson, so the user-provided data is automatically escaped as part of the JSON encoding. Any attack payload is trapped inside the data boundary it finds itself in (a JSON string value) and can't break out to be interpreted as code.
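To illustrate, here's a hand-rolled sketch of the string escaping a JSON serializer like Jackson applies to string values — this is not Jackson's actual code, and it handles only quotes and backslashes (control characters omitted for brevity):

```java
public class JsonEscapeDemo {
    // Minimal sketch of JSON string-value escaping: the characters that
    // would let a payload break out of the string literal get backslashed.
    static String escapeJsonString(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Payload that tries to close the string and inject a new JSON key
        String payload = "\",\"evil\":\"<script>alert(1)</script>";
        String json = "{\"displayName\":\"" + escapeJsonString(payload) + "\"}";
        // The injected quote comes out as \" — the payload stays inside the
        // displayName string value and can't alter the JSON structure.
        System.out.println(json);
    }
}
```

Note that `<` and `>` come through untouched — JSON escaping isn't HTML escaping. The payload survives as data; it just can't jump out of the string it's in, and (as the next point covers) the JSON Content-Type keeps a browser from treating it as markup.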
Two, the implicit setting here of Content-Type: application/json
makes this unexploitable, even if the escaping failed for some reason. Browsers used to do "MIME sniffing" — peeking at the data to try to understand what it is, so it could be rendered appropriately — and attackers abused this to trick browsers into interpreting their payload as a content type different from what the server declared. Browsers patched this up a long time ago.
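Here's a toy illustration of the kind of heuristic legacy sniffing applied — vastly simplified (real sniffing algorithms were far more involved), but it shows why a declared JSON response could once be rendered as HTML:

```java
public class MimeSniffDemo {
    // Toy sketch of legacy content sniffing: peek at the body for HTML
    // markers and, if found, override the declared Content-Type.
    // Illustrative only — not any real browser's algorithm.
    static String sniff(String declaredType, String body) {
        String head = body.substring(0, Math.min(body.length(), 512)).toLowerCase();
        if (head.contains("<html") || head.contains("<script")) {
            return "text/html"; // legacy behavior: "looks like HTML" wins
        }
        return declaredType;
    }

    public static void main(String[] args) {
        String json = "{\"displayName\":\"<script>alert(1)</script>\"}";
        // A legacy sniffer would render this JSON response as HTML,
        // which is exactly the hole modern browsers closed by honoring
        // the declared type (and X-Content-Type-Options: nosniff).
        System.out.println(sniff("application/json", json));
    }
}
```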
How come SAST can’t do this?
To correctly diagnose these issues, tools must understand, at a much higher level, the relationships between the frameworks, the data consumers, what's happening at runtime, and more. E.g.:
- What exactly ResponseEntity.ok() does with data (and how it could be configured to do otherwise)
- The implicit behavior of @RestController (how it automatically delegates to the default JSON serializer)
- The fact that the default serializer (Jackson) is a reliable escaper
- The fact that the "data flow" and the "framework route" are not the same, and the engines haven't been retrofitted to join these disparate flows
- How browsers interpret data
About, I dunno, 20% of the complexity of XSS is captured in the simple source-to-sink flow identified by SAST. The other 80% has to come from somewhere, and it won't come from static analysis rules getting 5% better. There's also nothing really special about XSS in this example; most vulnerability classes carry this level of complexity.
Between the difficulty of statically deciphering this extra (often implicit) context, and the penalty for false positives traditionally not being high enough for vendors, we seem to be okay just letting frontline appsec folks or developers triage a lot of noise.
AI Can Help Turn This Weakness Into Strength
We’ve spent the last few years cataloging these failure modes, and teaching AI agents (and workflows!) to investigate and adjudicate the findings from your perspective.
- It can answer many of these questions, even when the code uses custom frameworks or unpopular languages.
- It can research the latest framework APIs to confirm behaviors.
- It can decipher intent, which helps in understanding the "bigger picture" of some code and guides severity or classification.
So, if we're "grabby" up front with a bunch of findings, the noisier the tool, the better! (Assuming you don't mind the cost of the tokens!)
We typically analyze scans with hundreds of findings that, after triage, have zero or close to zero true positives, and many times the findings that do survive are lower in severity than reported. But at least the few remaining true positives will be the right ones to send!
AppSec is hard and detailed. If you can, let a machine do it for you.
Written by

Arshan Dabirsiaghi

CTO @ Pixee.ai (@pixeebot on GitHub), ex-Chief Scientist @ Contrast Security. Security researcher pretending to be a software executive.