ValidScript – a modest proposal for app security

23Sep21

TL;DR

Poor input validation is the underlying cause of many application security issues, and it persists because we haven’t made good input validation easy enough for developers to implement. So how about a TypeScript[1]-like language to fix that: ValidScript, a language that makes input validation easy?

Background

Wendy Nather recently asked me:

Survey for my talk at OWASP’s 20th Anniversary conference:

In the last 20 years, what’s one of the most important things you personally have learned about appsec?

After not much thought my answer was:

Input validation should be baked into languages and frameworks, to make it stupid easy for developers to write safe apps, but still isn’t.

I then went on:

My thinking here is that if there was a language (likely a JavaScript derivative like TypeScript) that treated input as UNSAFE until it washed through a set of standard validators, then we could get to the place on input safety that we seem to have achieved with memory safety in Rust. The compiler would essentially support an input taint checker.

Wendy suggested that I should blog about it. This is the post. I’m calling my invented language ‘ValidScript’, and I’m somewhat amazed that the name isn’t already taken[2].

The problem

The OWASP Top 10 has pretty much remained the same for the whole time it’s existed. The ordering might shuffle around a bit, but the underlying problems remain firmly entrenched.

The root cause for many of those underlying problems is not doing (adequate) input validation.

Why?

Because input validation hasn’t been made easy enough. In every popular language it’s still left as an open-ended exercise for the developer to write their own validator.

We’ve made great progress on memory safety

The proliferation of garbage-collected languages made it harder to coerce a buffer overflow from some bad input, and then Rust came along to provide memory safety without the garbage collection overhead (you just have to fight with the compiler’s borrow checker instead).

But that doesn’t really solve the problem

Buffer overflows are just one of the things that can go wrong. Bad input can still go on to cause database injection, cross-site scripting, insecure deserialization, etc.
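To make that concrete, here is a minimal sketch of database injection in a memory-safe language, using Python's standard sqlite3 module. The table, the rows and the attacker string are all invented for illustration; nothing here comes from a real application.

```python
import sqlite3

# In-memory database with two made-up users (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

# Hypothetical attacker-controlled input.
user_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so it can rewrite the query.
unsafe_sql = f"SELECT secret FROM users WHERE name = '{user_input}'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: a bound parameter is always treated as data, never as SQL.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(leaked))  # number of rows returned by the string-built query
print(len(safe))    # number of rows returned by the parameterized query
```

No memory was corrupted at any point, yet the string-built query returns every user's secret while the parameterized one returns nothing.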

An example

I maintain some scripts to dump cards from GitHub projects into a .csv file that can be imported into Planning Poker. Our scrum master, who’s the primary user of the scripts, complained that an import had been truncated to just 7 cards (from 18). I took a look at the file[3], and it was quickly clear what had gone wrong. Somebody had put a comma into an issue title, resulting in too many columns in that row, resulting in a bad import. I’d failed at input validation (and frankly so had the Planning Poker importer[4]).

I’d note that this code doesn’t even directly take user input. It’s reading stuff out of the GitHub REST and GraphQL APIs, which both output JSON. But valid JSON doesn’t necessarily make for valid .csv.

Of course I can take to Stack Overflow and find out how to strip out any commas with something like:

title = card["title"].replace(",", "")

But that doesn’t deal with other special characters that might cause trouble in my .csv, and it quickly becomes unwieldy (and slow) if I run the string through multiple replace operations.

So back to Stack Overflow for a more general purpose approach:

import re  # standard library regular expressions
title = re.sub(r'[^A-Za-z0-9]+', '', card["title"])

But that strips out all the spaces, and a few other characters that I still want, like @ and .

Also I see some very long titles, that I want to truncate, which means I end up with:

title = re.sub('[^A-Za-z0-9.@ ]+', '', card["title"])[:80]
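As a point of comparison, Python's standard csv module takes a different route: it quotes fields that contain commas or quotes rather than deleting characters. The sketch below assumes a hypothetical card dictionary standing in for the GitHub API data, and it assumes the consumer of the file understands quoted fields, which I haven't confirmed for the Planning Poker importer.

```python
import csv
import io

# Hypothetical card, standing in for what the GitHub API returns.
card = {"title": 'Fix login, signup and "remember me"', "estimate": "3"}

buffer = io.StringIO()
writer = csv.writer(buffer)  # default dialect quotes fields only when needed
writer.writerow([card["title"][:80], card["estimate"]])
line = buffer.getvalue()

# Reading the row back yields the same two fields, comma and quotes included.
row = next(csv.reader(io.StringIO(line)))
print(row)
```

It works, but it's still something I had to know to go looking for, which is rather the point.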

This should not involve Google and Stack Overflow

My modest proposal is that the ValidScript language has input validation built in.

If you want your code to compile, then you have to specify where input is going, so that an appropriate validator can be applied.

For the case above I’m putting my input (from the GitHub API) into a .csv file, so I’d choose the CSV validator.

The validators can of course be overridden, but that’s an active choice, and the aim is to have safe defaults.
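ValidScript doesn't exist, so the following is only a rough Python sketch of the idea, not a specification. Every name in it (Untrusted, CsvSafe, validate_csv, VALIDATORS, csv_row) is made up for illustration, and the default CSV policy simply reuses the regex from my example above. The big difference is that ValidScript would reject unvalidated input at compile time, whereas plain Python can only complain at run time.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Untrusted:
    """Raw input; nothing should write this to a sink directly."""
    raw: str


@dataclass(frozen=True)
class CsvSafe:
    """A value that has passed through the CSV validator."""
    value: str


def validate_csv(data: Untrusted, max_len: int = 80) -> CsvSafe:
    # Assumed default policy: keep letters, digits, '.', '@' and spaces,
    # then truncate to max_len characters.
    cleaned = re.sub(r"[^A-Za-z0-9.@ ]+", "", data.raw)
    return CsvSafe(cleaned[:max_len])


# Validators keyed by where the input is going; only "csv" is sketched here.
VALIDATORS = {"csv": validate_csv}


def csv_row(*fields: CsvSafe) -> str:
    # Run-time stand-in for the compile-time check ValidScript would do.
    for field in fields:
        if not isinstance(field, CsvSafe):
            raise TypeError("field has not been through the CSV validator")
    return ",".join(field.value for field in fields)


title = VALIDATORS["csv"](Untrusted("Fix login, signup @ example.com"))
print(csv_row(title))
```

Overriding the default would mean deliberately putting a different function into VALIDATORS["csv"], so the unsafe path takes more effort than the safe one.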

Conclusion

Input validation should be a first-class construct of a programming language, and that’s what ValidScript would provide: a way to make input validation easy, and so make it easy to avoid OWASP Top 10 mistakes.

Notes

1. I’m not a huge JavaScript fan, but I get the reasons why it’s #1, so building on the TypeScript approach seems like a pragmatic way of reaching the most people. I’d also note that most of the issues come from strings, so extending the TypeScript approach to better string safety seems sensible.

2. I already grabbed validscript .com, .org & .net and for now I’ll get them redirected to this post.

3. Looking closer at the file, it’s almost like The @ Company team were playing a game of bad input golf. Double colons, leading spaces, mismatched quotes, the list goes on.

4. The importer shouldn’t have failed on one bad line, and I’d expect it to continue with the other lines.


