Interview with Paul Anderson, VP of Engineering at GrammaTech
How copying and pasting snippets of code can replicate bugs and lead to new vulnerabilities
Nobody wants to rewrite parts of their code over and over again as they develop their applications, so copying and pasting snippets of code is a common development practice. But problems occur when a snippet contains an undiscovered error, or when developers introduce new errors while changing the copied snippet. In this interview, Paul Anderson, VP of Engineering at GrammaTech, explains the risks and provides advice.
Q: Why is this topic important?
Copy and paste bugs are particularly interesting because they have properties that other kinds of bugs lack. They are subtle and difficult to identify once introduced, and they can replicate through the copy and paste process, amplifying vulnerabilities.
Q: What is a cut and paste bug?
Let’s say that you’ve got some code that does something with an identifier. You want to do something similar with another identifier in a similar context. You take the code that you are pretty sure works, and you copy it and paste it into its new context and then adjust it, meaning you’re changing it to make it match the new context.
This creates two classes of problems. First, the code you copied may not have been right to start with, in which case you’ve just replicated a bug. Second, developers make mistakes when they change variables and fields within the pasted code. Take code written for the X axis that is being adapted for the Y axis: it is easy to miss some of the single-character changes required. If there are half a dozen places where you need to change an X to a Y and you miss one, you have introduced a bug. Alternatively, the wrong character gets changed: all X’s might look the same to the code editor, so it might change an X somewhere that has nothing to do with the axis.
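As a minimal sketch of the second class of problem, consider a hypothetical Python function (the name `clamp_point` and its parameters are illustrative assumptions, not code from the interview) that keeps a point inside a rectangle. The block for y was pasted from the block for x, and one of the required edits was missed.

```python
def clamp_point(x, y, max_x, max_y):
    # Original block: keep x inside [0, max_x].
    if x < 0:
        x = 0
    elif x > max_x:
        x = max_x

    # Pasted copy of the block above, hand-edited for the y axis.
    if y < 0:
        y = 0
    elif y > max_x:  # missed edit: this should compare against max_y
        y = max_y
    return x, y


# Prints (5, 50); the intended result is (5, 10).
print(clamp_point(5, 50, 100, 10))
```

The code runs without any error or warning, and the bug only shows when y exceeds `max_y` without also exceeding `max_x`, which is part of why mistakes like this are hard to spot in review.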
Pro Tip: Read this blog about some of the specific copy/paste bugs that CodeSonar uncovered in Postgres, FFmpeg, Open Office, LLVM, and several other open source components.
Q: How do those bugs become vulnerabilities?
A bug introduced through copy and paste may be harmless. Or it can potentially lead to computing the wrong result, which may cause the system to crash or introduce a security vulnerability. For example, we examined some code in a high-level open-source database where the code involved converting the database from one form to another. The bug could have ended up corrupting data and the contents of the database, possibly the entire structure of the database.
Pro Tip: Check out this blog in the Daily Swig that lists zero days, XSS, and sanitizer bypass exploits related to cut and paste.
Q: What advice do you have for developers?
If you can avoid cutting and pasting in development and use abstraction instead, that is the best protection. Barring that, developers and testers need to be aware that small changes spread across multiple lines of code are difficult for humans to spot, and they should review their code accordingly.
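One way to read "use abstraction instead" is to put the repeated logic in a single helper and call it once per case, so there is no pasted block to hand-edit. The sketch below is an illustration under that assumption; the names `clamp` and `clamp_point` are hypothetical.

```python
def clamp(value, upper):
    """Keep value inside the range [0, upper]."""
    return max(0, min(value, upper))


def clamp_point(x, y, max_x, max_y):
    # Each axis goes through the same helper, so the only thing that
    # differs between the two calls is the pair of arguments.
    return clamp(x, max_x), clamp(y, max_y)


print(clamp_point(5, 50, 100, 10))  # prints (5, 10)
```

A mismatched argument is still possible here, but the per-axis difference is confined to one short line, which is much easier to review than two near-identical blocks.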
The rule of thumb is that for every ten lines of new code, there’s a fifteen percent chance you have a bug in one of those lines. So before you copy and paste, be sure that what you’re copying from is correct. If you do introduce errors and you’re lucky, the compiler will recognize the mistake and point it out to you, and you can fix it and get on with further development. But in most of the bugs we’ve discovered, the compiler hasn’t been able to help.
The other best practice is to use good code editors that minimize the chances of introducing a copy and paste error, although not all editors have such features. Finally, it’s important to run static analysis on the code during development, as soon as it compiles, so that you find bugs before they go into production.
Pro Tip: Check out this article in JavaScript for additional advice around the risks of cutting and pasting code.