German university researchers have used machine learning to make the discovery of security vulnerabilities in source code more efficient.
Techniques such as fuzz testing, taint analysis and symbolic execution already exist to speed up the detection of vulnerabilities, but researchers Fabian Yamaguchi, Felix Lindner, and Konrad Rieck said the process was still tedious.
“The discovery of vulnerabilities in source code is a challenging and fascinating task – for blackhats as well as whitehats. Security bugs are often deeply buried within a code base and considerable expertise is required to bring these gems to the surface,” the researchers said.
“In recent years, several tools for automatically identifying vulnerabilities have been developed. However, these tools rather aim for spotting low-hanging fruit vulnerabilities and suffer from the inherent inability of one program to completely analyse another program's code.”
The method, dubbed vulnerability extrapolation, was an alternative to fully automated vulnerability detection and aimed instead to assist the more powerful manual inspection of source code.
It identified unknown vulnerabilities using programming patterns observed in known security flaws, since vulnerabilities were often linked to patterns of specific API usage.
“The method embeds code in a vector space, such that typical patterns of API usage can be determined automatically using machine learning techniques. These patterns implicitly capture semantics of the code and extrapolate from known vulnerabilities to identify potentially vulnerable code with similar characteristics. This process of vulnerability extrapolation can suggest candidates for investigation to the analyst as well as ease the browsing of source code during auditing.”
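The idea in the quote above can be illustrated with a minimal sketch. It is not the researchers' implementation: the toy function bodies, the regular expression used to pull out called identifiers, the TF-IDF weighting and the cosine ranking are all assumptions chosen for illustration, and the article does not say which specific machine learning techniques were used. Each function is mapped to a vector of the API symbols it calls, and the other functions are then ranked by their similarity to one known-vulnerable function.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: function name -> source text (not real ffmpeg code).
FUNCTIONS = {
    "decode_frame": "buf = av_malloc(size); memcpy(buf, src, size); get_bits(gb, 8);",
    "read_header": "buf = av_malloc(len); memcpy(buf, data, len); avio_read(pb, buf, len);",
    "log_status": "av_log(ctx, level, msg); snprintf(out, n, fmt);",
    "close_stream": "av_free(buf); avio_close(pb); av_log(ctx, level, msg);",
}

# Crude stand-in for real API symbol extraction: any identifier followed by "(".
# A real tool would use a parser; this would also match keywords such as "if (".
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def api_symbols(source):
    """Count the identifiers that appear in call position."""
    return Counter(CALL_RE.findall(source))


def embed(functions):
    """Map each function to a sparse TF-IDF vector over its API symbols."""
    counts = {name: api_symbols(src) for name, src in functions.items()}
    n = len(counts)
    # Number of functions in which each symbol occurs at least once.
    doc_freq = Counter(sym for c in counts.values() for sym in c)
    return {
        name: {sym: tf * math.log(n / doc_freq[sym]) for sym, tf in c.items()}
        for name, c in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(sym, 0.0) for sym, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def extrapolate(functions, known_vulnerable, top_k=2):
    """Rank the other functions by similarity to a known-vulnerable one."""
    vectors = embed(functions)
    seed = vectors[known_vulnerable]
    ranked = sorted(
        ((cosine(seed, vec), name)
         for name, vec in vectors.items() if name != known_vulnerable),
        reverse=True,
    )
    return [name for _, name in ranked[:top_k]]


# Treat "decode_frame" as the known flaw and ask for audit candidates.
candidates = extrapolate(FUNCTIONS, "decode_frame")
print(candidates)
```

In this toy corpus `read_header` ranks first because it shares the `av_malloc` and `memcpy` calls with the seed function. The output is only a list of candidates for an analyst to inspect; similarity in API usage does not by itself show that a function is vulnerable.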
Researchers demonstrated the method in two experiments. The first evaluated the ability of extrapolation to identify API usage patterns and to structure source code.
In the second experiment, the technique was applied to libraries produced by the ffmpeg project. Researchers narrowed a search for “interesting code” from 6778 functions to 20, and found both a known vulnerability and a zero-day vulnerability.
Researchers said the technique could capture many vulnerabilities by analysing API usage patterns but may miss others that would be identified by examining the code structure of a function.
They were investigating techniques for integrating structural information from source code into vulnerability extrapolation.
“The ability of our approach to narrow the auditing process to a few interesting functions may also play well with software testing, for example, for selectively fuzzing functions or performing involved symbolic execution,” they said.
The research paper is available online (pdf).
Copyright © SC Magazine, Australia