LLVM Fuzzing Audit

1st March, 2024
David Korczynski,
Security Research & Security Engineering

We recently conducted a fuzzing audit of LLVM and are happy to present the results together with the full report. The goal of the audit was to improve LLVM's fuzzing setup, with a particular focus on its continuous fuzzing through OSS-Fuzz. The audit was facilitated by the Open Source Technology Improvement Fund and funded by the Sovereign Tech Fund.

Ada Logics has extensive experience in fuzzing: for example, we contribute a lot of code to Fuzz Introspector and often contribute fuzzing to open source projects. Drawing on this experience, in the initial assessment of LLVM's fuzzing setup we identified and prioritised the tasks needed to have an impact on the fuzzing of LLVM. Throughout the engagement we fixed the existing OSS-Fuzz LLVM fuzzing setup, extended existing fuzzers, added new fuzzers, patched issues found by fuzzers and developed a strategy for moving the LLVM fuzzing setup forward.

LLVM was integrated into OSS-Fuzz in August 2017. At that time there were around 90 projects in OSS-Fuzz (in contrast to more than 1200 now), which makes LLVM one of the longest-running projects in OSS-Fuzz. In total, OSS-Fuzz has reported more than 2770 issues in LLVM, and around 400 issues are open at the moment. The LLVM OSS-Fuzz project is public, with no view restrictions, which means that anyone can (1) view the issues reported by the OSS-Fuzz setup, and (2) download the reproducer test cases to reproduce any of the reported findings. As such, anyone can monitor and reproduce the discovered issues without waiting for a disclosure deadline, i.e. issues are made public when they are found and are not subject to any embargo.

The LLVM project has extensive fuzzing. However, inefficiencies in certain areas mean that the existing setup does not reach its full potential for finding memory corruption issues. To improve the chance of the fuzzers finding memory corruption issues in LLVM, we recommend addressing the efficiency issues in the fuzzing setup, and we estimate that once this has been done a significant amount of the LLVM codebase will be covered by fuzzing.

In summary, during the engagement we: