The importance of continuity in fuzzing - CVE-2020-28362

Adam Korczynski, 18th February 2021 | online training fuzzing go-fuzz

Fuzzing is an essential technique for analysing software for security issues, and it is widely used in software development and software security practices. Although it is widely agreed that fuzzing is highly useful, there is less consensus on how to manage running your fuzzers. A common practical issue is determining how long a fuzzer should run, and this problem becomes even more apparent when the fuzzer does not find any bugs.

In recent years the concept of continuous fuzzing has gained a lot of attention. The goal is to make it easy for developers and security practitioners to manage the execution of fuzzers, in particular so that the fuzzers run for a long period of time and keep running as the targeted software project evolves. Both Google and Microsoft have open source tooling for this, namely ClusterFuzz and OneFuzz, respectively. In addition, Google offers a free service called OSS-Fuzz that continuously fuzzes important open source projects.

In this blog post I will cover an anecdote related to this problem, based on a recent experience with a security-critical DoS issue affecting Go-ethereum, namely CVE-2020-28362. The vulnerability itself was in the Go standard library and was found by fuzzing Go-ethereum.

Go fuzzing resources

This blog post is about fuzzing Go. If you are unfamiliar with Go fuzzing, you can:

1) See our previous blog on Go fuzzing.

2) Watch our video introducing Go fuzzing.

Timeline & Observation

In the summer of 2020 I reached out to the Go-ethereum team to set up continuous fuzzing for Go-ethereum. The team had already done a fair amount of work on fuzzing the library, and a substantial fuzzing suite already existed. However, it was unclear how often the fuzzers were run, which led to the goal of setting up continuous fuzzing of the project by way of Google's OSS-Fuzz.

In October we had sorted out the practicalities, and Go-ethereum was integrated into OSS-Fuzz on 19th October 2020. At that point all the Go-ethereum fuzzers except one started running.

Five days after integration, OSS-Fuzz found a crash that upon investigation turned out to be CVE-2020-28362. This CVE was a critical DoS vulnerability that could have taken down major parts of the Ethereum network. The key factors that make this case particularly interesting are:

1: The fuzzer that found the bug already existed prior to the continuous fuzzing integration, and the Go-ethereum team had been running its fuzzers on an ad hoc basis.

2: The code that introduced the bug in Go was committed roughly a year before the fuzzers found it.

Remember the upstream #golang bug from 2 weeks ago that we said could take down the entire #Ethereum network? The same bug that could remotely crash SSH servers or code using RSA or x509 certs? A zero day in Go since forever?

Myeah... lemme present you the fix! 🤯🤪🤓🙃 pic.twitter.com/JrI2LsMvaN

— Péter Szilágyi (@peter_szilagyi) November 26, 2020

[CVE-2020-28362] OSS-Fuzz got a critical DoS fixed in Golang which could bring down the Ethereum network - https://t.co/TC2kHQR3xQ. Thanks to @ADALogics team for this integration and helping the OSS community.

— Abhishek Arya (@infernosec) December 17, 2020

Experiments

It was a curious situation: simply deploying a bit more computational power to the fuzzers uncovered a critical issue, despite the fuzzers having been around for a long time. To understand more about the behaviour of the Go-ethereum fuzzers, I decided to run some experiments on the coverage explored by the fuzzers.

I set up an experiment where each fuzzer ran for 240 hours and then plotted the number of corpus files generated by each fuzzer against the number of hours spent fuzzing. We count corpus files because their number correlates with the coverage achieved: when a fuzzer discovers new coverage, it writes the input that triggered this coverage to its corpus. The number of corpus files was counted every minute, and the experiment included 11 of the 20 fuzzers in the Go-ethereum repository. 7 of the 11 fuzzers that we ran target low-level APIs, and their coverage stalled quickly. The remaining 4, however, continued to grow consistently and showed no sign of stalling.
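The measurement described above can be approximated with a few lines of Go. This is a minimal sketch and not the script used for the experiment: the helper names `countCorpusFiles`, `formatSample` and `sampleCorpus` are hypothetical, and it assumes the fuzzer keeps its corpus as regular files directly inside a single directory.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// countCorpusFiles returns the number of regular files directly inside dir.
// Assumption: the fuzzer stores one corpus entry per file in this directory.
func countCorpusFiles(dir string) (int, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	n := 0
	for _, e := range entries {
		if e.Type().IsRegular() {
			n++
		}
	}
	return n, nil
}

// formatSample renders one data point as "hours,count", ready for plotting.
func formatSample(elapsed time.Duration, count int) string {
	return fmt.Sprintf("%.2f,%d", elapsed.Hours(), count)
}

// sampleCorpus prints `samples` data points for dir, one per interval.
func sampleCorpus(dir string, interval time.Duration, samples int) error {
	start := time.Now()
	for i := 0; i < samples; i++ {
		if i > 0 {
			time.Sleep(interval) // wait only between samples
		}
		n, err := countCorpusFiles(dir)
		if err != nil {
			return err
		}
		fmt.Println(formatSample(time.Since(start), n))
	}
	return nil
}

func main() {
	// Hypothetical usage: take a single sample of the current directory.
	// For a long run, pass the corpus directory, time.Minute and a larger
	// sample count instead.
	if err := sampleCorpus(".", time.Minute, 1); err != nil {
		fmt.Fprintln(os.Stderr, "sampling failed:", err)
	}
}
```

Redirecting the output to a file gives a two-column series (hours, corpus size) that can be plotted directly.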

Below we show the growth in corpus files for four of Go-ethereum's fuzzers, each run for 240 hours. The x-axis shows the number of hours the fuzzers ran and the y-axis shows the number of corpus files. Note that the fuzzers were stopped for a day at hour 50 and hour 78.

Observations & Conclusions

As we can clearly see, throughout the runs each fuzzer consistently added new files to its corpus, and correspondingly explored new parts of the codebase. Even at the point where we stopped each fuzzer's execution, they were all still consistently producing new corpus files.
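One way to make "stalled" versus "still growing" concrete is to check whether the corpus size increased within the most recent samples. The following is a hypothetical heuristic for illustration, not the analysis behind the plots; the names `lastGrowth` and `stalled` and the choice of window are assumptions.

```go
package main

import "fmt"

// lastGrowth returns the index of the most recent sample whose corpus size
// is larger than the previous sample's, or -1 if the corpus never grew.
func lastGrowth(counts []int) int {
	last := -1
	for i := 1; i < len(counts); i++ {
		if counts[i] > counts[i-1] {
			last = i
		}
	}
	return last
}

// stalled reports whether no new corpus file appeared during the final
// `window` samples. With too few samples it returns false, since there is
// not enough data to call a stall.
func stalled(counts []int, window int) bool {
	if len(counts) <= window {
		return false
	}
	return lastGrowth(counts) < len(counts)-window
}

func main() {
	// Made-up series for illustration, one corpus-size sample per entry.
	lowLevel := []int{10, 12, 12, 12, 12}
	growing := []int{10, 11, 12, 13, 14}
	fmt.Println("low-level fuzzer stalled:", stalled(lowLevel, 3))
	fmt.Println("growing fuzzer stalled:", stalled(growing, 3))
}
```

A fuzzer that is not stalled under such a check at the end of a run, as was the case for the four fuzzers here, is a candidate for running longer rather than being stopped.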

On January 4th 2021 the OSS-Fuzz bug report for the bug found in October was made public. In it we see that the fuzzer that found the bug was fuzzVmRuntime, which is one of the fuzzers that ran for 240 hours and continued to generate new coverage.

I found this an interesting example of how important it is to perform continuous fuzzing in comparison to running fuzzers on an ad hoc basis. OSS-Fuzz is a great way to manage the execution of fuzzers for open source projects and we highly recommend it.

Finally, I want to note that there are still open problems in managing fuzzer execution. This is a non-trivial problem, and a great addition to the fuzzing workflow would be introspection into fuzzer execution: insight into how much of its total possible analysis a given fuzzer has achieved so far.