Minimal LLM-based fuzz harness generator

18th Febuary, 2025

David Korczynski,

Security Research & Security Engineering

In this blog post we will present an example of how to create a minimal LLM-based fuzz harness generator that relies on program analysis tooling to support the process. This is a follow-up blog post to post on Fuzz Introspector: enabling rapid fuzz introspection tool development where we presented how to use Fuzz Introspector as a library for program analysis.

In this blog post we will utilise Fuzz Introspector to extract data about the program under analysis, and use this data to construct a prompt we can run against an LLM and generate a fuzz harness. To this end, the input of our tool will be a codebase and a function name, and the output will be a fuzz harness.

It is important to note at this stage that there are projects focusing on auto-generation of fuzzing harnesses that have worked on this problem for over a year, and have yielded interesting results. For example, Google’s OSS-Fuzz-gen project has a trophy list of 30 issues found in various popular open source libraries, as you can see here. In contrast, what we aim to do in this blog post is carve out a minimal set of tooling that can show some of the core features one can use to create a meaningful auto fuzz harness generation capability.

Intuition for fuzzing workflow

In order to develop our tool, we will create a workflow that resembles how one could go about creating a given fuzzing harness. That is, we want to use program analysis techniques to extract information about the software under analysis similarly to how we would go about studying the codebase if we were to write a harness manually. This information will then be used to compose a prompt that will be supplied to an LLM for the purpose of generating a fuzzing harness.

For the purpose of simplicity, the input to our tool will be a target code base and a name of the function we want to create a harness for. Given this information, a rational procedure for creating a harness would be to:

Find the function in the source code and read the function’s source code.
Study the function signature to identify parameter types.
Look for cross-references in the source code that call into our target function, to get an understanding of how to call the target function.
Use the information from the previous steps to compose a harness.

This is an approach that will work sometimes, but not all times. It’s a reasonably simple procedure, but a tool that can simulate the above steps and also compose a prompt from them will be a useful tool in and of itself.

Creating a data-gathering tool using Fuzz Introspector

In order to extract the data, we are going to use Fuzz Introspector. In our previous blog post, we discussed how fuzz introspector now offers a python library interface for code analysis. We will use these features to make our tool.

The following script uses fuzz introspector to extract the data described above.

import sys import json from fuzz_introspector import commands as fi_commands def extract_data_about_target( language, target_dir, target_function_name): _, report = fi_commands.analyse_end_to_end(arg_language=language, target_dir=target_dir, module_only=True, dump_files=False) project = report['light-project'] # Get target function if target_function_name: function = project.find_function_by_name(target_function_name, True) else: return None if function: # Get the source code of the function as a string function_source = function.function_source_code_as_text() # Get a list of cross-refences xrefs = project.get_cross_references_by_name(function.name) # Convert cross-references into functions as text xref_strings = [ xref.function_source_code_as_text() for xref in xrefs ] context = { 'func_source': function_source, 'func_signature': function.sig, 'xrefs': xref_strings, } return context arg_language = sys.argv[1] arg_target = sys.argv[2] arg_func = sys.argv[3] context = extract_data_about_target( arg_language, arg_target, arg_func) print(json.dumps(context, indent=2))

We can run the above code on a sample codebase as follows:

# Setup environment and install fuzz introspector
python3.11 -m virtualenv .venv 
. .venv/bin/activate 
python3 -m pip install fuzz-introspector==0.1.10 

# Get a codebase
git clone https://gitlab.com/codesun/jakens

# Run analysis on Json_parseFromFile
# https://gitlab.com/codesun/jakens/-/blob/master/jakens.c?ref_type=heads#L642 
python3 ./sample.py c++ ./jakens Json_parseFromFile

The above will analyse the jakens repository and collect the information for the Json_parseFromFile function.

Converting data to a prompt

In order to get a result from the LLM we need to construct a prompt. Thus, the next step is to convert our fuzzing data into a textual representation that can be used as input prompt. This is simply some minor string formatting, which includes an introduction to the LLM, some additional hints, constraints and suggestions regarding fuzzing and what we expect. # Create an introduction prompt = """Hello. You are a %s security engineer and you need to write a fuzzing harness for a codebase you are analysing. The codebase is called %s and the target function you need to write a fuzzing harness for is %s\n"""%(arg_language, os.path.basename(arg_target), arg_func) # Create a description of our target prompt += """The target function has the following function signature: <signature> %s </signature> and the following source code: <code> %s </code> """%(context['func_signature'], context['func_source']) # Create xrefs if there are any if context['xrefs']: prompt += """The function is used in other places of the code. Use these cross-references as examples of how to call the target function in the fuzzing harness you write:\n""" for xref in context['xrefs']: prompt += '<xref>\n%s\n</xref>\n'%(xref) # Provide closing statements prompt += """I expect you to be great at writing fuzz harnesses and already have a lot of experience writing fuzzing harnesses. You should use the knowledge you have to compose the harness for me. Here are a few more guidelines: - The harness you write should be in libFuzzer style. That means, the entrypoint of the harness should be the function <code>int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)</code> - Make sure that the fuzz harness you write will explore code coverage in the target codebase, using the target function provided. The only thing you should return is the code itself. Please do not return any other textual description, and the code you return should be fully compilable. """ print("#" * 20 + " prompt " + "#"*20) print(prompt) print("#"*48)

Connecting to LLM and getting the harness

At this point we have a full prompt, and the next step is to pass the prompt to an LLM and then extract the harness it hopefully generates. In this case, you can use your favourite LLMs for testing. We generally use a mixture of Gemini, Claude and GPT. To use GPT you can use the following code:

llm_prompt = [{'role': 'system', 'content': prompt}] client = openai.OpenAI(api_key=os.getenv('OPENAI_API_KEY')) result = client.chat.completions.create( messages=llm_prompt, model='gpt-4') print("#" * 20 + " result " + "#"*20) print(result.choices[0].message.content) print("#"*48)

Notice that if you use the above code for running the tool, then you should also install openai from PyPI. For this blog post we used version 1.60.1.

Testing on a real codebase

We now have a full script together that can be used as our auto-harnessing tool. To summarise, the complete script is below:

import os import sys import json import openai from fuzz_introspector import commands as fi_commands def extract_data_about_target( language, target_dir, target_function_name): _, report = fi_commands.analyse_end_to_end(arg_language=language, target_dir=target_dir, module_only=True, dump_files=False) project = report['light-project'] # Get target function if target_function_name: function = project.find_function_by_name(target_function_name, True) else: return None if function: # Get the source code of the function as a string function_source = function.function_source_code_as_text() # Get a list of cross-refences xrefs = project.get_cross_references_by_name(function.name) # Convert cross-references into functions as text xref_strings = [xref.function_source_code_as_text() for xref in xrefs] context = { 'func_source': function_source, 'func_signature': function.sig, 'xrefs': xref_strings, } return context arg_language = sys.argv[1] arg_target = sys.argv[2] arg_func = sys.argv[3] context = extract_data_about_target(arg_language, arg_target, arg_func) print(json.dumps(context, indent=2)) # Create an introduction prompt = """Hello. You are a %s security engineer and you need to write a fuzzing harness for a codebase you are analysing. The codebase is called %s and the target function you need to write a fuzzing harness for is %s\n"""%(arg_language, os.path.basename(arg_target), arg_func) # Create a description of our target prompt += """The target function has the following function signature: <signature> %s </signature> and the following source code: <code> %s </code> """%(context['func_signature'], context['func_source']) # Create xrefs if there are any if context['xrefs']: prompt += """The function is used in other places of the code. Use these cross-references as examples of how to call the target function in the fuzzing harness you write:\n""" for xref in context['xrefs']: prompt += '<xref>\n%s\n</xref>\n'%(xref) # Provide closing statements prompt += """I expect you to be great at writing fuzz harnesses and already have a lot of experience writing fuzzing harnesses. You should use the knowledge you have to compose the harness for me. Here are a few more guidelines: - The harness you write should be in libFuzzer style. That means, the entrypoint of the harness should be the function <code>int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)</code> - Make sure that the fuzz harness you write will explore code coverage in the target codebase, using the target function provided. The only thing you should return is the code itself. Please do not return any other textual description, and the code you return should be fully compilable. """ print("#" * 20 + " prompt " + "#"*20) print(prompt) print("#"*48) llm_prompt = [{'role': 'system', 'content': prompt}] client = openai.OpenAI(api_key=os.getenv('OPENAI_API_KEY')) result = client.chat.completions.create( messages=llm_prompt, model='gpt-4') print("#" * 20 + " result " + "#"*20) print(result.choices[0].message.content) print("#"*48)

At this point, we can run our tool on a codebase and test the results. We will continue from the example above, and use the Jakens library https://gitlab.com/codesun/jakens as the target codebase and we will use the function Json_parseFromFile as the target. A sample run of this is:

$ python3 ./sample.py c++ ./jakens/ Json_parseFromFile
{
  "func_source": "JsonDocument Json_parseFromFile(JsonParser self, const char* filename, JsonDocument doc) {\n\t/* use mode r instead of rb */\n\t/* to avoid process the different newline CR on different platforms */\n\tFILE* file = fopen(filename, \"r\");\n\tif(file == NULL) {\n\t\tself->errnum = ERR_FOPEN;\n\t\tfclose(file);\n\t\treturn NULL;\n\t}\n\t\n\tJsonDocument res;\n\tchar buf[BUF_LEN];\n\tsize_t len;\n\tuint8_t isFinish = 0;\n\twhile(!isFinish) {\n\t\tlen = fread(buf, 1, BUF_LEN, file);\n\t\tif(len < BUF_LEN) {\n\t\t\tif(ferror(file)) {\n\t\t\t\tself->errnum = ERR_FREAD;\n\t\t\t\treturn NULL;\n\t\t\t}\n\t\t\tisFinish = 1;\n\t\t}\n\t\tres = Json_parseFromString(self, buf, len, doc);\n\t\t/* if ERR except ERR_NONE and ERR_PEND happen, return immediately */\n\t\tif(self->errnum > ERR_PEND) {\n\t\t\tfclose(file);\n\t\t\treturn NULL;\n\t\t}\n\t}\n\t///* all content is passed to parser */\n\t//if(TopToken(self) != TOKEN_END) {\n\t//\tfclose(file);\n\t//\treturn NULL;\n\t//}\n\t///* top == TOKEN_END && errnum == ERR_PEND */\n\t//self->errnum = ERR_NONE;\n\t/* no error happened, and errnum == ERR_NONE or ERR_PEND */\n\tfclose(file);\n\t/* just let Json_parseFromString to determine if it should return doc */\n\treturn res;\n}",
  "func_signature": "Json_parseFromFile(JsonParser self, const char* filename, JsonDocument doc)",
  "xrefs": [
    "int main() {\n\tJsonDocument_t doc;\n\tJsonParser_t parser;\n\tJsonParser_init(&parser);\n\tJsonDocument res = Json_parseFromFile(&parser, \"test.json\", &doc);\n\tif(res == NULL) {\n\t\tprintf(\"ERROR!\\n Reason: %s\\n\", JsonParser_getErrorMsg(&parser));\n\t\treturn -1;\n\t}\n\tJsonParser_close(&parser);\n\n\tJPath_t path;\n\tJPath_init(&path);\n\tprintf(\"========================\\n\");\n\tchar buf[1024];\n\twhile(1) {\n\t\tprintf(\"Enter the path:\\n\");\n\t\tscanf(\"%s\", buf);\n\t\tJPath pres = JPath_compile(&path, buf);\n\t\tif(pres == NULL) {\n\t\t\tprintf(\"Invalid path!\\n\");\n\t\t\tcontinue;\n\t\t}\n\t\tJsonElement rr = JsonDocument_findElement(&doc, pres);\n\t\tif(rr == NULL) {\n\t\t\tprintf(\"No such element!\\n\");\n\t\t\tcontinue;\n\t\t}\n\t\tprintf(\"RES: \");\n\t\tswitch(rr->type) {\n\t\t\tcase JSON_NULL:\n\t\t\t\tprintf(\"null\\n\");\n\t\t\t\tbreak;\n\t\t\tcase JSON_BOOLEAN:\n\t\t\t\tprintf(rr->val.bol == 1 ? \"true\\n\" : \"false\\n\");\n\t\t\t\tbreak;\n\t\t\tcase JSON_NUMBER:\n\t\t\t\tprintf(\"%lf\\n\", rr->val.num);\n\t\t\t\tbreak;\n\t\t\tcase JSON_STRING:\n\t\t\t\tprintf(\"%s\\n\", rr->val.str);\n\t\t\t\tbreak;\n\t\t\tcase JSON_ARRAY:\n\t\t\t\tprintf(\"ARRAY\\n\");\n\t\t\t\tbreak;\n\t\t\tcase JSON_OBJECT:\n\t\t\t\tprintf(\"OBJECT\\n\");\n\t\t\t\tbreak;\n\t\t}\n\t}\n\tJPath_free(&path);\n\tJsonDocument_free(&doc);\n\treturn 0;\n}"
  ]
}
#################### prompt ####################
Hello. You are a c++ security engineer and you need to write a
fuzzing harness for a codebase you are analysing.

The codebase is called  and the target function you need to write a fuzzing
harness for is Json_parseFromFile
The target function has the following function signature:
<signature> 
Json_parseFromFile(JsonParser self, const char* filename, JsonDocument doc)
</signature> 

and the following source code:

<code>
JsonDocument Json_parseFromFile(JsonParser self, const char* filename, JsonDocument doc) {
	/* use mode r instead of rb */
	/* to avoid process the different newline CR on different platforms */
	FILE* file = fopen(filename, "r");
	if(file == NULL) {
		self->errnum = ERR_FOPEN;
		fclose(file);
		return NULL;
	}
	
	JsonDocument res;
	char buf[BUF_LEN];
	size_t len;
	uint8_t isFinish = 0;
	while(!isFinish) {
		len = fread(buf, 1, BUF_LEN, file);
		if(len < BUF_LEN) {
			if(ferror(file)) {
				self->errnum = ERR_FREAD;
				return NULL;
			}
			isFinish = 1;
		}
		res = Json_parseFromString(self, buf, len, doc);
		/* if ERR except ERR_NONE and ERR_PEND happen, return immediately */
		if(self->errnum > ERR_PEND) {
			fclose(file);
			return NULL;
		}
	}
	///* all content is passed to parser */
	//if(TopToken(self) != TOKEN_END) {
	//	fclose(file);
	//	return NULL;
	//}
	///* top == TOKEN_END && errnum == ERR_PEND */
	//self->errnum = ERR_NONE;
	/* no error happened, and errnum == ERR_NONE or ERR_PEND */
	fclose(file);
	/* just let Json_parseFromString to determine if it should return doc */
	return res;
}
</code> 
The function is used in other places of the code. Use these
cross-references as examples of how to call the target function in the fuzzing
harness you write:
<xref> 
int main() {
	JsonDocument_t doc;
	JsonParser_t parser;
	JsonParser_init(&parser);
	JsonDocument res = Json_parseFromFile(&parser, "test.json", &doc);
	if(res == NULL) {
		printf("ERROR!\n Reason: %s\n", JsonParser_getErrorMsg(&parser));
		return -1;
	}
	JsonParser_close(&parser);

	JPath_t path;
	JPath_init(&path);
	printf("========================\n");
	char buf[1024];
	while(1) {
		printf("Enter the path:\n");
		scanf("%s", buf);
		JPath pres = JPath_compile(&path, buf);
		if(pres == NULL) {
			printf("Invalid path!\n");
			continue;
		}
		JsonElement rr = JsonDocument_findElement(&doc, pres);
		if(rr == NULL) {
			printf("No such element!\n");
			continue;
		}
		printf("RES: ");
		switch(rr->type) {
			case JSON_NULL:
				printf("null\n");
				break;
			case JSON_BOOLEAN:
				printf(rr->val.bol == 1 ? "true\n" : "false\n");
				break;
			case JSON_NUMBER:
				printf("%lf\n", rr->val.num);
				break;
			case JSON_STRING:
				printf("%s\n", rr->val.str);
				break;
			case JSON_ARRAY:
				printf("ARRAY\n");
				break;
			case JSON_OBJECT:
				printf("OBJECT\n");
				break;
		}
	}
	JPath_free(&path);
	JsonDocument_free(&doc);
	return 0;
}
</xref>
I expect you to be great at writing fuzz harnesses and already
have a lot of experience writing fuzzing harnesses. You should use the
knowledge you have to compose the harness for me. Here are a few more
guidelines:

- The harness you write should be in libFuzzer style.

That means, the entrypoint of the harness should be the function
<code>int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)</code> 

- Make sure that the fuzz harness you write will explore code coverage in the
target codebase, using the target function provided.


The only thing you should return is the code itself. Please do not return any
other textual description, and the code you return should be fully compilable.

################################################
#################### result ####################
```cpp
#include <stdint.h> 
#include <stddef.h> 
#include <stdio.h> 

extern "C" {
    #include "json.h"
    
    // Assuming the headers and source files are correctly included
    JsonDocument Json_parseFromFile(JsonParser self, const char* filename, JsonDocument doc);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {

    // Writing received data to a temporary file.
    FILE *temp = fopen("temp.json", "wb");
    if (!temp) {
        return 0;
    }
    fwrite(data, 1, size, temp);
    fclose(temp);
    
    // Initialize JsonParser and JsonDocument objects
    JsonParser_t parser;
    JsonParser_init(&parser);
    JsonDocument_t doc;
    
    // Call the target function 
    Json_parseFromFile(&parser, "temp.json", &doc);
    
    // Free the allocated resources
    JsonParser_close(&parser);
    JsonDocument_free(&doc);

    // Successful execution
    return 0;
}
```
################################################

Inspecting the result we get, shown above after the "#################### result ####################" print out, an initial assessment is the harness looks great. It correctly creates a file with fuzzer-generated data and also calls into the target function, while also initialising the proper arguments to Json_parseFromFile.

For the purpose of this exercise, we did not try to compile and run the above harness. Rather we made a manual assessment and can confirm that the above harness is more or less what we would try as a first iteration of fuzzing the target function and library. This also shows the limits to our pure static approach: in order to validate the correctness and quality of the harness, we need to build, run and assess the results from the run. This, however, is out of scope for this blog post.

Summary and moving further

In this blog post we have shown how we can create a minimally viable, and useful, fuzz harness generator using Fuzz Introspector and LLMs and 50 lines of Python. In this context, we relied on 6 pieces of information to pass to the LLM, namely: the target language, the target “project name”, the target function name, the target function signature, the target function source code, a list of cross-references, and we demonstrated this can be enough to generate a valid fuzz harness.

The tooling we developed relies on light program analysis to construct an LLM prompt that has shown to produce interesting output. It's a lightweight that can be extended in many ways, is not resource extensive with respect to LLM communication, and at this stage is already quite useful as a helper tool. There are, naturally, many directions this can move and these should be explored in future work.

Moving beyond what is presented above, there is a lot more information available that we can use in a fuzzing workflow, including:

Instead of providing a target function, we can use Fuzz Introspector to extract a set of “ideal targets” (similar to what we describe here) and generate a harness for each of these.
Include reasoning about code coverage.
Include reasoning about existing harnesses.
More information about the types used in the code.
Conversion of tests into fuzz harnesses.

Several of these are already integrated into OSS-Fuzz-gen. For example, OSS-Fuzz-gen recently added support for a “from scratch” workflow here. OSS-Fuzz-gen even supports a command-line utility for generating harnesses, which makes it very easy to deploy, and develop, auto-harnessing.

The CLI offered by OSS-Fuzz-gen can be used to also suggest good target functions for fuzzing. In our example we had to provide the function name, but an even more autonomous approach would be able to do this as well. This can be achieved by combining our tool above with techniques described in our earlier blog post on fuzz introspection in Python here OSS-Fuzz-gen, however, already supports this and we can use their tooling against the same codebase as above to extract a set of good harnesses rather than for a single function, using the following commands:

python3 -m virtualenv .venv
. .venv/bin/activate

git clone https//github.com/google/oss-fuzz-gen
cd oss-fuzz-gen
python3 -m pip install .

cd /some/random/path
git clone https://gitlab.com/codesun/jakens
fuzz-generator -l c++ -m ${MODEL} -t jakens/ --far-reach -o generated-harnesses

# Find and print all generated harnesses
find ./generated-harnesses -name "*.rawoutput" -exec cat {} \;

To summarise, in this blog post we showed how to use program analysis to construct a capability for auto generation of fuzzing harnesses by way of LLMs. We showed how this can be leveraged against an open source code base, and also discussed possible extensions and current state of the art.

Program analysis tooling leveraging LLMs for code reasoning and synthesis is here to stay and we pride ourselves for being at the forefront of this. If you have needs or ideas for tool development then please contact us and we would be happy to offer our help.