Online Course: LLVM for Security Engineering and Program Analysis

Abstract

LLVM is a modular compiler infrastructure that is used to write security-aware compilers, sophisticated fuzzers, large-scale vulnerability discovery tools, symbolic executors and much more. This course introduces LLVM with a focus on security tool development.

The course is built around two core parts. First, it covers the internals of LLVM and how to write applications that use LLVM to solve program analysis problems. Second, it introduces several state-of-the-art security tools that use LLVM. We will study these tools from a black-box perspective, by applying them to software, and from the inside, by reading and extending their source code.

This course is not for beginners. To take it, you should be comfortable programming in C/C++, including working with large code bases, and reading assembly language.

Course Objectives

Learning Objectives

Understanding LLVM internals, including the LLVM Intermediate Representation.

Writing program analysis and program transformation tools to solve software security issues.

Understanding how LLVM is used by several state-of-the-art research tools.

Applying your knowledge pragmatically to develop your own tools.

Prerequisites

A good understanding of computer systems and assembly-level reasoning. The course is heavily focused on development, so exposure to C/C++ coding is a must. Experience with assembly language is assumed, but deeper knowledge of compiler internals is not required.

Who should attend?

Security researchers.

Security engineers.

Low-level engineers.

Compiler engineers.

Anyone else who needs to develop automated techniques to reason about and instrument assembly-level code.

Course Syllabus

The course is divided into the following main sections. Each section consists of videos, notes, exercises, assignments and other forms of interactive learning.

Course introduction

This section introduces the course and gives an overview of the topics covered.

LLVM: a modern compiler infrastructure

This section introduces the LLVM infrastructure from a high-level perspective and then goes into detail on how LLVM represents code. The goal is to provide the LLVM fundamentals needed to write complex program analysis tools.

Writing LLVM analysis passes

This section introduces the LLVM API and how to write LLVM passes. We will focus on writing analysis passes, namely tools that work by reading the code of the target under analysis. Throughout this section we will write a sophisticated code analysis tool that can aid in the vulnerability analysis process.
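
To make the idea of a read-only analysis concrete, here is a minimal sketch in plain C++. It does not use the LLVM API: `ToyInst`, `ToyFunction`, `findRiskyCalls` and `sampleModule` are hypothetical stand-ins invented for this illustration, and the list of "risky" callees is an assumption. The point is only the shape of an analysis: walk every function, inspect every instruction, and report findings without changing the code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for IR objects. These are NOT LLVM classes.
struct ToyInst {
    std::string opcode;  // e.g. "call", "add", "ret"
    std::string callee;  // only meaningful when opcode == "call"
};

struct ToyFunction {
    std::string name;
    std::vector<ToyInst> insts;
};

// Analysis: reads the module and reports which functions call which risky
// callees. The module is taken by const reference, so nothing is modified.
std::map<std::string, std::vector<std::string>>
findRiskyCalls(const std::vector<ToyFunction>& module,
               const std::set<std::string>& risky) {
    assert(!risky.empty() && "expected at least one callee to look for");
    std::map<std::string, std::vector<std::string>> report;
    for (const ToyFunction& fn : module) {
        for (const ToyInst& inst : fn.insts) {
            if (inst.opcode == "call" && risky.count(inst.callee) > 0) {
                report[fn.name].push_back(inst.callee);
            }
        }
    }
    return report;
}

// A small hypothetical module to run the analysis on.
std::vector<ToyFunction> sampleModule() {
    ToyFunction parse{"parse_input",
                      {ToyInst{"call", "strlen"}, ToyInst{"call", "strcpy"},
                       ToyInst{"call", "gets"}, ToyInst{"ret", ""}}};
    ToyFunction add{"add_numbers", {ToyInst{"add", ""}, ToyInst{"ret", ""}}};
    return {parse, add};
}
```

Calling `findRiskyCalls(sampleModule(), risky)` with a set containing `strcpy` and `gets` reports only `parse_input`. A real analysis pass follows the same nested traversal, but over the compiler's own module, function and instruction objects.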

Writing LLVM instrumentation passes

This section introduces the LLVM API and how to write LLVM instrumentation passes. The key attribute of instrumentation passes is that they add code to the program being compiled, and this added code is meant to have an effect in the final executable, e.g. performing run-time analysis.
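
The difference from an analysis can be sketched with a toy model in plain C++. Again, this is not the LLVM API: `ToyBlock`, `instrumentBlocks`, `sampleFunction` and the probe name `__trace_block` are hypothetical names chosen for illustration. The sketch mutates the program by inserting a probe at the top of every basic block, which is roughly how block-coverage instrumentation is usually described.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a basic block. This is NOT an LLVM class.
struct ToyBlock {
    std::string label;
    std::vector<std::string> insts;  // textual pseudo-instructions
};

// Instrumentation: takes the function by non-const reference and inserts a
// probe call as the first instruction of every block. The probe name
// __trace_block is hypothetical.
std::size_t instrumentBlocks(std::vector<ToyBlock>& function) {
    std::size_t nextId = 0;
    for (ToyBlock& block : function) {
        const std::string probe =
            "call __trace_block(" + std::to_string(nextId) + ")";
        block.insts.insert(block.insts.begin(), probe);
        assert(block.insts.front() == probe);  // probe precedes original code
        ++nextId;
    }
    return nextId;  // number of probes inserted
}

// A small hypothetical function with three basic blocks.
std::vector<ToyBlock> sampleFunction() {
    return {
        {"entry", {"cmp x, 0", "br then, exit"}},
        {"then", {"call do_work()", "br exit"}},
        {"exit", {"ret"}},
    };
}
```

After `instrumentBlocks` runs, every block starts with a probe carrying a unique id, so a run-time library implementing the probe could record which blocks executed.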

Obfuscating compiler with LLVM

In this section we are going to study the concept of obfuscating compilers. Obfuscating compilers transform the code you are compiling into gibberish that is difficult to reverse engineer. Several of the proposed obfuscation tools rely on LLVM, and in this section we will study the topic up close.
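
As a small taste of what such transformations look like, the sketch below hand-applies two commonly described obfuscation ideas at the source level: instruction substitution (replacing an addition with an equivalent but less obvious expression) and an opaque predicate (a condition that is always true but not obviously so). The function names are hypothetical, and an obfuscating compiler would apply rewrites like these automatically to the intermediate representation rather than to hand-written C++.

```cpp
#include <cassert>
#include <cstdint>

// The original, readable operation.
std::uint32_t plainAdd(std::uint32_t x, std::uint32_t y) {
    return x + y;
}

// Instruction substitution: x + y == (x ^ y) + 2 * (x & y) modulo 2^32.
// XOR adds the bits without carries, AND finds the carry bits, and doubling
// moves each carry into the position where it belongs.
std::uint32_t obfuscatedAdd(std::uint32_t x, std::uint32_t y) {
    const std::uint32_t sumWithoutCarry = x ^ y;
    const std::uint32_t carries = x & y;
    const std::uint32_t result =
        static_cast<std::uint32_t>(sumWithoutCarry + 2u * carries);
    // Debug-build check that the substitution preserved behaviour.
    assert(result == static_cast<std::uint32_t>(x + y));
    return result;
}

// Opaque predicate: x * (x + 1) is a product of two consecutive integers, so
// it is always even. Wraparound modulo 2^32 does not change the lowest bit.
bool opaquelyTrue(std::uint32_t x) {
    return ((x * (x + 1u)) & 1u) == 0u;
}

// Combines both ideas: the else branch is dead code that only serves to
// mislead a reader of the compiled output.
std::uint32_t guardedAdd(std::uint32_t x, std::uint32_t y) {
    if (opaquelyTrue(x)) {
        return obfuscatedAdd(x, y);
    }
    return x - y;  // never executed
}
```

`guardedAdd` computes exactly what `plainAdd` does, but its compiled form contains extra arithmetic and an extra branch that a reverse engineer has to reason away.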

LLVM-based symbolic execution

In this section we are going to study symbolic execution and how proposed techniques rely on LLVM. Symbolic execution is a technique for reasoning about a program's execution by describing it with mathematical expressions, and then solving these expressions to derive concrete inputs that reach a given state in the program's execution. We use symbolic execution to generate program inputs that explore all execution paths of a program.
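
The following toy sketch illustrates the idea on a two-branch program. It rests on strong simplifying assumptions: the path conditions are written down by hand rather than collected automatically, and a brute-force search over a small input range stands in for the constraint solver that a real symbolic executor would use. All names (`reachesBug`, `Path`, `collectPaths`, `solve`) are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Concrete program under test (a hypothetical example).
bool reachesBug(int x, int y) {
    if (x > 10) {
        if (x + y == 20) {
            return true;  // the program state we want an input for
        }
    }
    return false;
}

// A constraint over the two symbolic inputs x and y.
using Constraint = std::function<bool(int, int)>;

// One execution path, described by the branch conditions taken along it.
struct Path {
    std::string name;
    std::vector<Constraint> constraints;
};

// Path conditions for reachesBug, written by hand for this sketch. A real
// symbolic executor derives them by interpreting the program.
std::vector<Path> collectPaths() {
    const Constraint outer = [](int x, int) { return x > 10; };
    const Constraint notOuter = [](int x, int) { return !(x > 10); };
    const Constraint inner = [](int x, int y) { return x + y == 20; };
    const Constraint notInner = [](int x, int y) { return !(x + y == 20); };
    std::vector<Path> paths;
    paths.push_back(Path{"skip-outer", {notOuter}});
    paths.push_back(Path{"outer-only", {outer, notInner}});
    paths.push_back(Path{"bug", {outer, inner}});
    return paths;
}

// Stand-in solver: tries every (x, y) in [-limit, limit] and returns the
// first pair that satisfies all constraints of the path.
bool solve(const Path& path, int& xOut, int& yOut, int limit = 32) {
    assert(limit >= 0);
    for (int x = -limit; x <= limit; ++x) {
        for (int y = -limit; y <= limit; ++y) {
            bool satisfied = true;
            for (const Constraint& c : path.constraints) {
                if (!c(x, y)) {
                    satisfied = false;
                    break;
                }
            }
            if (satisfied) {
                xOut = x;
                yOut = y;
                return true;
            }
        }
    }
    return false;
}
```

Solving the constraints of the "bug" path yields a concrete input pair which, when passed to `reachesBug`, drives execution into the innermost branch. Solving the other two paths yields inputs that cover the remaining behaviours of the function.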

Lifting binary code to LLVM

In this section we will study the concept of converting binary executable code to LLVM bitcode.

Course conclusion

In this section we conclude the course and outline interesting avenues for further study.

You get

Lecture videos

Lecture notes

Hands-on exercises of varying difficulty

24/7 access to the platform and self-paced course

6-month subscription to the online training platform

Instructor support throughout the entire course

Course updates during subscription period