Formal Methods for Binary Analysis: Tools and Workflows

Part 1 covered the foundational concepts: symbolic execution, SMT solving, and abstract interpretation. Here I’ll walk through the tools that implement these techniques and how to use them effectively. These are the tools I reach for regularly in my own work. Prerequisites: Familiarity with Python, basic reverse engineering concepts, and ideally a disassembler like Ghidra, IDA, or Binary Ninja. Code examples assume angr and Z3 are installed (pip install angr z3-solver).

Concolic Execution: The Practical Paradigm

Pure symbolic execution has a problem: path explosion. Every branch potentially doubles the paths to explore, and real programs have many branches. A loop with N iterations and one branch per iteration creates up to 2^N paths. Concolic execution (concrete + symbolic) is the dominant practical solution. It runs the program with concrete inputs while simultaneously building symbolic constraints. When a path completes, the solver negates one branch constraint to generate inputs for a different path. The key advantage: concolic execution always makes progress. Pure symbolic execution can get stuck on complex constraints or unknown functions. Concolic execution falls back to concrete values when symbolic reasoning fails, then resumes symbolic tracking when possible. Modern binary analysis tools often use hybrid approaches, combining symbolic reasoning with concrete execution fallbacks. Pure symbolic execution remains viable for many targeted analyses, but hybrid approaches handle real-world complexity better.

The Tool Ecosystem

angr

angr is the most comprehensive Python framework for binary analysis. It combines symbolic execution, static analysis, and emulation across multiple architectures. angr lifts binary code to VEX, an intermediate representation from Valgrind. This IR abstraction lets the same analysis code work on x86, ARM, MIPS, and other architectures. The constraint solving frontend (Claripy) integrates with Z3 for constraint solving. angr has a steep learning curve but offers unmatched flexibility. It’s the right choice for complex analysis tasks that require customization.

Z3

Z3 from Microsoft Research is the dominant SMT solver in security research. Even when not using symbolic execution, you can model constraints directly in Z3 to solve reverse engineering puzzles. Z3’s Python bindings make experimentation easy:

from z3 import *
x = BitVec('x', 32)
s = Solver()
s.add(x ^ 0x5A == 0x1337)
s.check()  # sat
s.model()  # [x = 4973]

For quick constraint problems like license key checks, hash collisions, and encoding puzzles, Z3 alone may be all you need. If you’re already in an angr workflow, claripy provides the same solving capability:

import claripy
x = claripy.BVS('x', 32)
s = claripy.Solver()
s.add(x ^ 0x5A == 0x1337)
print(s.eval(x, 1))  # (4973,)

Claripy uses Z3 under the hood but integrates seamlessly with angr’s symbolic state.

Triton

Triton performs dynamic symbolic execution through binary instrumentation. While angr emulates execution internally, Triton can instrument actual program execution, hooking into runtime to build constraints on-the-fly. Triton excels at analyzing specific execution traces and offers a simpler API than angr. The tradeoff is that you typically need either a running target or captured execution traces.

Manticore

Manticore from Trail of Bits provides symbolic execution with particularly strong support for Ethereum smart contracts alongside traditional binaries. Note: Manticore development has slowed significantly since 2022. For new projects, verify the current maintenance status before building on it.

Ghidra and Binary Ninja

Modern disassemblers like Ghidra and Binary Ninja provide intermediate representations (P-Code and BNIL respectively) that normalize complex instruction semantics. While primarily used for manual reverse engineering, these IRs can serve as foundations for custom analysis tools.

Practical Workflow

A typical workflow combining manual and formal analysis:

1. Triage in Ghidra/IDA/Binary Ninja

Identify your target: a specific function, a license check, an input parser. Understand calling conventions, note interesting comparisons or branches, map out the control flow you care about. Don’t try to symbolically execute entire programs. Pick focused targets where formal methods will help.

2. Set Up Symbolic State

In angr (or your tool of choice), create a state at your entry point. This is often the hardest part. Make concrete what you know: buffer sizes, structure layouts, global state, return addresses for functions you’ll hook. Make symbolic what you don’t: user input, file contents, network data, whatever you’re trying to solve for.

import angr
import claripy

# Disable auto-loading libraries to avoid path explosion
proj = angr.Project('./binary', auto_load_libs=False)

# Create symbolic argument (100 bytes)
argv1 = claripy.BVS('argv1', 8 * 100)

# Create state with symbolic argv attached
state = proj.factory.entry_state(args=['./binary', argv1])

3. Define Success and Failure

Tell the engine what “interesting” means:

Find addresses: reach the “success” branch, the vulnerable function, the decryption routine
Avoid addresses: error handlers, exit calls, paths you’ve already analyzed

simgr = proj.factory.simulation_manager(state)
simgr.explore(find=0x401234, avoid=0x401300)

# Extract the solution if found
if simgr.found:
    found_state = simgr.found[0]
    solution = found_state.solver.eval(argv1, cast_to=bytes)
    print(f"Solution: {solution}")

4. Constrain the Search

Path explosion is your enemy. Aggressively prune the search space:

Hook library calls to return symbolic but constrained values rather than exploring libc internals
Bound loops when you know iteration counts
Concretize values you don’t care about solving for

5. Interpret Results

Solver outputs need interpretation. A “valid” input may require impossible preconditions. Always validate findings against the actual binary. Run the concrete input and verify it produces the expected behavior.

Limitations

Formal methods aren’t magic. Understanding their limitations helps you use them effectively.

Path Explosion

Even with concolic execution and pruning, complex programs can overwhelm analysis. Techniques like state merging, loop summarization, and directed search help, but some programs remain intractable. What to do: Focus on small, targeted analyses. Analyze one function, not the whole program.

Environment Modeling

Programs interact with operating systems, networks, and files. Symbolic execution needs models of these interactions. Imperfect models lead to imperfect analysis: false paths explored, real paths missed. What to do: Hook external calls and provide appropriate symbolic models. angr includes SimProcedures for common library functions.

Constraint Complexity

Even when path explosion is managed, individual constraints can become extremely complex. Cryptographic code is notorious for generating solver timeouts. What to do: Concretize values where possible. If you’re not solving for the encryption key, don’t make it symbolic.

Indirect Control Flow

Computed jump targets (jump tables, virtual dispatch, function pointers) are challenging. The symbolic engine may not know what addresses are reachable. What to do: Provide hints about valid targets, or use static analysis to resolve indirect jumps before symbolic execution.

Heap Modeling

Complex heap states are difficult to model symbolically. Use-after-free bugs, heap overflows with specific allocation patterns, and heap grooming attacks often require concrete heap layouts. What to do: For heap-focused analysis, fuzzing often works better than symbolic execution.

Floating Point

SMT solvers historically struggled with floating-point constraints. Z3 now supports the theory of floating point, but it remains a performance bottleneck. Code heavy in floating-point operations may cause solver timeouts. What to do: Concretize floating-point values when possible, or use fuzzing for floating-point-heavy code.

When to Use Fuzzing Instead

Coverage-guided fuzzing (AFL++, libFuzzer, honggfuzz) often outperforms symbolic execution for finding bugs in real software. Fuzzers are fast and require minimal setup, but they struggle with complex constraints, exactly where symbolic execution shines. In my experience, the best results come from combining both. Driller, for example, uses fuzzing for broad exploration and invokes symbolic execution when fuzzing gets stuck on tight constraints. The fuzzer covers easy paths quickly; the solver handles the hard ones. For comprehensive vulnerability discovery, don’t choose between fuzzing and formal methods. Use both.

Building Your Toolkit

Start with Z3 for pure constraint problems. It’s immediately useful for CTF challenges and simple RE puzzles. Graduate to angr when you need to analyze actual binaries. Add fuzzing (AFL++, libFuzzer) for vulnerability discovery. The tools work best in combination. Manual analysis identifies targets; formal methods solve specific problems; fuzzing provides coverage. No single tool does everything. In Part 3, we’ll explore security research applications: vulnerability discovery, exploit development, patch analysis, and more.

Overview

TLA+

Alloy

AD Tier Model Specifications

Binary Analysis Series

Malware Analysis

Aviation Systems Series

Tools

Formal Methods for Binary Analysis: Tools and Workflows

Concolic Execution: The Practical Paradigm

The Tool Ecosystem

angr

Z3

Triton

Manticore

Ghidra and Binary Ninja

Practical Workflow

1. Triage in Ghidra/IDA/Binary Ninja

2. Set Up Symbolic State

3. Define Success and Failure

4. Constrain the Search

5. Interpret Results

Limitations

Path Explosion

Environment Modeling

Constraint Complexity

Indirect Control Flow

Heap Modeling

Floating Point

When to Use Fuzzing Instead

Building Your Toolkit

Overview

TLA+

Alloy

AD Tier Model Specifications

Binary Analysis Series

Malware Analysis

Aviation Systems Series

Tools

​Concolic Execution: The Practical Paradigm

​The Tool Ecosystem

​angr

​Z3

​Triton

​Manticore

​Ghidra and Binary Ninja

​Practical Workflow

​1. Triage in Ghidra/IDA/Binary Ninja

​2. Set Up Symbolic State

​3. Define Success and Failure

​4. Constrain the Search

​5. Interpret Results

​Limitations

​Path Explosion

​Environment Modeling

​Constraint Complexity

​Indirect Control Flow

​Heap Modeling

​Floating Point

​When to Use Fuzzing Instead

​Building Your Toolkit

Concolic Execution: The Practical Paradigm

The Tool Ecosystem

angr

Z3

Triton

Manticore

Ghidra and Binary Ninja

Practical Workflow

1. Triage in Ghidra/IDA/Binary Ninja

2. Set Up Symbolic State

3. Define Success and Failure

4. Constrain the Search

5. Interpret Results

Limitations

Path Explosion

Environment Modeling

Constraint Complexity

Indirect Control Flow

Heap Modeling

Floating Point

When to Use Fuzzing Instead

Building Your Toolkit