================================
Fuzzing LLVM libraries and tools
================================

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`.

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__
and `on OSS Fuzz's tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html

__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-itanium-demangle-fuzzer
----------------------------

A |generic fuzzer| for the Itanium demangler used in various LLVM tools. We've
fuzzed ``__cxa_demangle`` to death, so why not fuzz LLVM's own implementation
of it?

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
from the binary name using "--". The valid options are architecture names
(``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
keywords, like ``gisel`` for enabling global instruction selection. In this
mode, the same example could be run like so:

.. code-block:: shell

   % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
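
To make the naming scheme concrete, here is a minimal sketch of how options
could be recovered from a binary name of the form ``name--x-y-z``. The helper
``optionsFromExecName`` is hypothetical and written only for illustration; it
is not the code the in-tree fuzzers use, and it does not validate the options
it finds.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical helper (an assumption, not LLVM's implementation): returns
   // the options encoded after the first "--" in an executable name.
   std::vector<std::string> optionsFromExecName(const std::string &ExecName) {
     std::vector<std::string> Options;

     // Drop any leading directory components such as "bin/".
     std::size_t Slash = ExecName.find_last_of('/');
     std::string Base =
         Slash == std::string::npos ? ExecName : ExecName.substr(Slash + 1);

     // Everything after the first "--" is the option list.
     std::size_t Sep = Base.find("--");
     if (Sep == std::string::npos)
       return Options;
     std::string Rest = Base.substr(Sep + 2);

     // Split the option list on single dashes, skipping empty pieces.
     std::size_t Start = 0;
     while (Start <= Rest.size()) {
       std::size_t End = Rest.find('-', Start);
       if (End == std::string::npos)
         End = Rest.size();
       if (End > Start)
         Options.push_back(Rest.substr(Start, End - Start));
       Start = End + 1;
     }
     return Options;
   }

With this sketch, ``bin/llvm-isel-fuzzer--aarch64-O0-gisel`` yields the three
options ``aarch64``, ``O0``, and ``gisel``, while a plain
``bin/llvm-isel-fuzzer`` yields none.
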

llvm-opt-fuzzer
---------------

A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.

It receives an optimization pipeline and runs it on each fuzzer input.

The interface of this fuzzer almost directly mirrors ``llvm-isel-fuzzer``. Both
the ``mtriple`` and ``passes`` arguments are required. Passes are specified in
a format suitable for the new pass manager.

.. code-block:: shell

   % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine

As with ``llvm-isel-fuzzer``, the arguments for some predefined configurations
can be embedded directly in the binary file name:

.. code-block:: shell

   % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

   llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.

.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`

Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
LibFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good for stressing the surface layers of a
program, and is good at testing things like lexers, parsers, or binary
protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
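
As an illustration of what a generic fuzz target looks like, the sketch below
feeds the raw bytes from LibFuzzer straight into the code under test.
``LLVMFuzzerTestOneInput`` is LibFuzzer's entry point; ``parensBalanced`` is a
toy stand-in invented for this example and does not correspond to any in-tree
fuzzer.

.. code-block:: c++

   #include <stddef.h>
   #include <stdint.h>

   // Toy stand-in for the code under test (hypothetical): returns true if
   // every '(' in the input is closed by a matching ')'.
   bool parensBalanced(const uint8_t *Data, size_t Size) {
     long Depth = 0;
     for (size_t I = 0; I < Size; ++I) {
       if (Data[I] == '(')
         ++Depth;
       else if (Data[I] == ')' && --Depth < 0)
         return false; // Closed more than was opened.
     }
     return Depth == 0;
   }

   // LibFuzzer calls this entry point repeatedly with mutated inputs. The
   // bytes are passed through unchanged, as a "bag of bits".
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     parensBalanced(Data, Size);
     return 0;
   }

A target like this is typically compiled with something like
``clang++ -fsanitize=fuzzer,address``, so that crashes and sanitizer reports
in the code under test are what surface as findings.
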

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.
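
The idea can be sketched without protobuf at all. In the example below, plain
structs stand in for the protobuf class, and ``toSource`` plays the role of
the translation step. All of the names (``Assignment``, ``Function``,
``toSource``) and the tiny language subset are assumptions made for
illustration, not the definitions used by the in-tree fuzzer.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical structured description of a tiny language subset: a
   // function body made of constant stores into an array.
   struct Assignment {
     unsigned Var; // Index of the array element being assigned.
     int Value;    // Constant being stored.
   };

   struct Function {
     std::vector<Assignment> Body;
   };

   // Converts the structured description into C++ source text. Whatever
   // values a mutator puts in the fields, the output stays syntactically
   // valid, so the compiler gets past the parser.
   std::string toSource(const Function &F) {
     std::string Src = "void foo(int *a) {\n";
     for (const Assignment &A : F.Body)
       Src += "  a[" + std::to_string(A.Var % 4) + "] = " +
              std::to_string(A.Value) + ";\n";
     Src += "}\n";
     return Src;
   }

A mutator then changes the fields of ``Function`` rather than the bytes of the
source text, which is what keeps every generated program well formed.
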

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
``llvm-isel-fuzzer``.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to find bugs faster. The most common way to build the fuzzers is
by adding the following two flags to your CMake invocation:
``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it didn't have a very good way to report problems in an
actionable way. Because of this, we're moving towards using `OSS Fuzz`_
instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.
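
To show what a standalone main function amounts to, here is a minimal sketch
of a driver that runs a fuzz target once on each input file instead of relying
on libFuzzer to supply inputs. ``runOnFiles`` and ``FuzzerTestFn`` are
hypothetical names chosen for this example; they are not the helpers declared
in ``FuzzerCLI.h``.

.. code-block:: c++

   #include <stddef.h>
   #include <stdint.h>

   #include <fstream>
   #include <iterator>
   #include <string>
   #include <vector>

   // Same shape as libFuzzer's LLVMFuzzerTestOneInput entry point.
   using FuzzerTestFn = int (*)(const uint8_t *Data, size_t Size);

   // Hypothetical standalone driver (an assumption, not the FuzzerCLI.h API):
   // runs TestOne once on the contents of each file and returns how many
   // files could be read.
   size_t runOnFiles(const std::vector<std::string> &Paths,
                     FuzzerTestFn TestOne) {
     size_t NumRun = 0;
     for (const std::string &Path : Paths) {
       std::ifstream In(Path, std::ios::binary);
       if (!In)
         continue; // Skip unreadable inputs.
       std::vector<char> Bytes((std::istreambuf_iterator<char>(In)),
                               std::istreambuf_iterator<char>());
       TestOne(reinterpret_cast<const uint8_t *>(Bytes.data()), Bytes.size());
       ++NumRun;
     }
     return NumRun;
   }

A ``main`` built on this sketch would collect the file names from ``argv`` and
pass them along with ``LLVMFuzzerTestOneInput``, which is enough to replay a
crashing input without linking against libFuzzer.
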

There is also some handling of the CMake config for fuzzers, where you should
use ``add_llvm_fuzzer`` to set up fuzzer targets. This function works similarly
to functions such as ``add_llvm_tool``, but it takes care of linking to
LibFuzzer when appropriate and can be passed the ``DUMMY_MAIN`` argument to
enable standalone testing.