From 8193d9328e1a71c87e7102a77538eeecb1a52f66 Mon Sep 17 00:00:00 2001
From: Kostya Serebryany
Date: Wed, 2 Sep 2015 22:44:46 +0000
Subject: [PATCH] [libFuzzer] remove the section about '-tokens' as I am going to deprecate this option

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246728 91177308-0d34-0410-b5e6-96231b3b80d8
---
 docs/LibFuzzer.rst | 24 +-----------------------
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/docs/LibFuzzer.rst b/docs/LibFuzzer.rst
index ccf404df95a..dfcf06d7030 100644
--- a/docs/LibFuzzer.rst
+++ b/docs/LibFuzzer.rst
@@ -64,8 +64,6 @@ The most important flags are::
   save_minimized_corpus 0 If 1, the minimized corpus is saved into the first input directory
   jobs 0 Number of jobs to run. If jobs >= 1 we spawn this number of jobs in separate worker processes with stdout/stderr redirected to fuzz-JOB.log.
   workers 0 Number of simultaneous worker processes to run the jobs. If zero, "min(jobs,NumberOfCpuCores()/2)" is used.
-  tokens 0 Use the file with tokens (one token per line) to fuzz a token based input language.
-  apply_tokens 0 Read the given input file, substitute bytes with tokens and write the result to stdout.
   sync_command 0 Execute an external command " " to synchronize the test corpus.
   sync_timeout 600 Minimum timeout between syncs.
   use_traces 0 Experimental: use instruction traces
@@ -258,24 +256,6 @@ Voila::
 Advanced features
 =================
 
-Tokens
-------
-
-By default, the fuzzer is not aware of complexities of the input language
-and when fuzzing e.g. a C++ parser it will mostly stress the lexer.
-It is very hard for the fuzzer to come up with something like ``reinterpret_cast``
-from a test corpus that doesn't have it.
-See a detailed discussion of this topic at
-http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html.
-
-lib/Fuzzer implements a simple technique that allows to fuzz input languages with
-long tokens. All you need is to prepare a text file containing up to 253 tokens, one token per line,
-and pass it to the fuzzer as ``-tokens=TOKENS_FILE.txt``.
-Three implicit tokens are added: ``" "``, ``"\t"``, and ``"\n"``.
-The fuzzer itself will still be mutating a string of bytes
-but before passing this input to the target library it will replace every byte ``b`` with the ``b``-th token.
-If there are less than ``b`` tokens, a space will be added instead.
-
 Data-flow-guided fuzzing
 ------------------------
 
@@ -341,8 +321,7 @@ Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23052
 clang-fuzzer
 ------------
 
-The default behavior is very similar to ``clang-format-fuzzer``.
-Clang can also be fuzzed with Tokens_ using ``-tokens=$LLVM/lib/Fuzzer/cxx_fuzzer_tokens.txt`` option.
+The behavior is very similar to ``clang-format-fuzzer``.
 
 Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057
 
@@ -366,7 +345,6 @@ The corpuses are stored in git on github and can be used like this::
   git clone https://github.com/kcc/fuzzing-with-sanitizers.git
   bin/clang-format-fuzzer fuzzing-with-sanitizers/llvm/clang-format/C1
   bin/clang-fuzzer fuzzing-with-sanitizers/llvm/clang/C1/
-  bin/clang-fuzzer fuzzing-with-sanitizers/llvm/clang/TOK1 -tokens=$LLVM/llvm/lib/Fuzzer/cxx_fuzzer_tokens.txt
   bin/llvm-as-fuzzer fuzzing-with-sanitizers/llvm/llvm-as/C1 -only_ascii=1
 
-- 
2.34.1