
Add multicore `load_initial_inputs` (for `Launcher`, etc.)

addisoncrump opened this issue · 7 comments

Right now, each of the executors loads the corpus in parallel at the start of the run, and the "new entries" discovered during corpus loading are reported across all nodes. Instances should either not report new finds during corpus loading, or one instance should load all the corpus entries and share them with the others.

An example of the effects of this behaviour: [chart: corpus size over time]

Significant jumps indicate the first status report of each node, which had been previously processing "new entry" reports from each peer and effectively stalled.

addisoncrump avatar Dec 20 '23 18:12 addisoncrump

I've also had success dividing the corpus across cores...

        if state.must_load_initial_inputs() {
            for input_dir in &opt.input {
                // Collect the top-level entries of this input directory.
                let sample_list: Vec<_> = input_dir
                    .read_dir()
                    .expect("Couldn't read input dir")
                    .collect();
                // Ceiling division so every sample lands in exactly one chunk.
                let chunk_size =
                    (sample_list.len() as f64 / cores.ids.len() as f64).ceil() as usize;
                println!("samples: {} chunk_size: {}", sample_list.len(), chunk_size);
                // Each core loads only its own chunk of the sample list.
                // (Assumes core ids are contiguous, starting at the first id.)
                state
                    .load_initial_inputs_by_filenames(
                        &mut fuzzer,
                        &mut executor,
                        &mut restarting_mgr,
                        &sample_list
                            .iter()
                            .skip(chunk_size * (_core_id.0 - cores.ids.first().unwrap().0))
                            .take(chunk_size)
                            .map(|x| x.as_ref().unwrap().path())
                            .collect::<Vec<_>>(),
                    )
                    .expect("Failed to load initial inputs");
            }
            println!("We imported {} inputs from disk.", state.corpus().count());
        }

s1341 avatar Dec 21 '23 06:12 s1341

I've done similar chunking in the past. Maybe we want a special `load_initial_inputs` function that can take the current id and the total count of nodes(?)
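The chunking logic of such a helper could be sketched like this (`chunk_for_node` is a hypothetical name, not an existing LibAFL API; it assumes a flat list of paths has already been gathered):

```rust
use std::path::PathBuf;

/// Return the slice of `paths` that node `node_idx` (out of `node_count`
/// nodes) should load. Ceiling division guarantees the whole corpus is
/// covered; trailing nodes may receive a smaller (or empty) chunk.
fn chunk_for_node(paths: &[PathBuf], node_idx: usize, node_count: usize) -> Vec<PathBuf> {
    let chunk_size = paths.len().div_ceil(node_count);
    paths
        .iter()
        .skip(chunk_size * node_idx)
        .take(chunk_size)
        .cloned()
        .collect()
}
```

Every node would then call its existing `load_initial_inputs_by_filenames` (or equivalent) on just its own chunk.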

domenukk avatar Dec 21 '23 15:12 domenukk

Okay; I will update the associated fuzzer examples in the coming weeks.

addisoncrump avatar Dec 30 '23 15:12 addisoncrump

Hello! I'd like to work on this issue, but I'm not sure chunking would be a good approach: the current implementation for corpus loading also descends into nested directories, which makes it impossible to determine the size of the corpus up front and chunk appropriately.

The example shared above loads all the paths in memory, so it's kinda unnecessary to have an iterator in next_file as we're loading all paths into a vector anyways before load_initial_inputs.

Maybe waiting for one executor to load the whole corpus is too slow so perhaps it makes sense to drop the iterator implementation altogether and simplify the code?

Also, why not make the "chunked" function default in case of multi-core fuzzing?

R9295 avatar Feb 28 '24 19:02 R9295

The example above is merely an example; it's not ideal. For instance, it may fail when one of the inputs crashes the target or times out. That being said, we can probably simplify some things here and there... Feel free to play around with it :)

Making the chunked function the default may be rather hard, since the calls to corpus loading are part of the fuzzer, which is provided by the user (but we should totally change the examples once we have something new).

domenukk avatar Feb 29 '24 03:02 domenukk

The reason that we switched to an iterator implementation was that, for large corpora, the memory consumed by the paths alone was causing issues. For example, I have several targets for which I pull in a corpus from a webcrawler, then cmin with LibAFL. Loading in all the paths naïvely was more memory than my system had available!

If we want to use chunking by loading lists of files, then we'll either need to model file trees more efficiently (e.g. by actually using a tree structure) and split them, or walk the directories multiple times.
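The second option could be sketched as a two-pass walk (hypothetical helpers, not LibAFL APIs): pass one counts the files, pass two streams only this node's index range, so memory stays proportional to directory depth rather than corpus size:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Walk `dir` depth-first in sorted order, calling `f` on every regular file.
/// Sorting makes the visit order deterministic across nodes, so every node
/// agrees on which index range belongs to whom.
fn walk_files(dir: &Path, f: &mut dyn FnMut(&Path)) -> io::Result<()> {
    let mut entries: Vec<_> = fs::read_dir(dir)?.collect::<Result<_, io::Error>>()?;
    entries.sort_by_key(|e| e.path());
    for entry in entries {
        let path = entry.path();
        if path.is_dir() {
            walk_files(&path, f)?;
        } else {
            f(&path);
        }
    }
    Ok(())
}

/// Pass 1: count files. Pass 2: hand only this node's chunk to `load`,
/// without ever materialising the full path list.
fn load_chunk_for_node(
    dir: &Path,
    node_idx: usize,
    node_count: usize,
    load: &mut dyn FnMut(&Path),
) -> io::Result<()> {
    let mut total = 0usize;
    walk_files(dir, &mut |_| total += 1)?;
    let chunk = total.div_ceil(node_count);
    let (start, end) = (chunk * node_idx, chunk * (node_idx + 1));
    let mut i = 0usize;
    walk_files(dir, &mut |p| {
        if i >= start && i < end {
            load(p);
        }
        i += 1;
    })
}
```

The corpus directory is read twice, but for large corpora the extra directory traversal is usually much cheaper than holding every path in memory.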

> Also, why not make the "chunked" function default in case of multi-core fuzzing?

As @domenukk suggested, this is because the fuzzer doesn't have awareness of the other instances -- which is intentional, since they should operate independently most of the time.

addisoncrump avatar Feb 29 '24 09:02 addisoncrump

Thanks @domenukk and @addisoncrump. I'm not sure how to approach this in a multicore fashion without coordination among executor instances, so I'll implement the initial suggestion, where one executor loads all entries and shares them with the rest.

R9295 avatar Feb 29 '24 12:02 R9295