Local experiment issue : ERRO[0612] error waiting for container: unexpected EOF - [SOLVED]

Open Microsvuln opened this issue 4 years ago • 13 comments

Hi.

I have a problem running local experiments. I get the following error while building benchmarks after running this command:

(.venv) arash@fuzzbench-scale1:~/new/fuzzbench$ PYTHONPATH=. python3 experiment/run_experiment.py --experiment-config experiment-config.yaml --benchmarks bloaty_fuzz_target harfbuzz-1.3.2 libjpeg-turbo-07-2017 libpcap_fuzz_both libpng-1.2.56 libxml2-v2.9.2 --experiment-name $EXPERIMENT_NAME --fuzzers afl aflql

And the error log (summarized):

INFO:root:Building using (<function build_measurer at 0x7f41267d5430>): [('bloaty_fuzz_target',), ('harfbuzz-1.3.2',), ('libjpeg-turbo-07-2017',), ('libpcap_fuzz_both',), ('libpng-1.2.56',), ('libxml2-v2.9.2',)]
INFO:root:Building measurer for benchmark: bloaty_fuzz_target.
INFO:root:Building measurer for benchmark: harfbuzz-1.3.2.
INFO:root:Building measurer for benchmark: libjpeg-turbo-07-2017.
INFO:root:Building measurer for benchmark: libpcap_fuzz_both.
INFO:root:Building measurer for benchmark: libpng-1.2.56.
INFO:root:Building measurer for benchmark: libxml2-v2.9.2.
INFO:root:Done building measurer for benchmark: libpcap_fuzz_both.
INFO:root:Done building measurer for benchmark: bloaty_fuzz_target.
INFO:root:Done building measurer for benchmark: libpng-1.2.56.
INFO:root:Done building measurer for benchmark: libjpeg-turbo-07-2017.
INFO:root:Done building measurer for benchmark: harfbuzz-1.3.2.
INFO:root:Done building measurer for benchmark: libxml2-v2.9.2.
INFO:root:Build successes: [('bloaty_fuzz_target',), ('harfbuzz-1.3.2',), ('libjpeg-turbo-07-2017',), ('libpcap_fuzz_both',), ('libpng-1.2.56',), ('libxml2-v2.9.2',)]
INFO:root:Done building measurers.
INFO:root:Building all fuzzer benchmarks.
INFO:root:Building using (<function build_fuzzer_benchmark at 0x7f41267d5670>): [('afl', 'bloaty_fuzz_target'), ('afl', 'harfbuzz-1.3.2'), ('afl', 'libjpeg-turbo-07-2017'), ('afl', 'libpcap_fuzz_both'), ('afl', 'libpng-1.2.56'), ('afl', 'libxml2-v2.9.2'), ('aflql', 'bloaty_fuzz_target'), ('aflql', 'harfbuzz-1.3.2'), ('aflql', 'libjpeg-turbo-07-2017'), ('aflql', 'libpcap_fuzz_both'), ('aflql', 'libpng-1.2.56'), ('aflql', 'libxml2-v2.9.2')]
INFO:root:Building benchmark: bloaty_fuzz_target, fuzzer: afl.
INFO:root:Building benchmark: harfbuzz-1.3.2, fuzzer: afl.
INFO:root:Building benchmark: libjpeg-turbo-07-2017, fuzzer: afl.
INFO:root:Building benchmark: libpcap_fuzz_both, fuzzer: afl.
INFO:root:Building benchmark: libpng-1.2.56, fuzzer: afl.
INFO:root:Building benchmark: libxml2-v2.9.2, fuzzer: afl.
INFO:root:Building benchmark: bloaty_fuzz_target, fuzzer: aflql.
INFO:root:Building benchmark: harfbuzz-1.3.2, fuzzer: aflql.
INFO:root:Building benchmark: libjpeg-turbo-07-2017, fuzzer: aflql.
INFO:root:Building benchmark: libpcap_fuzz_both, fuzzer: aflql.
INFO:root:Building benchmark: libpng-1.2.56, fuzzer: aflql.
INFO:root:Building benchmark: libxml2-v2.9.2, fuzzer: aflql.
ERRO[0585] error waiting for container: unexpected EOF  
ERROR:root:Executed command: "docker run -ti --rm -v /var/run/docker.sock:/var/run/docker.sock -v /home/arash/test/experiment-data-zhest2:/home/arash/test/experiment-data-zhest2 -v /home/arash/test/report-data-zhest2:/home/arash/test/report-data-zhest2 -e INSTANCE_NAME=d-zhest2 -e EXPERIMENT=zhest2 -e SQL_DATABASE_URL=sqlite:////home/arash/test/experiment-data-zhest2/local.db?check_same_thread=False -e EXPERIMENT_FILESTORE=/home/arash/test/experiment-data-zhest2 -e REPORT_FILESTORE=/home/arash/test/report-data-zhest2 -e DOCKER_REGISTRY=gcr.io/fuzzbench -e LOCAL_EXPERIMENT=True --cap-add=SYS_PTRACE --cap-add=SYS_NICE --name=dispatcher-container gcr.io/fuzzbench/dispatcher-image /bin/bash -c rsync -r "${EXPERIMENT_FILESTORE}/${EXPERIMENT}/input/" ${WORK} && mkdir ${WORK}/src && tar -xvzf ${WORK}/src.tar.gz -C ${WORK}/src && PYTHONPATH=${WORK}/src python3 ${WORK}/src/experiment/dispatcher.py || /bin/bash" returned: 125.
Traceback (most recent call last):
  File "experiment/run_experiment.py", line 524, in <module>
    sys.exit(main())
  File "experiment/run_experiment.py", line 512, in main
    start_experiment(args.experiment_name,
  File "experiment/run_experiment.py", line 237, in start_experiment
    start_dispatcher(config, CONFIG_DIR)
  File "experiment/run_experiment.py", line 246, in start_dispatcher
    dispatcher.start()
  File "experiment/run_experiment.py", line 396, in start
    return new_process.execute(command, write_to_stdout=True)
  File "/home/arash/new/fuzzbench/common/new_process.py", line 124, in execute
    raise subprocess.CalledProcessError(retcode, command)
subprocess.CalledProcessError: Command '['docker', 'run', '-ti', '--rm', '-v', '/var/run/docker.sock:/var/run/docker.sock', '-v', '/home/arash/test/experiment-data-zhest2:/home/arash/test/experiment-data-zhest2', '-v', '/home/arash/test/report-data-zhest2:/home/arash/test/report-data-zhest2', '-e', 'INSTANCE_NAME=d-zhest2', '-e', 'EXPERIMENT=zhest2', '-e', 'SQL_DATABASE_URL=sqlite:////home/arash/test/experiment-data-zhest2/local.db?check_same_thread=False', '-e', 'EXPERIMENT_FILESTORE=/home/arash/test/experiment-data-zhest2', '-e', 'REPORT_FILESTORE=/home/arash/test/report-data-zhest2', '-e', 'DOCKER_REGISTRY=gcr.io/fuzzbench', '-e', 'LOCAL_EXPERIMENT=True', '--cap-add=SYS_PTRACE', '--cap-add=SYS_NICE', '--name=dispatcher-container', 'gcr.io/fuzzbench/dispatcher-image', '/bin/bash', '-c', 'rsync -r "${EXPERIMENT_FILESTORE}/${EXPERIMENT}/input/" ${WORK} && mkdir ${WORK}/src && tar -xvzf ${WORK}/src.tar.gz -C ${WORK}/src && PYTHONPATH=${WORK}/src python3 ${WORK}/src/experiment/dispatcher.py || /bin/bash']' returned non-zero exit status 125.

I don't know why this is happening. This experiment uses only 2 fuzzers and 6 benchmarks, and I want to run many more fuzzers and benchmarks.

Any solution to this?

Thanks.

Microsvuln avatar Nov 09 '20 06:11 Microsvuln

What are the specs of your machine? How many cores and how much RAM? I think this issue is https://github.com/moby/moby/issues/36324. If this is the case, adding some resource control to local experiments should help.

jonathanmetzman avatar Nov 09 '20 14:11 jonathanmetzman

@jonathanmetzman

96 CPU cores and 64 GB of RAM.

I had no problem with this machine before, so I don't know whether fuzzbench updates affect this.

Will try this and let you know about it.

Microsvuln avatar Nov 09 '20 15:11 Microsvuln

Hmmm... Maybe not then. One possible issue with builds on a single machine could be that each docker build uses as many cores as possible (e.g. make -j $NPROC). Maybe we need to limit this?
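
One way to picture such a limit is to divide the available cores evenly across the builds that run at the same time and give each build that many jobs. The sketch below is only an illustration: jobs_per_build is a hypothetical helper, not part of fuzzbench, and how the value would reach make inside each docker build is left open.

```python
import os


def jobs_per_build(concurrent_builds, total_cores=None):
    """Return how many parallel jobs each build may use.

    Hypothetical helper, not part of fuzzbench. The cores are split evenly
    across the builds that run concurrently, with a minimum of one job.
    """
    if concurrent_builds < 1:
        raise ValueError("concurrent_builds must be at least 1")
    if total_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        total_cores = os.cpu_count() or 1
    return max(1, total_cores // concurrent_builds)


# 96 cores shared by the 12 fuzzer-benchmark builds from the original report.
print(jobs_per_build(12, total_cores=96))
```

With the 96 cores and 12 fuzzer-benchmark builds reported above, each build would get 8 jobs instead of 96.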

jonathanmetzman avatar Nov 09 '20 15:11 jonathanmetzman

@jonathanmetzman

Nope, I can confirm that I found how to reproduce this and how to solve this problem.

Description

A fuzzbench local experiment does not run successfully and fails with the ERRO[0585] error waiting for container: unexpected EOF error if the following condition is met:

You add your private fuzzer to the fuzzbench fuzzer directory BEFORE you install fuzzbench using the following commands:

$ make install-dependencies
$ source .venv/bin/activate
$ make presubmit

This assumes that the fuzzbench user has a customized/private fuzzer and wants to add it to fuzzbench and run a local experiment.

Solution

  1. Clone the fuzzbench GitHub repository.
  2. Make sure that you have not added anything to fuzzbench before installing it.
  3. Install fuzzbench as described in the documentation.
  4. After fuzzbench is installed successfully, add your fuzzer directory to the fuzzer directory of fuzzbench and check it using the following commands:
$ make format
$ make presubmit

Now the experiment will run successfully.

P.S. 1. Please make sure that your fuzzer is added to fuzzbench AFTER the fuzzbench installation process.

P.S. 2. I managed to solve most of the local experiment issues I faced. If you like, I can open a troubleshooting section in the fuzzbench documentation and add them along with their solutions, so that fuzzbench users can find a solution more quickly.

Problem solved, issue closed.

Thanks.

Microsvuln avatar Nov 09 '20 21:11 Microsvuln

I need to understand what happened here, so reopening.

inferno-chromium avatar Nov 09 '20 21:11 inferno-chromium

I can see the same issue on my side with this commit (ed0dba78f150373955e835866ed8c84d7669dae1) when I use the run script to start the experiment. However, the command "make-debug--" runs fine.

ERRO[0591] error waiting for container: unexpected EOF
ERROR:root:Executed command: "docker run -ti --rm -v /var/run/docker.sock:/var/run/docker.sock -v /home/cju/experiment-data:/home/cju/experiment-data -v /home/cju/report-data:/home/cju/report-data -e INSTANCE_NAME=d-kirenenko -e EXPERIMENT=kirenenko -e SQL_DATABASE_URL=
sqlite:////home/cju/experiment-data/local.db?check_same_thread=False -e EXPERIMENT_FILESTORE=/home/cju/experiment-data -e REPORT_FILESTORE=/home/cju/report-data -e DOCKER_REGISTRY=gcr.io/fuzzbench -e LOCAL_EXPERIMENT=True --cap-add=SYS_PTRACE --cap-add=SYS_NICE --name=d
ispatcher-container gcr.io/fuzzbench/dispatcher-image /bin/bash -c rsync -r "${EXPERIMENT_FILESTORE}/${EXPERIMENT}/input/" ${WORK} && mkdir ${WORK}/src && tar -xvzf ${WORK}/src.tar.gz -C ${WORK}/src && PYTHONPATH=${WORK}/src python3 ${WORK}/src/experiment/dispatcher.py
|| /bin/bash" returned: 125.
Traceback (most recent call last):
  File "experiment/run_experiment.py", line 524, in <module>
    sys.exit(main())
  File "experiment/run_experiment.py", line 512, in main
    start_experiment(args.experiment_name,
  File "experiment/run_experiment.py", line 237, in start_experiment
    start_dispatcher(config, CONFIG_DIR)
  File "experiment/run_experiment.py", line 246, in start_dispatcher
    dispatcher.start()
  File "experiment/run_experiment.py", line 396, in start
    return new_process.execute(command, write_to_stdout=True)
  File "/home/cju/tmp/fuzzbench/common/new_process.py", line 124, in execute
    raise subprocess.CalledProcessError(retcode, command)
subprocess.CalledProcessError: Command '['docker', 'run', '-ti', '--rm', '-v', '/var/run/docker.sock:/var/run/docker.sock', '-v', '/home/cju/experiment-data:/home/cju/experiment-data', '-v', '/home/cju/report-data:/home/cju/report-data', '-e', 'INSTANCE_NAME=d-kirenenko
', '-e', 'EXPERIMENT=kirenenko', '-e', 'SQL_DATABASE_URL=sqlite:////home/cju/experiment-data/local.db?check_same_thread=False', '-e', 'EXPERIMENT_FILESTORE=/home/cju/experiment-data', '-e', 'REPORT_FILESTORE=/home/cju/report-data', '-e', 'DOCKER_REGISTRY=gcr.io/fuzzbenc
h', '-e', 'LOCAL_EXPERIMENT=True', '--cap-add=SYS_PTRACE', '--cap-add=SYS_NICE', '--name=dispatcher-container', 'gcr.io/fuzzbench/dispatcher-image', '/bin/bash', '-c', 'rsync -r "${EXPERIMENT_FILESTORE}/${EXPERIMENT}/input/" ${WORK} && mkdir ${WORK}/src && tar -xvzf ${W
ORK}/src.tar.gz -C ${WORK}/src && PYTHONPATH=${WORK}/src python3 ${WORK}/src/experiment/dispatcher.py || /bin/bash']' returned non-zero exit status 125.

chenju2k6 avatar Jan 07 '21 18:01 chenju2k6

@chenju2k6 - can you please provide steps to reproduce?

inferno-chromium avatar Jan 07 '21 18:01 inferno-chromium

Sure. This error happens if I add customized new fuzzers: exp_fuzzer1, exp_fuzzer2. However, the unit test "make-exp_fuzzer1-xxx" runs fine. I am wondering if there are any other sanity checks we can run before we schedule an experiment?

cd fuzzbench
make install-dependencies
source .venv/bin/activate
./run.sh

run.sh

EXPERIMENT_NAME=exp_long
PYTHONPATH=. python3 experiment/run_experiment.py \
--experiment-config run.yaml \
--benchmarks libxml2-v2.9.2 libpng-1.2.56 libjpeg-turbo-07-2017 sqlite3_ossfuzz \
--experiment-name $EXPERIMENT_NAME \
--fuzzers honggfuzz weizz_qemu aflplusplus_optimal exp_fuzzer1 exp_fuzzer2

run.yaml

# The number of trials of a fuzzer-benchmark pair.
trials: 5

# The amount of time in seconds that each trial is run for.
# 1 day = 24 * 60 * 60 = 86400
max_total_time: 43200

# The location of the docker registry.
# FIXME: Support custom docker registry.
# See https://github.com/google/fuzzbench/issues/777
docker_registry: gcr.io/fuzzbench

# The local experiment folder that will store most of the experiment data.
# Please use an absolute path.
experiment_filestore: /home/cju/experiment-data

# The local report folder where HTML reports and summary data will be stored.
# Please use an absolute path.
report_filestore: /home/cju/report-data

# Flag that indicates this is a local experiment.
local_experiment: true

chenju2k6 avatar Jan 07 '21 18:01 chenju2k6

Any news on this? I am experiencing the problem and I am willing to help debug it. I have tried the procedure suggested above, but it did not help. What does help is reducing the number of benchmarks being targeted.

EliaGeretto avatar May 17 '21 17:05 EliaGeretto

I have the same problem.

ERRO[0413] error waiting for container: unexpected EOF  
ERROR:root:Executed command: "docker run -ti --rm -v /var/run/docker.sock:/var/run/docker.sock -v /tmp/experiment-data:/tmp/experiment-data -v /tmp/report-data:/tmp/report-data -e INSTANCE_NAME=d-test-local-fuzzbench -e EXPERIMENT=test-local-fuzzbench -e SQL_DATABASE_URL=sqlite:////tmp/experiment-data/local.db?check_same_thread=False -e EXPERIMENT_FILESTORE=/tmp/experiment-data -e REPORT_FILESTORE=/tmp/report-data -e DOCKER_REGISTRY=gcr.io/fuzzbench -e LOCAL_EXPERIMENT=True --cap-add=SYS_PTRACE --cap-add=SYS_NICE --name=dispatcher-container gcr.io/fuzzbench/dispatcher-image /bin/bash -c rsync -r "${EXPERIMENT_FILESTORE}/${EXPERIMENT}/input/" ${WORK} && mkdir ${WORK}/src && tar -xvzf ${WORK}/src.tar.gz -C ${WORK}/src && PYTHONPATH=${WORK}/src python3 ${WORK}/src/experiment/dispatcher.py || /bin/bash" returned: 125.
Traceback (most recent call last):
  File "experiment/run_experiment.py", line 553, in <module>
    sys.exit(main())
  File "experiment/run_experiment.py", line 540, in main
    start_experiment(args.experiment_name,
  File "experiment/run_experiment.py", line 242, in start_experiment
    return start_experiment_from_full_config(config)
  File "experiment/run_experiment.py", line 257, in start_experiment_from_full_config
    start_dispatcher(config, experiment_utils.CONFIG_DIR)
  File "experiment/run_experiment.py", line 266, in start_dispatcher
    dispatcher.start()
  File "experiment/run_experiment.py", line 414, in start
    return new_process.execute(command, write_to_stdout=True)
  File "/home/fedotoff/fuzzbench/common/new_process.py", line 124, in execute
    raise subprocess.CalledProcessError(retcode, command)
subprocess.CalledProcessError: Command '['docker', 'run', '-ti', '--rm', '-v', '/var/run/docker.sock:/var/run/docker.sock', '-v', '/tmp/experiment-data:/tmp/experiment-data', '-v', '/tmp/report-data:/tmp/report-data', '-e', 'INSTANCE_NAME=d-test-local-fuzzbench', '-e', 'EXPERIMENT=test-local-fuzzbench', '-e', 'SQL_DATABASE_URL=sqlite:////tmp/experiment-data/local.db?check_same_thread=False', '-e', 'EXPERIMENT_FILESTORE=/tmp/experiment-data', '-e', 'REPORT_FILESTORE=/tmp/report-data', '-e', 'DOCKER_REGISTRY=gcr.io/fuzzbench', '-e', 'LOCAL_EXPERIMENT=True', '--cap-add=SYS_PTRACE', '--cap-add=SYS_NICE', '--name=dispatcher-container', 'gcr.io/fuzzbench/dispatcher-image', '/bin/bash', '-c', 'rsync -r "${EXPERIMENT_FILESTORE}/${EXPERIMENT}/input/" ${WORK} && mkdir ${WORK}/src && tar -xvzf ${WORK}/src.tar.gz -C ${WORK}/src && PYTHONPATH=${WORK}/src python3 ${WORK}/src/experiment/dispatcher.py || /bin/bash']' returned non-zero exit status 125.

It depends on how many benchmarks and fuzzers I choose. I don't use custom fuzzers, only supported ones. For example, this command fails:

PYTHONPATH=. python3 experiment/run_experiment.py \
--experiment-config ../experiment-config.yaml \
--benchmarks lcms-2017-03-21 freetype2-2017 libpng-1.2.56 \
--experiment-name test-local-fuzzbench \
--fuzzers aflplusplus libfuzzer fuzzolic_aflplusplus_z3

But this command works fine (one benchmark removed):

PYTHONPATH=. python3 experiment/run_experiment.py \
--experiment-config ../experiment-config.yaml \
--benchmarks lcms-2017-03-21 freetype2-2017  \
--experiment-name test-local-fuzzbench \
--fuzzers aflplusplus libfuzzer fuzzolic_aflplusplus_z3

My machine has 128 CPUs and 500 GB of memory.
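
The two commands differ only in how many fuzzer-benchmark pairs have to be built. A small sketch that enumerates the pairs for both cases follows. The variable names are illustrative, and pairing every fuzzer with every benchmark mirrors the "Building using" log line in the original report.

```python
from itertools import product

# Fuzzers and benchmarks taken from the two commands above.
fuzzers = ["aflplusplus", "libfuzzer", "fuzzolic_aflplusplus_z3"]
failing_benchmarks = ["lcms-2017-03-21", "freetype2-2017", "libpng-1.2.56"]
working_benchmarks = ["lcms-2017-03-21", "freetype2-2017"]

# Every fuzzer is paired with every benchmark, one build per pair.
failing_pairs = list(product(fuzzers, failing_benchmarks))
working_pairs = list(product(fuzzers, working_benchmarks))

print(len(failing_pairs), "pairs in the failing run")
print(len(working_pairs), "pairs in the working run")
```

The failing run has 9 pairs and the working run has 6, so removing one benchmark removes three builds.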

anfedotoff avatar Jul 27 '21 15:07 anfedotoff

Same here, any update?

wtdcode avatar Mar 10 '23 14:03 wtdcode

No update from me at least. Unfortunately we don't use this feature much because we run all of our experiments in the cloud, so it's unlikely to get a ton of attention.

jonathanmetzman avatar Mar 14 '23 16:03 jonathanmetzman

Is it possible that this is related to running multiple containers at the same time?

I was hitting this issue consistently (with 30 concurrent builders and almost all benchmarks). After I added a random sleep before launching each docker container, everything started to work.
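
A minimal sketch of that workaround, assuming each builder launches its container through a small wrapper. run_with_random_delay and its max_delay_seconds parameter are hypothetical names, not part of fuzzbench, and the 30-second default is an arbitrary choice. The sleep and run arguments are injectable only so the helper can be tried without docker.

```python
import random
import subprocess
import time


def run_with_random_delay(command, max_delay_seconds=30.0,
                          sleep=time.sleep, run=subprocess.run):
    """Sleep for a random interval, then run `command`.

    Hypothetical helper, not part of fuzzbench. Each concurrent builder
    calls it on its own, so the containers do not all start at once.
    """
    delay = random.uniform(0.0, max_delay_seconds)
    sleep(delay)
    # check=False leaves handling of a non-zero exit status to the caller.
    return run(command, check=False)
```

Each of the concurrent builders would call run_with_random_delay(["docker", "run", ...]) in place of a direct subprocess.run call, which spreads the container launches over the chosen interval.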

mvanotti avatar Jul 03 '23 17:07 mvanotti