make_wikipedia.py fails on Linux
Traceback (most recent call last):
  File "/home/peter/kode/dolma/dolma_env/lib/python3.11/site-packages/dolma/core/parallel.py", line 283, in _multiprocessing_run_all
    multiprocessing.set_start_method("spawn")
  File "/usr/lib/python3.11/multiprocessing/context.py", line 247, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/peter/kode/dolma/scripts/make_wikipedia.py", line 289, in <module>
    main()
  File "/home/peter/kode/dolma/scripts/make_wikipedia.py", line 285, in main
    processor(date=args.date, lang=args.lang)
  File "/home/peter/kode/dolma/dolma_env/lib/python3.11/site-packages/dolma/core/parallel.py", line 390, in __call__
    fn(
  File "/home/peter/kode/dolma/dolma_env/lib/python3.11/site-packages/dolma/core/parallel.py", line 285, in _multiprocessing_run_all
    assert multiprocessing.get_start_method() == "spawn", "Multiprocessing start method must be spawn"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Multiprocessing start method must be spawn
The bug can be worked around by calling multiprocessing.set_start_method("spawn") in the __main__ block of the script. Perhaps dolma's core/parallel.py should use multiprocessing.get_context("spawn") instead, to avoid this.
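A minimal sketch of the get_context("spawn") idea, in case it helps. This is not dolma's actual code: square, make_spawn_context and run_all are hypothetical names used only for illustration. The point is that a context object is bound to "spawn" by itself, so the process-wide start method never has to be set or asserted.

```python
import multiprocessing


def square(x: int) -> int:
    """Toy task; defined at module level so spawned workers can import it."""
    return x * x


def make_spawn_context():
    # get_context() returns a context object bound to "spawn". It does not
    # change the process-wide start method, so it cannot raise
    # "context has already been set".
    return multiprocessing.get_context("spawn")


def run_all(values, num_processes=2):
    # Hypothetical stand-in for a parallel runner: the pool is created from
    # the spawn context rather than from the top-level multiprocessing module.
    ctx = make_spawn_context()
    with ctx.Pool(processes=num_processes) as pool:
        return pool.map(square, values)
```

As with any use of spawn, run_all should be called from under an if __name__ == "__main__": guard in the calling script, since worker processes re-import the main module.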