cc_net
Failing to use mp execution
I am trying to use the MPExecutor but I am getting the following error:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/data1/alferre/cc_net/cc_net/execution.py", line 145, in global_fn
    return f(*args[1:])
  File "/data1/alferre/cc_net/cc_net/mine.py", line 347, in _mine_shard
    output=tmp_output if not conf.will_split else None,
  File "/data1/alferre/cc_net/cc_net/jsonql.py", line 435, in run_pipes
    initargs=(transform,),
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/multiprocessing/context.py", line 119, in Pool
    context=self.get_context())
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/multiprocessing/pool.py", line 176, in __init__
    self._repopulate_pool()
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/multiprocessing/pool.py", line 241, in _repopulate_pool
    w.start()
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/multiprocessing/process.py", line 110, in start
    'daemonic processes are not allowed to have children'
AssertionError: daemonic processes are not allowed to have children
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data1/alferre/cc_net/cc_net/__main__.py", line 24, in <module>
    main()
  File "/data1/alferre/cc_net/cc_net/__main__.py", line 20, in main
    func_argparse.parse_and_call(parser)
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/site-packages/func_argparse/__init__.py", line 72, in parse_and_call
    return command(**parsed_args)
  File "/data1/alferre/cc_net/cc_net/mine.py", line 509, in main
    regroup(conf)
  File "/data1/alferre/cc_net/cc_net/mine.py", line 364, in regroup
    mine(conf)
  File "/data1/alferre/cc_net/cc_net/mine.py", line 271, in mine
    ex(_mine_shard, repeat(conf), hashes_files, *_transpose(missing_outputs))
  File "/data1/alferre/cc_net/cc_net/execution.py", line 174, in __call__
    global_fn, zip(itertools.repeat(f_name), *args)
  File "/home/alferre/anaconda3/envs/mtdev/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
AssertionError: daemonic processes are not allowed to have children
I am running the following command:
python -m cc_net mine --config /home/alferre/data/cc_net/config/config_alex.json
And this is my config file:
{
    "output_dir": "/home/alferre/data/cc_net/data_alex",
    "dump": "2019-09",
    "num_shards": 1,
    "num_segments_per_shard": 1,
    "hash_in_mem": 2,
    "mine_num_processes": 4,
    "lang_whitelist": [
        "pt"
    ],
    "execution": "mp",
    "target_size": "32M",
    "cleanup_after_regroup": false
}
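For reference, here is a minimal sketch that I believe triggers the same assertion outside cc_net. It assumes the traceback means what it appears to: _mine_shard runs inside a multiprocessing.Pool worker (via global_fn in execution.py), and run_pipes then opens a second Pool from inside that worker. The function names below are hypothetical and are not taken from cc_net.

```python
import multiprocessing


def _square(x):
    return x * x


def _inner_pool(n):
    # This runs inside an outer Pool worker, which is a daemonic process.
    # Opening a second Pool here tries to start children from a daemon.
    with multiprocessing.Pool(2) as pool:
        return pool.map(_square, range(n))


def reproduce():
    """Return the message of the AssertionError from the nested pool, or None."""
    try:
        with multiprocessing.Pool(2) as outer:
            outer.map(_inner_pool, [3, 3])
    except AssertionError as err:
        return str(err)
    return None


if __name__ == "__main__":
    # Expected to show the "daemonic processes" message seen in the traceback.
    print(reproduce())
```

If that reading is right, the error would come from combining "execution": "mp" with a "mine_num_processes" value that makes run_pipes start its own pool, but I have not confirmed this in the cc_net code.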