FATE
Private set intersection fails on a 1-billion-record dataset
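Describe the bug
Used FATE-TEST to generate a 1-billion-record dataset for each of the two parties, with a 50% intersection. The intersection job fails with the error below.
Hardware configuration
Each party has three servers. Per-server hardware: CPU 40 cores, memory 256 GB, disk 1.8 TB SSD.
Per-party deployment: rollsite, nodemanager, and clustermanager are deployed on the same server; the other two servers each run a nodemanager.
To Reproduce
1. The error may occur with either the DH or the RSA algorithm.
2. Configuration: task_cores: 96, compute_partition: 96
The error log is as follows: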
==== detail start, at 20220726.203003.988 ====
Traceback (most recent call last):
  File "/home/fate/data/projects/fate/eggroll/python/eggroll/core/utils.py", line 187, in wrapper
    return func(*args, **kw)
  File "/home/fate/data/projects/fate/eggroll/python/eggroll/roll_pair/egg_pair.py", line 690, in run_task
    value=self.functor_serdes.serialize(f(task)))
  File "/home/fate/data/projects/fate/eggroll/python/eggroll/core/serdes/eggroll_serdes.py", line 58, in serialize
    return cloudpickle.dumps(_obj)
  File "/home/fate/data/projects/fate/common/python/venv/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 931, in dumps
    cp.dump(obj)
  File "/home/fate/data/projects/fate/common/python/venv/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 284, in dump
    return Pickler.dump(self, obj)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 606, in save_reduce
    save(args)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 631, in save_reduce
    self._batch_setitems(dictitems)
  File "/home/fate/data/projects/fate/common/miniconda3/lib/python3.6/pickle.py", line 841, in _batch_setitems
    tmp = list(islice(it, self._BATCHSIZE))
RuntimeError: dictionary changed size during iteration
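For readers unfamiliar with this error class: the last frames of the traceback show pickle iterating over a mapping's items (`_batch_setitems`) when the mapping's size changes. The sketch below is a generic, self-contained illustration of that mechanism using only the standard library. It is not a diagnosis of the FATE or eggroll code path. `MutatesOnPickle` and `try_dump` are hypothetical names invented for this example, and the mutation here is triggered deterministically from `__reduce__`, whereas in a real system it could come from another thread or a callback.

```python
import pickle
from collections import defaultdict


class MutatesOnPickle:
    """Hypothetical stand-in for any code that touches a mapping while it is serialized."""

    def __init__(self, target):
        self.target = target

    def __reduce__(self):
        # Side effect during pickling: grow the mapping that pickle is iterating over.
        self.target["added_during_dump"] = 1
        # Reduce to a plain string so the example does not need to pickle this class.
        return (str, ("placeholder",))


def try_dump(obj):
    """Return None if pickling succeeds, otherwise the RuntimeError message."""
    try:
        pickle.dumps(obj)
        return None
    except RuntimeError as exc:
        return str(exc)


# A dict subclass is pickled through save_reduce with an items iterator,
# the same route that appears at the bottom of the traceback.
live = defaultdict(int)
live[0] = MutatesOnPickle(live)
for i in range(1, 2000):
    live[i] = i

# Pickling the live mapping fails because it grows while being iterated.
error_on_live = try_dump(live)
print("live mapping:", error_on_live)

# Pickling a snapshot works: the side effect hits `live`, not the copy.
snapshot = dict(live)
error_on_snapshot = try_dump(snapshot)
print("snapshot:", error_on_snapshot)
```

If something similar is happening here, the object returned by the task would be getting modified while it is serialized. Taking a snapshot (under a lock, if threads are involved) before serializing is one common mitigation, though whether it applies to this job is an open question.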
Can you tell which part of the DH or RSA code path triggers this error?
So far I have checked the eggroll logs and found no other exception logs. @dylan-fan