
intersection error

Open JasonBian opened this issue 2 years ago • 2 comments

KubeFATE, FATE version: 1.7.0. Intersection protocol: raw. Guest: 250,000 IDs (25w); host: 200 million IDs. Raw params: [join_role: guest]

[ERROR][2022-07-07 12:32:37,687][eggpair-command-server_2,pid:189211,tid:139757097752320][command_router.py:93.dispatch] - Failed to dispatch to [v1/egg-pair/runTask], task_name: withStores, request: [<ErTask(id=202207070747524457910_intersection_0_0_guest_10000-py-job-20220707.122237.555236_withStores-task-6, name=withStores, inputs=[<ErPartition(id=6, store_locator=<ErStoreLocator(id=0, store_type=LMDB, namespace=202207070747524457910_intersection_0_0, name=__rsk#202207070747524457910_intersection_0_0#hash.0ac40efff5d0b1ef1568.id_ciphertext_list_exchange_h2g#fit#host#9999#guest#10000, path=, total_partitions=8, partitioner=BYTESTRING_HASH, serdes=PICKLE) at 0x7f1bb7158c50>, processor=<ErProcessor(id=2058, server_node_id=2, name=, processor_type=egg_pair, status=RUNNING, command_endpoint=<ErEndpoint(host=nodemanager-0, port=41202) at 0x7f1bb7158cc0>, transfer_endpoint=<ErEndpoint(host=nodemanager-0, port=38476) at 0x7f1bb7158cf8>, pid=189211, options=[{}], tag=) at 0x7f1bb7158d30>, rank_in_node=2) at 0x7f1bb7158d68>], outputs=[], job=<ErJob(id=202207070747524457910_intersection_0_0_guest_10000-py-job-20220707.122237.555236_withStores, name=withStores, inputs=[<ErStore(store_locator=<ErStoreLocator(id=0, store_type=LMDB, namespace=202207070747524457910_intersection_0_0, name=__rsk#202207070747524457910_intersection_0_0#hash.0ac40efff5d0b1ef1568.id_ciphertext_list_exchange_h2g#fit#host#9999#guest#10000, path=, total_partitions=8, partitioner=BYTESTRING_HASH, serdes=PICKLE) at 0x7f1bb7158e10>, partitions=[***, len=8], options=[{'python.path': '/opt/rh/rh-nodejs10/root/usr/lib/python2.7/site-packages:$PYTHONPATH:/data/projects/fate/fate/python:/data/projects/fate/eggroll/python:/data/projects/fate/fateflow/python:/data/projects/fate/fate/python/fate_client:/data/projects/fate/fate/python', 'python.venv': '/opt/app-root', 'total_partitions': '8', 'eggroll.session.processors.per.node': '4', 'serdes': 'PICKLE', 'eggroll.session.id': 
'202207070747524457910_intersection_0_0_guest_10000', 'create_if_missing': 'False', 'eggroll.session.deploy.mode': 'cluster'}]) at 0x7f1bb7158e48>], outputs=[], functors=[1], options={'__op': 'get_partition_status'}) at 0x7f1bb7158a20>) at 0x7f1b37f97dd8>] Traceback (most recent call last): File "/data/projects/fate/eggroll/python/eggroll/core/utils.py", line 187, in wrapper return func(*args, **kw) File "/data/projects/fate/eggroll/python/eggroll/roll_pair/egg_pair.py", line 658, in run_task value=self.functor_serdes.serialize(f(task))) File "/data/projects/fate/eggroll/python/eggroll/core/serdes/eggroll_serdes.py", line 58, in serialize return cloudpickle.dumps(_obj) File "/opt/app-root/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 931, in dumps cp.dump(obj) File "/opt/app-root/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 284, in dump return Pickler.dump(self, obj) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 409, in dump self.save(obj) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 521, in save self.save_reduce(obj=obj, *rv) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 606, in save_reduce save(args) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 476, in save f(self, obj) # Call unbound method with explicit self File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 751, in save_tuple save(element) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 521, in save self.save_reduce(obj=obj, *rv) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 631, in save_reduce self._batch_setitems(dictitems) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 841, in _batch_setitems tmp = list(islice(it, self._BATCHSIZE)) RuntimeError: dictionary changed size during iteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/data/projects/fate/eggroll/python/eggroll/core/command/command_router.py", line 91, in dispatch call_result = _method(_instance, *deserialized_args) File "/data/projects/fate/eggroll/python/eggroll/core/utils.py", line 194, in wrapper raise RuntimeError(msg) RuntimeError:

==== detail start, at 20220707.123237.648 ==== Traceback (most recent call last): File "./fate/eggroll/python/eggroll/core/utils.py", line 187, in wrapper return func(*args, **kw) File "./fate/eggroll/python/eggroll/roll_pair/egg_pair.py", line 658, in run_task value=self.functor_serdes.serialize(f(task))) File "./fate/eggroll/python/eggroll/core/serdes/eggroll_serdes.py", line 58, in serialize return cloudpickle.dumps(_obj) File "/opt/app-root/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 931, in dumps cp.dump(obj) File "/opt/app-root/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 284, in dump return Pickler.dump(self, obj) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 409, in dump self.save(obj) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 521, in save self.save_reduce(obj=obj, *rv) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 606, in save_reduce save(args) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 476, in save f(self, obj) # Call unbound method with explicit self File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 751, in save_tuple save(element) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 521, in save self.save_reduce(obj=obj, *rv) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 631, in save_reduce self._batch_setitems(dictitems) File "/opt/rh/rh-python36/root/usr/lib64/python3.6/pickle.py", line 841, in _batch_setitems tmp = list(islice(it, self._BATCHSIZE)) RuntimeError: dictionary changed size during iteration ==== detail end ====

JasonBian, Jul 13 '22 03:07

When I set join_role: host, it runs OK.

JasonBian, Jul 13 '22 03:07

When I use DH intersection, I get the same problem.

JasonBian, Jul 14 '22 07:07

Is there still a problem after upgrading to FATE 1.8 or higher? Did you solve the problem?

kanppa, Dec 14 '22 06:12
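
For readers unfamiliar with the final error in the traceback: Python raises `RuntimeError: dictionary changed size during iteration` whenever a dict gains or loses keys while something is iterating over it, and `pickle` iterates over a dict's items while serializing it. The thread does not establish what mutates the dict inside EggRoll, so the following is only a minimal sketch of the general failure mode and of one common mitigation (pickling a snapshot). The names `mutate_while_iterating`, `pickle_snapshot`, and `_lock` are hypothetical and are not part of FATE or EggRoll.

```python
import pickle
import threading


def mutate_while_iterating():
    """Reproduce the error message in a single thread (illustration only)."""
    shared = {i: i for i in range(4)}
    try:
        for key in shared:
            # Inserting a new key changes the dict size mid-iteration.
            shared[key + 100] = key
    except RuntimeError as exc:
        return str(exc)
    return None


# Hypothetical lock: it only helps if every writer of the dict holds it too.
_lock = threading.Lock()


def pickle_snapshot(shared):
    """Take a shallow copy under the lock, then pickle the copy."""
    with _lock:
        snapshot = dict(shared)
    # The copy is private to this call, so later writes to `shared`
    # cannot change its size while pickle walks it.
    return pickle.dumps(snapshot)


if __name__ == "__main__":
    print(mutate_while_iterating())
    print(pickle.loads(pickle_snapshot({"a": 1, "b": 2})))
```

The snapshot is a shallow copy, so it only protects the top-level dict; nested containers that other threads mutate would need the same treatment.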