
PSI: KKRT and RR22 protocols report OOM

Open gxcuit opened this issue 1 year ago • 10 comments

Hi, I deployed secretpad with Docker in P2P mode. The version information is as follows:

secretpadImage version: 0.7.1b0
secretflowServingImage version: 0.3.1b0
kusciaImage version: 0.8.0b0
secretflowImage version: 1.6.1b0

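I found that, for PSI, only the ECDH protocol works; KKRT and RR22 both report an OOM error. The dataset is small, around 1,000 rows. I have already raised the Docker memory limit to 8 GB.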

image

The logs are as follows.

The stdout output is as follows:

2024-06-11T18:30:03.624597928+08:00 stderr F WARNING:root:Since the GPL-licensed package `unidecode` is not installed, using Python's `unicodedata` package which yields worse results.
2024-06-11T18:30:05.619162898+08:00 stdout F 2024-06-11 10:30:05,618|alice|INFO|secretflow|entry.py:start_ray:59| ray_conf: RayConfig(ray_node_ip_address='mgey-jjhodnxr-node-35-0-global.alice.svc', ray_node_manager_port=21688, ray_object_manager_port=21689, ray_client_server_port=21690, ray_worker_ports=[], ray_gcs_port=21687)
2024-06-11T18:30:05.619255176+08:00 stdout F 2024-06-11 10:30:05,618|alice|INFO|secretflow|entry.py:start_ray:63| Trying to start ray head node at mgey-jjhodnxr-node-35-0-global.alice.svc, start command: RAY_BACKEND_LOG_LEVEL=debug RAY_grpc_enable_http_proxy=true OMP_NUM_THREADS=10 ray start --head --include-dashboard=false --disable-usage-stats --num-cpus=32 --node-ip-address=mgey-jjhodnxr-node-35-0-global.alice.svc --port=21687 --node-manager-port=21688 --object-manager-port=21689 --ray-client-server-port=21690
2024-06-11T18:30:11.093719452+08:00 stdout F 2024-06-11 10:30:11,093|alice|INFO|secretflow|entry.py:start_ray:80| 2024-06-11 10:30:07,455	INFO usage_lib.py:423 -- Usage stats collection is disabled.
2024-06-11T18:30:11.093772301+08:00 stdout F 2024-06-11 10:30:07,455	INFO scripts.py:744 -- Local node IP: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:11.093789497+08:00 stdout F 2024-06-11 10:30:10,781	SUCC scripts.py:781 -- --------------------
2024-06-11T18:30:11.093804098+08:00 stdout F 2024-06-11 10:30:10,781	SUCC scripts.py:782 -- Ray runtime started.
2024-06-11T18:30:11.093818148+08:00 stdout F 2024-06-11 10:30:10,781	SUCC scripts.py:783 -- --------------------
2024-06-11T18:30:11.093832778+08:00 stdout F 2024-06-11 10:30:10,781	INFO scripts.py:785 -- Next steps
2024-06-11T18:30:11.093847126+08:00 stdout F 2024-06-11 10:30:10,781	INFO scripts.py:788 -- To add another node to this Ray cluster, run
2024-06-11T18:30:11.093861489+08:00 stdout F 2024-06-11 10:30:10,781	INFO scripts.py:791 --   ray start --address='mgey-jjhodnxr-node-35-0-global.alice.svc:21687'
2024-06-11T18:30:11.093875424+08:00 stdout F 2024-06-11 10:30:10,781	INFO scripts.py:800 -- To connect to this Ray cluster:
2024-06-11T18:30:11.093897427+08:00 stdout F 2024-06-11 10:30:10,782	INFO scripts.py:802 -- import ray
2024-06-11T18:30:11.093912298+08:00 stdout F 2024-06-11 10:30:10,782	INFO scripts.py:803 -- ray.init(_node_ip_address='mgey-jjhodnxr-node-35-0-global.alice.svc')
2024-06-11T18:30:11.093926778+08:00 stdout F 2024-06-11 10:30:10,782	INFO scripts.py:834 -- To terminate the Ray runtime, run
2024-06-11T18:30:11.093940778+08:00 stdout F 2024-06-11 10:30:10,782	INFO scripts.py:835 --   ray stop
2024-06-11T18:30:11.093954762+08:00 stdout F 2024-06-11 10:30:10,782	INFO scripts.py:838 -- To view the status of the cluster, use
2024-06-11T18:30:11.093968842+08:00 stdout F 2024-06-11 10:30:10,782	INFO scripts.py:839 --   ray status
2024-06-11T18:30:11.093982077+08:00 stdout F 
2024-06-11T18:30:11.094037591+08:00 stdout F 2024-06-11 10:30:11,093|alice|INFO|secretflow|entry.py:start_ray:81| Succeeded to start ray head node at mgey-jjhodnxr-node-35-0-global.alice.svc.
2024-06-11T18:30:11.095507431+08:00 stdout F 2024-06-11 10:30:11,095|alice|INFO|secretflow|entry.py:main:510| datasource.access_directly True
2024-06-11T18:30:11.095543235+08:00 stdout F sf_node_eval_param  {
2024-06-11T18:30:11.09555942+08:00 stdout F   "domain": "data_prep",
2024-06-11T18:30:11.09557397+08:00 stdout F   "name": "psi",
2024-06-11T18:30:11.095587977+08:00 stdout F   "version": "0.0.5",
2024-06-11T18:30:11.095601854+08:00 stdout F   "attrPaths": [
2024-06-11T18:30:11.095616687+08:00 stdout F     "input/receiver_input/key",
2024-06-11T18:30:11.095631122+08:00 stdout F     "input/sender_input/key",
2024-06-11T18:30:11.095645052+08:00 stdout F     "protocol",
2024-06-11T18:30:11.095658946+08:00 stdout F     "sort_result",
2024-06-11T18:30:11.095672656+08:00 stdout F     "allow_duplicate_keys",
2024-06-11T18:30:11.095687033+08:00 stdout F     "allow_duplicate_keys/no/skip_duplicates_check",
2024-06-11T18:30:11.095740815+08:00 stdout F     "fill_value_int",
2024-06-11T18:30:11.095756731+08:00 stdout F     "ecdh_curve"
2024-06-11T18:30:11.095770846+08:00 stdout F   ],
2024-06-11T18:30:11.095785259+08:00 stdout F   "attrs": [
2024-06-11T18:30:11.095799456+08:00 stdout F     {
2024-06-11T18:30:11.095813277+08:00 stdout F       "ss": [
2024-06-11T18:30:11.095827333+08:00 stdout F         "id"
2024-06-11T18:30:11.095841514+08:00 stdout F       ]
2024-06-11T18:30:11.095855494+08:00 stdout F     },
2024-06-11T18:30:11.095869291+08:00 stdout F     {
2024-06-11T18:30:11.095883025+08:00 stdout F       "ss": [
2024-06-11T18:30:11.095896825+08:00 stdout F         "id2"
2024-06-11T18:30:11.095910689+08:00 stdout F       ]
2024-06-11T18:30:11.095924956+08:00 stdout F     },
2024-06-11T18:30:11.095938756+08:00 stdout F     {
2024-06-11T18:30:11.095952634+08:00 stdout F       "s": "PROTOCOL_KKRT"
2024-06-11T18:30:11.095966497+08:00 stdout F     },
2024-06-11T18:30:11.095980224+08:00 stdout F     {
2024-06-11T18:30:11.095993992+08:00 stdout F       "b": true
2024-06-11T18:30:11.096007948+08:00 stdout F     },
2024-06-11T18:30:11.096021709+08:00 stdout F     {
2024-06-11T18:30:11.096035559+08:00 stdout F       "s": "no"
2024-06-11T18:30:11.096049353+08:00 stdout F     },
2024-06-11T18:30:11.096063047+08:00 stdout F     {
2024-06-11T18:30:11.096132769+08:00 stdout F       "b": true
2024-06-11T18:30:11.096153762+08:00 stdout F     },
2024-06-11T18:30:11.096168142+08:00 stdout F     {
2024-06-11T18:30:11.096182567+08:00 stdout F       "isNa": true
2024-06-11T18:30:11.096196557+08:00 stdout F     },
2024-06-11T18:30:11.096210434+08:00 stdout F     {
2024-06-11T18:30:11.096224394+08:00 stdout F       "s": "CURVE_SM2"
2024-06-11T18:30:11.096238321+08:00 stdout F     }
2024-06-11T18:30:11.096252315+08:00 stdout F   ],
2024-06-11T18:30:11.096266295+08:00 stdout F   "inputs": [
2024-06-11T18:30:11.096280286+08:00 stdout F     {
2024-06-11T18:30:11.096295053+08:00 stdout F       "type": "sf.table.individual",
2024-06-11T18:30:11.09630927+08:00 stdout F       "meta": {
2024-06-11T18:30:11.096344121+08:00 stdout F         "@type": "type.googleapis.com/secretflow.spec.v1.IndividualTable",
2024-06-11T18:30:11.096360585+08:00 stdout F         "lineCount": "-1"
2024-06-11T18:30:11.096374712+08:00 stdout F       },
2024-06-11T18:30:11.096388825+08:00 stdout F       "dataRefs": [
2024-06-11T18:30:11.096402703+08:00 stdout F         {
2024-06-11T18:30:11.096416819+08:00 stdout F           "uri": "breast_new2_590923962.csv",
2024-06-11T18:30:11.09643096+08:00 stdout F           "party": "alice",
2024-06-11T18:30:11.096445017+08:00 stdout F           "format": "csv"
2024-06-11T18:30:11.096464618+08:00 stdout F         }
2024-06-11T18:30:11.096478991+08:00 stdout F       ]
2024-06-11T18:30:11.096492855+08:00 stdout F     },
2024-06-11T18:30:11.096506506+08:00 stdout F     {
2024-06-11T18:30:11.096520246+08:00 stdout F       "type": "sf.table.individual",
2024-06-11T18:30:11.096533942+08:00 stdout F       "meta": {
2024-06-11T18:30:11.096547953+08:00 stdout F         "@type": "type.googleapis.com/secretflow.spec.v1.IndividualTable",
2024-06-11T18:30:11.09656181+08:00 stdout F         "lineCount": "-1"
2024-06-11T18:30:11.096575401+08:00 stdout F       },
2024-06-11T18:30:11.096589181+08:00 stdout F       "dataRefs": [
2024-06-11T18:30:11.096602788+08:00 stdout F         {
2024-06-11T18:30:11.096616505+08:00 stdout F           "uri": "breast_new1_1450367590.csv",
2024-06-11T18:30:11.096630195+08:00 stdout F           "party": "bob",
2024-06-11T18:30:11.096643789+08:00 stdout F           "format": "csv"
2024-06-11T18:30:11.096657463+08:00 stdout F         }
2024-06-11T18:30:11.09667103+08:00 stdout F       ]
2024-06-11T18:30:11.096684704+08:00 stdout F     }
2024-06-11T18:30:11.096698327+08:00 stdout F   ],
2024-06-11T18:30:11.096712034+08:00 stdout F   "checkpointUri": "ckmgey-jjhodnxr-node-35-output-0"
2024-06-11T18:30:11.096725882+08:00 stdout F } 
2024-06-11T18:30:11.110466395+08:00 stdout F 2024-06-11 10:30:11,109|alice|WARNING|secretflow|meta_conversion.py:convert_domain_data_to_individual_table:29| kuscia adapter has to deduce dist data from domain data at this moment.
2024-06-11T18:30:11.110794077+08:00 stdout F 2024-06-11 10:30:11,110|alice|INFO|secretflow|entry.py:domaindata_id_to_dist_data:160| domaindata_id woerffhn to 
2024-06-11T18:30:11.110823888+08:00 stdout F ...........
2024-06-11T18:30:11.110840301+08:00 stdout F name: "breast_new2"
2024-06-11T18:30:11.110855542+08:00 stdout F type: "sf.table.individual"
2024-06-11T18:30:11.110869856+08:00 stdout F meta {
2024-06-11T18:30:11.110884999+08:00 stdout F   type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable"
2024-06-11T18:30:11.110900047+08:00 stdout F   value: "\n\221\003\022\002id\022\021compactness-error\022\017concavity-error\022\024concave-points-error\022\016symmetry-error\022\027fractal-dimension-error\022\014worst-radius\022\rworst-texture\022\017worst-perimeter\022\nworst-area\022\020worst-smoothness\022\021worst-compactness\022\017worst-concavity\022\024worst-concave-points\022\016worst-symmetry\022\027worst-fractal-dimension\022\006target*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\003int\020\377\377\377\377\377\377\377\377\377\001"
2024-06-11T18:30:11.110915093+08:00 stdout F }
2024-06-11T18:30:11.110929138+08:00 stdout F data_refs {
2024-06-11T18:30:11.110943558+08:00 stdout F   uri: "breast_new2_590923962.csv"
2024-06-11T18:30:11.111130776+08:00 stdout F   party: "alice"
2024-06-11T18:30:11.11114993+08:00 stdout F   format: "csv"
2024-06-11T18:30:11.11116481+08:00 stdout F }
2024-06-11T18:30:11.1111785+08:00 stdout F 
2024-06-11T18:30:11.111192368+08:00 stdout F ....
2024-06-11T18:30:11.12376327+08:00 stdout F 2024-06-11 10:30:11,121|alice|WARNING|secretflow|meta_conversion.py:convert_domain_data_to_individual_table:29| kuscia adapter has to deduce dist data from domain data at this moment.
2024-06-11T18:30:11.123801421+08:00 stdout F 2024-06-11 10:30:11,122|alice|INFO|secretflow|entry.py:domaindata_id_to_dist_data:160| domaindata_id vwgwvxul to 
2024-06-11T18:30:11.123826399+08:00 stdout F ...........
2024-06-11T18:30:11.12384198+08:00 stdout F name: "breast_new1"
2024-06-11T18:30:11.12385695+08:00 stdout F type: "sf.table.individual"
2024-06-11T18:30:11.123871176+08:00 stdout F meta {
2024-06-11T18:30:11.123886257+08:00 stdout F   type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable"
2024-06-11T18:30:11.123995824+08:00 stdout F   value: "\n\344\002\022\003id2\022\013mean-radius\022\014mean-texture\022\016mean-perimeter\022\tmean-area\022\017mean-smoothness\022\020mean-compactness\022\016mean-concavity\022\023mean-concave-points\022\rmean-symmetry\022\026mean-fractal-dimension\022\014radius-error\022\rtexture-error\022\017perimeter-error\022\narea-error\022\020smoothness-error*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float\020\377\377\377\377\377\377\377\377\377\001"
2024-06-11T18:30:11.124024048+08:00 stdout F }
2024-06-11T18:30:11.124038755+08:00 stdout F data_refs {
2024-06-11T18:30:11.124053461+08:00 stdout F   uri: "breast_new1_1450367590.csv"
2024-06-11T18:30:11.124108403+08:00 stdout F   party: "bob"
2024-06-11T18:30:11.124130447+08:00 stdout F   format: "csv"
2024-06-11T18:30:11.124144911+08:00 stdout F }
2024-06-11T18:30:11.124158441+08:00 stdout F 
2024-06-11T18:30:11.124172732+08:00 stdout F ....
2024-06-11T18:30:11.124187219+08:00 stdout F 2024-06-11 10:30:11,122|alice|WARNING|secretflow|entry.py:comp_eval:159| 
2024-06-11T18:30:11.124201465+08:00 stdout F --
2024-06-11T18:30:11.124215623+08:00 stdout F Secretflow 1.6.1b0
2024-06-11T18:30:11.124230503+08:00 stdout F Build time (May 27 2024, 04:48:06) with commit id: eac355f390d9d0d7276ee4cea8d1fe38417cabb6
2024-06-11T18:30:11.124256124+08:00 stdout F --
2024-06-11T18:30:11.124271447+08:00 stdout F 
2024-06-11T18:30:11.124285828+08:00 stdout F 2024-06-11 10:30:11,123|alice|WARNING|secretflow|entry.py:comp_eval:160| 
2024-06-11T18:30:11.124330893+08:00 stdout F --
2024-06-11T18:30:11.124346029+08:00 stdout F *param* 
2024-06-11T18:30:11.124359197+08:00 stdout F 
2024-06-11T18:30:11.124373157+08:00 stdout F domain: "data_prep"
2024-06-11T18:30:11.124386914+08:00 stdout F name: "psi"
2024-06-11T18:30:11.124400485+08:00 stdout F version: "0.0.5"
2024-06-11T18:30:11.124414201+08:00 stdout F attr_paths: "input/receiver_input/key"
2024-06-11T18:30:11.124427739+08:00 stdout F attr_paths: "input/sender_input/key"
2024-06-11T18:30:11.124441396+08:00 stdout F attr_paths: "protocol"
2024-06-11T18:30:11.124455596+08:00 stdout F attr_paths: "sort_result"
2024-06-11T18:30:11.124469283+08:00 stdout F attr_paths: "allow_duplicate_keys"
2024-06-11T18:30:11.124485853+08:00 stdout F attr_paths: "allow_duplicate_keys/no/skip_duplicates_check"
2024-06-11T18:30:11.124499537+08:00 stdout F attr_paths: "fill_value_int"
2024-06-11T18:30:11.124513034+08:00 stdout F attr_paths: "ecdh_curve"
2024-06-11T18:30:11.124526628+08:00 stdout F attrs {
2024-06-11T18:30:11.124540282+08:00 stdout F   ss: "id"
2024-06-11T18:30:11.124553999+08:00 stdout F }
2024-06-11T18:30:11.124567832+08:00 stdout F attrs {
2024-06-11T18:30:11.124581626+08:00 stdout F   ss: "id2"
2024-06-11T18:30:11.124595173+08:00 stdout F }
2024-06-11T18:30:11.124608843+08:00 stdout F attrs {
2024-06-11T18:30:11.124622481+08:00 stdout F   s: "PROTOCOL_KKRT"
2024-06-11T18:30:11.124636451+08:00 stdout F }
2024-06-11T18:30:11.124650115+08:00 stdout F attrs {
2024-06-11T18:30:11.124663788+08:00 stdout F   b: true
2024-06-11T18:30:11.124677328+08:00 stdout F }
2024-06-11T18:30:11.124690889+08:00 stdout F attrs {
2024-06-11T18:30:11.124704846+08:00 stdout F   s: "no"
2024-06-11T18:30:11.124718446+08:00 stdout F }
2024-06-11T18:30:11.124732074+08:00 stdout F attrs {
2024-06-11T18:30:11.12474563+08:00 stdout F   b: true
2024-06-11T18:30:11.124759185+08:00 stdout F }
2024-06-11T18:30:11.124772851+08:00 stdout F attrs {
2024-06-11T18:30:11.124786391+08:00 stdout F   is_na: true
2024-06-11T18:30:11.124799999+08:00 stdout F }
2024-06-11T18:30:11.124813589+08:00 stdout F attrs {
2024-06-11T18:30:11.124827219+08:00 stdout F   s: "CURVE_SM2"
2024-06-11T18:30:11.124840967+08:00 stdout F }
2024-06-11T18:30:11.124854613+08:00 stdout F inputs {
2024-06-11T18:30:11.12486821+08:00 stdout F   name: "breast_new2"
2024-06-11T18:30:11.124881964+08:00 stdout F   type: "sf.table.individual"
2024-06-11T18:30:11.124895571+08:00 stdout F   meta {
2024-06-11T18:30:11.124909342+08:00 stdout F     type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable"
2024-06-11T18:30:11.124924079+08:00 stdout F     value: "\n\221\003\022\002id\022\021compactness-error\022\017concavity-error\022\024concave-points-error\022\016symmetry-error\022\027fractal-dimension-error\022\014worst-radius\022\rworst-texture\022\017worst-perimeter\022\nworst-area\022\020worst-smoothness\022\021worst-compactness\022\017worst-concavity\022\024worst-concave-points\022\016worst-symmetry\022\027worst-fractal-dimension\022\006target*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\003int\020\377\377\377\377\377\377\377\377\377\001"
2024-06-11T18:30:11.12494364+08:00 stdout F   }
2024-06-11T18:30:11.124958193+08:00 stdout F   data_refs {
2024-06-11T18:30:11.12497206+08:00 stdout F     uri: "breast_new2_590923962.csv"
2024-06-11T18:30:11.124985714+08:00 stdout F     party: "alice"
2024-06-11T18:30:11.124999384+08:00 stdout F     format: "csv"
2024-06-11T18:30:11.125012921+08:00 stdout F   }
2024-06-11T18:30:11.125026478+08:00 stdout F }
2024-06-11T18:30:11.125039942+08:00 stdout F inputs {
2024-06-11T18:30:11.125053586+08:00 stdout F   name: "breast_new1"
2024-06-11T18:30:11.125067219+08:00 stdout F   type: "sf.table.individual"
2024-06-11T18:30:11.125080656+08:00 stdout F   meta {
2024-06-11T18:30:11.125094307+08:00 stdout F     type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable"
2024-06-11T18:30:11.12778897+08:00 stdout F     value: "\n\344\002\022\003id2\022\013mean-radius\022\014mean-texture\022\016mean-perimeter\022\tmean-area\022\017mean-smoothness\022\020mean-compactness\022\016mean-concavity\022\023mean-concave-points\022\rmean-symmetry\022\026mean-fractal-dimension\022\014radius-error\022\rtexture-error\022\017perimeter-error\022\narea-error\022\020smoothness-error*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float\020\377\377\377\377\377\377\377\377\377\001"
2024-06-11T18:30:11.127823124+08:00 stdout F   }
2024-06-11T18:30:11.127845221+08:00 stdout F   data_refs {
2024-06-11T18:30:11.127865925+08:00 stdout F     uri: "breast_new1_1450367590.csv"
2024-06-11T18:30:11.127886046+08:00 stdout F     party: "bob"
2024-06-11T18:30:11.12790702+08:00 stdout F     format: "csv"
2024-06-11T18:30:11.127929011+08:00 stdout F   }
2024-06-11T18:30:11.127944127+08:00 stdout F }
2024-06-11T18:30:11.127958404+08:00 stdout F output_uris: "mgey-jjhodnxr-node-35-output-0"
2024-06-11T18:30:11.127979035+08:00 stdout F checkpoint_uri: "ckmgey-jjhodnxr-node-35-output-0"
2024-06-11T18:30:11.127992948+08:00 stdout F 
2024-06-11T18:30:11.128006906+08:00 stdout F --
2024-06-11T18:30:11.128020099+08:00 stdout F 
2024-06-11T18:30:11.128146517+08:00 stdout F 2024-06-11 10:30:11,123|alice|WARNING|secretflow|entry.py:comp_eval:161| 
2024-06-11T18:30:11.128169653+08:00 stdout F --
2024-06-11T18:30:11.128184444+08:00 stdout F *storage_config* 
2024-06-11T18:30:11.128197901+08:00 stdout F 
2024-06-11T18:30:11.128211708+08:00 stdout F type: "local_fs"
2024-06-11T18:30:11.128225642+08:00 stdout F local_fs {
2024-06-11T18:30:11.128240272+08:00 stdout F   wd: "/home/kuscia/var/storage/data"
2024-06-11T18:30:11.12825415+08:00 stdout F }
2024-06-11T18:30:11.1282672+08:00 stdout F 
2024-06-11T18:30:11.12828069+08:00 stdout F --
2024-06-11T18:30:11.128293804+08:00 stdout F 
2024-06-11T18:30:11.128321087+08:00 stdout F 2024-06-11 10:30:11,123|alice|WARNING|secretflow|entry.py:comp_eval:162| 
2024-06-11T18:30:11.128335955+08:00 stdout F --
2024-06-11T18:30:11.128349965+08:00 stdout F *cluster_config* 
2024-06-11T18:30:11.128363225+08:00 stdout F 
2024-06-11T18:30:11.128376816+08:00 stdout F desc {
2024-06-11T18:30:11.128390503+08:00 stdout F   parties: "bob"
2024-06-11T18:30:11.12840444+08:00 stdout F   parties: "alice"
2024-06-11T18:30:11.128418364+08:00 stdout F   devices {
2024-06-11T18:30:11.128424824+08:00 stderr F 2024-06-11 10:30:11,127	INFO worker.py:1540 -- Connecting to existing Ray cluster at address: mgey-jjhodnxr-node-35-0-global.alice.svc:21687...
2024-06-11T18:30:11.128433564+08:00 stdout F     name: "spu"
2024-06-11T18:30:11.128477552+08:00 stdout F     type: "spu"
2024-06-11T18:30:11.128493289+08:00 stdout F     parties: "bob"
2024-06-11T18:30:11.128507632+08:00 stdout F     parties: "alice"
2024-06-11T18:30:11.12853234+08:00 stdout F     config: "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"
2024-06-11T18:30:11.128548331+08:00 stdout F   }
2024-06-11T18:30:11.128562661+08:00 stdout F   devices {
2024-06-11T18:30:11.128576771+08:00 stdout F     name: "heu"
2024-06-11T18:30:11.128590662+08:00 stdout F     type: "heu"
2024-06-11T18:30:11.128604452+08:00 stdout F     parties: "bob"
2024-06-11T18:30:11.128618202+08:00 stdout F     parties: "alice"
2024-06-11T18:30:11.128633+08:00 stdout F     config: "{\"mode\": \"PHEU\", \"schema\": \"paillier\", \"key_size\": 2048}"
2024-06-11T18:30:11.128647093+08:00 stdout F   }
2024-06-11T18:30:11.128661557+08:00 stdout F   ray_fed_config {
2024-06-11T18:30:11.128675927+08:00 stdout F     cross_silo_comm_backend: "brpc_link"
2024-06-11T18:30:11.128689734+08:00 stdout F   }
2024-06-11T18:30:11.128703945+08:00 stdout F }
2024-06-11T18:30:11.128717882+08:00 stdout F public_config {
2024-06-11T18:30:11.128760043+08:00 stdout F   ray_fed_config {
2024-06-11T18:30:11.128775164+08:00 stdout F     parties: "bob"
2024-06-11T18:30:11.12878921+08:00 stdout F     parties: "alice"
2024-06-11T18:30:11.128803954+08:00 stdout F     addresses: "mgey-jjhodnxr-node-35-0-fed.bob.svc:80"
2024-06-11T18:30:11.128818788+08:00 stdout F     addresses: "0.0.0.0:21686"
2024-06-11T18:30:11.128832585+08:00 stdout F   }
2024-06-11T18:30:11.128846356+08:00 stdout F   spu_configs {
2024-06-11T18:30:11.128860046+08:00 stdout F     name: "spu"
2024-06-11T18:30:11.128873699+08:00 stdout F     parties: "bob"
2024-06-11T18:30:11.128887167+08:00 stdout F     parties: "alice"
2024-06-11T18:30:11.12890074+08:00 stdout F     addresses: "http://mgey-jjhodnxr-node-35-0-spu.bob.svc:80"
2024-06-11T18:30:11.12891451+08:00 stdout F     addresses: "0.0.0.0:21691"
2024-06-11T18:30:11.128928304+08:00 stdout F   }
2024-06-11T18:30:11.128941948+08:00 stdout F }
2024-06-11T18:30:11.128955885+08:00 stdout F private_config {
2024-06-11T18:30:11.128969702+08:00 stdout F   self_party: "alice"
2024-06-11T18:30:11.128983429+08:00 stdout F   ray_head_addr: "mgey-jjhodnxr-node-35-0-global.alice.svc:21687"
2024-06-11T18:30:11.128997166+08:00 stdout F }
2024-06-11T18:30:11.129010216+08:00 stdout F 
2024-06-11T18:30:11.12902407+08:00 stdout F --
2024-06-11T18:30:11.129037027+08:00 stdout F 
2024-06-11T18:30:11.129054657+08:00 stdout F 2024-06-11 10:30:11,126|alice|WARNING|secretflow|driver.py:init:442| When connecting to an existing cluster, num_cpus must not be provided. Num_cpus is neglected at this moment.
2024-06-11T18:30:11.145467061+08:00 stdout F 2024-06-11 10:30:11,144|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock
2024-06-11T18:30:11.146947922+08:00 stdout F 2024-06-11 10:30:11,145|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269792 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock
2024-06-11T18:30:11.146975089+08:00 stdout F 2024-06-11 10:30:11,145|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock
2024-06-11T18:30:11.14699062+08:00 stdout F 2024-06-11 10:30:11,146|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269792 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock
2024-06-11T18:30:11.152731876+08:00 stdout F 2024-06-11 10:30:11,152|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269840 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.153204129+08:00 stdout F 2024-06-11 10:30:11,152|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269840 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.153602909+08:00 stdout F 2024-06-11 10:30:11,153|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269840 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.153815218+08:00 stdout F 2024-06-11 10:30:11,153|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269840 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.154064465+08:00 stdout F 2024-06-11 10:30:11,153|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269744 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.154569722+08:00 stdout F 2024-06-11 10:30:11,154|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269744 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.154926231+08:00 stdout F 2024-06-11 10:30:11,154|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269744 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.155114487+08:00 stdout F 2024-06-11 10:30:11,154|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269744 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.155461816+08:00 stdout F 2024-06-11 10:30:11,155|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269936 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.155847743+08:00 stdout F 2024-06-11 10:30:11,155|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269936 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.156335524+08:00 stdout F 2024-06-11 10:30:11,155|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269936 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.156393651+08:00 stdout F 2024-06-11 10:30:11,156|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269936 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.156742928+08:00 stdout F 2024-06-11 10:30:11,156|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.158334868+08:00 stderr F 2024-06-11 10:30:11,157	INFO worker.py:1724 -- Connected to Ray cluster.
2024-06-11T18:30:11.158397393+08:00 stdout F 2024-06-11 10:30:11,156|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269792 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.158415243+08:00 stdout F 2024-06-11 10:30:11,157|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:11.158429444+08:00 stdout F 2024-06-11 10:30:11,157|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269792 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock
2024-06-11T18:30:13.009225838+08:00 stderr F 2024-06-11 10:30:13.008 INFO api.py:233 [alice] -- [Anonymous_job] Started rayfed with {'CLUSTER_ADDRESSES': {'bob': 'http://mgey-jjhodnxr-node-35-0-fed.bob.svc:80', 'alice': '0.0.0.0:21686'}, 'CURRENT_PARTY_NAME': 'alice', 'TLS_CONFIG': {}}
2024-06-11T18:30:13.855544191+08:00 stderr F [33m(raylet)[0m [2024-06-11 10:30:13,810 I 660 660] logging.cc:230: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1
2024-06-11T18:30:15.193184363+08:00 stderr F [36m(SenderReceiverProxyActor pid=660)[0m 2024-06-11 10:30:15.166 INFO link.py:38 [alice] -- [Anonymous_job] brpc options: {'proxy_max_restarts': 3, 'timeout_in_ms': 300000, 'recv_timeout_ms': 604800000, 'connect_retry_times': 3600, 'connect_retry_interval_ms': 1000, 'brpc_channel_protocol': 'http', 'brpc_channel_connection_type': 'pooled', 'exit_on_sending_failure': True}
2024-06-11T18:30:15.193240596+08:00 stderr F [36m(SenderReceiverProxyActor pid=660)[0m I0611 10:30:15.189406   660 external/com_github_brpc_brpc/src/brpc/server.cpp:1181] Server[yacl::link::transport::internal::ReceiverServiceImpl] is serving on port=21686.
2024-06-11T18:30:15.193254738+08:00 stderr F [36m(SenderReceiverProxyActor pid=660)[0m W0611 10:30:15.189486   660 external/com_github_brpc_brpc/src/brpc/server.cpp:1187] Builtin services are disabled according to ServerOptions.has_builtin_services
2024-06-11T18:30:15.617776759+08:00 stderr F [36m(SenderReceiverProxyActor pid=660)[0m I0611 10:30:15.543109   726 external/com_github_brpc_brpc/src/brpc/span.cpp:506] Opened ./rpc_data/rpcz/20240611.103015.660/id.db and ./rpc_data/rpcz/20240611.103015.660/time.db
2024-06-11T18:30:16.494040748+08:00 stderr F 2024-06-11 10:30:16.493 INFO barriers.py:465 [alice] -- [Anonymous_job] Succeeded to create receiver proxy actor.
2024-06-11T18:30:16.49435982+08:00 stderr F 2024-06-11 10:30:16.493 INFO barriers.py:520 [alice] -- [Anonymous_job] Try ping ['bob'] at 0 attemp, up to 3600 attemps.
2024-06-11T18:30:16.509139589+08:00 stderr F 2024-06-11 10:30:16.508 WARNING psi.py:358 [alice] -- [Anonymous_job] {'cluster_def': {'nodes': [{'party': 'bob', 'address': 'http://mgey-jjhodnxr-node-35-0-spu.bob.svc:80', 'listen_address': ''}, {'party': 'alice', 'address': '0.0.0.0:21691', 'listen_address': ''}], 'runtime_config': {'protocol': 2, 'field': 3}}, 'link_desc': {'connect_retry_times': 60, 'connect_retry_interval_ms': 1000, 'brpc_channel_protocol': 'http', 'brpc_channel_connection_type': 'pooled', 'recv_timeout_ms': 1200000, 'http_timeout_ms': 1200000}}
2024-06-11T18:30:21.438985122+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [354.628]       perfetto.cc:45899 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1024 KB, total sessions:1, uid:0 session name: ""
2024-06-11T18:30:21.439034746+08:00 stderr F [33m(raylet)[0m [2024-06-11 10:30:17,901 I 728 728] logging.cc:230: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1
2024-06-11T18:30:21.553161584+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.553204936+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m PC: @     0x7f2373fdfa32  (unknown)  yacl::AvxTranspose128()
2024-06-11T18:30:21.553220273+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f25dc012ce0  (unknown)  (unknown)
2024-06-11T18:30:21.553266017+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f2373f399ad       9536  yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.666765888+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f2373f34029        464  psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.666827943+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f2373d60ab0       1536  psi::RunPsi()
2024-06-11T18:30:21.666851651+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f2373d56205        384  psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.666873831+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f2373d56483        176  pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.666925956+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @     0x7f2373d3843d        736  pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66694947+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @           0x4fc697  (unknown)  cfunction_call
2024-06-11T18:30:21.666971214+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m     @ ... and at least 1 more frames
2024-06-11T18:30:21.666991882+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.667012095+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: PC: @     0x7f2373fdfa32  (unknown)  yacl::AvxTranspose128()
2024-06-11T18:30:21.667052926+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,644 E 728 728] logging.cc:361:     @     0x7f25dc012ce0  (unknown)  (unknown)
2024-06-11T18:30:21.667076384+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373f399ad       9536  yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.66709696+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373f34029        464  psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.667151046+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d60ab0       1536  psi::RunPsi()
2024-06-11T18:30:21.667214121+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d56205        384  psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.667341457+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d56483        176  pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.667410986+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d3843d        736  pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66743628+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @           0x4fc697  (unknown)  cfunction_call
2024-06-11T18:30:21.667457034+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @ ... and at least 1 more frames
2024-06-11T18:30:21.667478061+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m Fatal Python error: Illegal instruction
2024-06-11T18:30:21.667499926+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m 
2024-06-11T18:30:21.667520369+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m Stack (most recent call first):
2024-06-11T18:30:21.667573281+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m   File "/usr/local/lib/python3.10/site-packages/spu/psi.py", line 118 in psi
2024-06-11T18:30:21.667596458+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m   File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/spu.py", line 1379 in psi
2024-06-11T18:30:21.667616909+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m   File "/usr/local/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 467 in _resume_span
2024-06-11T18:30:21.667636415+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/function_manager.py", line 726 in actor_method_executor
2024-06-11T18:30:21.667676736+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 847 in main_loop
2024-06-11T18:30:21.667698507+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/workers/default_worker.py", line 282 in <module>
2024-06-11T18:30:21.667746345+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m 
2024-06-11T18:30:21.668545347+08:00 stderr F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m Extension modules: msgpack._cmsgpack, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, setproctitle, yaml._yaml, charset_normalizer.md, requests.packages.charset_normalizer.md, requests.packages.chardet.md, ray._raylet, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, jaxlib.cpu_feature_guard, grpc._cython.cygrpc, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pyarrow.lib, pyarrow._hdfsio, pandas._libs.ops, pyarrow._compute, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.tslibs.strptime, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, 
scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.__check_build._check_build, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, 
scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, pyarrow._json (total: 182)
2024-06-11T18:30:22.373640364+08:00 stderr F 2024-06-11 10:30:22.373 ERROR component.py:1129 [alice] -- [Anonymous_job] eval on domain: "data_prep"
2024-06-11T18:30:22.373696425+08:00 stderr F name: "psi"
2024-06-11T18:30:22.373709411+08:00 stderr F version: "0.0.5"
2024-06-11T18:30:22.373722246+08:00 stderr F attr_paths: "input/receiver_input/key"
2024-06-11T18:30:22.373733551+08:00 stderr F attr_paths: "input/sender_input/key"
2024-06-11T18:30:22.373745182+08:00 stderr F attr_paths: "protocol"
2024-06-11T18:30:22.373757157+08:00 stderr F attr_paths: "sort_result"
2024-06-11T18:30:22.37376828+08:00 stderr F attr_paths: "allow_duplicate_keys"
2024-06-11T18:30:22.373780078+08:00 stderr F attr_paths: "allow_duplicate_keys/no/skip_duplicates_check"
2024-06-11T18:30:22.373791173+08:00 stderr F attr_paths: "fill_value_int"
2024-06-11T18:30:22.373802246+08:00 stderr F attr_paths: "ecdh_curve"
2024-06-11T18:30:22.373813534+08:00 stderr F attrs {
2024-06-11T18:30:22.373824934+08:00 stderr F   ss: "id"
2024-06-11T18:30:22.373836034+08:00 stderr F }
2024-06-11T18:30:22.373847157+08:00 stderr F attrs {
2024-06-11T18:30:22.373858282+08:00 stderr F   ss: "id2"
2024-06-11T18:30:22.37386938+08:00 stderr F }
2024-06-11T18:30:22.373880443+08:00 stderr F attrs {
2024-06-11T18:30:22.373891623+08:00 stderr F   s: "PROTOCOL_KKRT"
2024-06-11T18:30:22.373902694+08:00 stderr F }
2024-06-11T18:30:22.373914471+08:00 stderr F attrs {
2024-06-11T18:30:22.373925667+08:00 stderr F   b: true
2024-06-11T18:30:22.373936799+08:00 stderr F }
2024-06-11T18:30:22.373947895+08:00 stderr F attrs {
2024-06-11T18:30:22.373958935+08:00 stderr F   s: "no"
2024-06-11T18:30:22.373969935+08:00 stderr F }
2024-06-11T18:30:22.373981016+08:00 stderr F attrs {
2024-06-11T18:30:22.373992153+08:00 stderr F   b: true
2024-06-11T18:30:22.374003089+08:00 stderr F }
2024-06-11T18:30:22.374014054+08:00 stderr F attrs {
2024-06-11T18:30:22.374025109+08:00 stderr F   is_na: true
2024-06-11T18:30:22.37405176+08:00 stderr F }
2024-06-11T18:30:22.374064425+08:00 stderr F attrs {
2024-06-11T18:30:22.374075818+08:00 stderr F   s: "CURVE_SM2"
2024-06-11T18:30:22.374086949+08:00 stderr F }
2024-06-11T18:30:22.374098264+08:00 stderr F inputs {
2024-06-11T18:30:22.374109667+08:00 stderr F   name: "breast_new2"
2024-06-11T18:30:22.374142928+08:00 stderr F   type: "sf.table.individual"
2024-06-11T18:30:22.374155418+08:00 stderr F   meta {
2024-06-11T18:30:22.374167288+08:00 stderr F     type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable"
2024-06-11T18:30:22.374179791+08:00 stderr F     value: "\n\221\003\022\002id\022\021compactness-error\022\017concavity-error\022\024concave-points-error\022\016symmetry-error\022\027fractal-dimension-error\022\014worst-radius\022\rworst-texture\022\017worst-perimeter\022\nworst-area\022\020worst-smoothness\022\021worst-compactness\022\017worst-concavity\022\024worst-concave-points\022\016worst-symmetry\022\027worst-fractal-dimension\022\006target*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\003int\020\377\377\377\377\377\377\377\377\377\001"
2024-06-11T18:30:22.374191799+08:00 stderr F   }
2024-06-11T18:30:22.374202999+08:00 stderr F   data_refs {
2024-06-11T18:30:22.374214144+08:00 stderr F     uri: "breast_new2_590923962.csv"
2024-06-11T18:30:22.374225187+08:00 stderr F     party: "alice"
2024-06-11T18:30:22.374328268+08:00 stderr F     format: "csv"
2024-06-11T18:30:22.374349533+08:00 stderr F   }
2024-06-11T18:30:22.374361031+08:00 stderr F }
2024-06-11T18:30:22.374372016+08:00 stderr F inputs {
2024-06-11T18:30:22.374383092+08:00 stderr F   name: "breast_new1"
2024-06-11T18:30:22.374394424+08:00 stderr F   type: "sf.table.individual"
2024-06-11T18:30:22.374405627+08:00 stderr F   meta {
2024-06-11T18:30:22.374417065+08:00 stderr F     type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable"
2024-06-11T18:30:22.374478427+08:00 stderr F     value: "\n\344\002\022\003id2\022\013mean-radius\022\014mean-texture\022\016mean-perimeter\022\tmean-area\022\017mean-smoothness\022\020mean-compactness\022\016mean-concavity\022\023mean-concave-points\022\rmean-symmetry\022\026mean-fractal-dimension\022\014radius-error\022\rtexture-error\022\017perimeter-error\022\narea-error\022\020smoothness-error*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float\020\377\377\377\377\377\377\377\377\377\001"
2024-06-11T18:30:22.374510683+08:00 stderr F   }
2024-06-11T18:30:22.374522973+08:00 stderr F   data_refs {
2024-06-11T18:30:22.374534471+08:00 stderr F     uri: "breast_new1_1450367590.csv"
2024-06-11T18:30:22.374814428+08:00 stderr F     party: "bob"
2024-06-11T18:30:22.374832389+08:00 stderr F     format: "csv"
2024-06-11T18:30:22.374844124+08:00 stderr F   }
2024-06-11T18:30:22.374855364+08:00 stderr F }
2024-06-11T18:30:22.37486733+08:00 stderr F output_uris: "mgey-jjhodnxr-node-35-output-0"
2024-06-11T18:30:22.374879055+08:00 stderr F checkpoint_uri: "ckmgey-jjhodnxr-node-35-output-0"
2024-06-11T18:30:22.374890605+08:00 stderr F  failed, error <The actor died unexpectedly before finishing this task.
2024-06-11T18:30:22.374901706+08:00 stderr F 	class_name: SPURuntime
2024-06-11T18:30:22.374912831+08:00 stderr F 	actor_id: 6b8ff9d344e3fb1584fa53a701000000
2024-06-11T18:30:22.374923894+08:00 stderr F 	pid: 728
2024-06-11T18:30:22.374950424+08:00 stderr F 	namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80
2024-06-11T18:30:22.374965572+08:00 stderr F 	ip: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:22.374980873+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.>
2024-06-11T18:30:22.375034672+08:00 stderr F 2024-06-11 10:30:22.373 INFO api.py:342 [alice] -- [Anonymous_job] Shutdowning rayfed intendedly...
2024-06-11T18:30:22.37504974+08:00 stderr F 2024-06-11 10:30:22.373 INFO api.py:356 [alice] -- [Anonymous_job] No wait for data sending.
2024-06-11T18:30:22.376752291+08:00 stderr F 2024-06-11 10:30:22.376 INFO message_queue.py:72 [alice] -- [Anonymous_job] Notify message polling thread[DataSendingQueueThread] to exit.
2024-06-11T18:30:22.376974735+08:00 stderr F 2024-06-11 10:30:22.376 INFO message_queue.py:72 [alice] -- [Anonymous_job] Notify message polling thread[ErrorSendingQueueThread] to exit.
2024-06-11T18:30:22.3770069+08:00 stderr F 2024-06-11 10:30:22.376 INFO api.py:384 [alice] -- [Anonymous_job] Shutdowned rayfed.
2024-06-11T18:30:22.384107368+08:00 stderr F 2024-06-11 10:30:22.382 WARNING cleanup.py:154 [alice] -- [Anonymous_job] Failed to send ObjectRef(359ec6ce30d3ca2d29217c23c8b16b38f62aba790100000001000000) to bob, error: [36mray::SenderReceiverProxyActor.send()[39m (pid=660, ip=mgey-jjhodnxr-node-35-0-global.alice.svc, actor_id=29217c23c8b16b38f62aba7901000000, repr=<fed.proxy.barriers.SenderReceiverProxyActor object at 0x7f9f17e47640>)
2024-06-11T18:30:22.384150637+08:00 stderr F   At least one of the input arguments for this task could not be computed:
2024-06-11T18:30:22.384166213+08:00 stderr F ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
2024-06-11T18:30:22.384180557+08:00 stderr F 	class_name: SPURuntime
2024-06-11T18:30:22.384194198+08:00 stderr F 	actor_id: 6b8ff9d344e3fb1584fa53a701000000
2024-06-11T18:30:22.384207118+08:00 stderr F 	pid: 728
2024-06-11T18:30:22.384220075+08:00 stderr F 	namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80
2024-06-11T18:30:22.384232815+08:00 stderr F 	ip: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:22.384246624+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.,upstream_seq_id: 10#0, downstream_seq_id: 12.
2024-06-11T18:30:22.384757133+08:00 stderr F 2024-06-11 10:30:22.383 INFO cleanup.py:161 [alice] -- [Anonymous_job] Sending error The actor died unexpectedly before finishing this task.
2024-06-11T18:30:22.384779701+08:00 stderr F 	class_name: SPURuntime
2024-06-11T18:30:22.384794497+08:00 stderr F 	actor_id: 6b8ff9d344e3fb1584fa53a701000000
2024-06-11T18:30:22.384824194+08:00 stderr F 	pid: 728
2024-06-11T18:30:22.384838268+08:00 stderr F 	namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80
2024-06-11T18:30:22.384857401+08:00 stderr F 	ip: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:22.384880026+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors. to bob.
2024-06-11T18:30:22.389288694+08:00 stderr F Exception in thread DataSendingQueueThread:
2024-06-11T18:30:22.38932184+08:00 stderr F Traceback (most recent call last):
2024-06-11T18:30:22.389335686+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/fed/cleanup.py", line 152, in _process_data_sending_task_return
2024-06-11T18:30:22.389966263+08:00 stderr F     res = ray.get(obj_ref)
2024-06-11T18:30:22.389990974+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
2024-06-11T18:30:22.390478777+08:00 stderr F     return fn(*args, **kwargs)
2024-06-11T18:30:22.390500668+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
2024-06-11T18:30:22.390836932+08:00 stderr F     return func(*args, **kwargs)
2024-06-11T18:30:22.390855982+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 2624, in get
2024-06-11T18:30:22.392126042+08:00 stderr F     raise value.as_instanceof_cause()
2024-06-11T18:30:22.392492683+08:00 stderr F ray.exceptions.RayTaskError(RayActorError): [36mray::SenderReceiverProxyActor.send()[39m (pid=660, ip=mgey-jjhodnxr-node-35-0-global.alice.svc, actor_id=29217c23c8b16b38f62aba7901000000, repr=<fed.proxy.barriers.SenderReceiverProxyActor object at 0x7f9f17e47640>)
2024-06-11T18:30:22.392521924+08:00 stderr F   At least one of the input arguments for this task could not be computed:
2024-06-11T18:30:22.39253773+08:00 stderr F ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
2024-06-11T18:30:22.392551768+08:00 stderr F 	class_name: SPURuntime
2024-06-11T18:30:22.392564953+08:00 stderr F 	actor_id: 6b8ff9d344e3fb1584fa53a701000000
2024-06-11T18:30:22.392578086+08:00 stderr F 	pid: 728
2024-06-11T18:30:22.392591034+08:00 stderr F 	namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80
2024-06-11T18:30:22.392603726+08:00 stderr F 	ip: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:22.392617324+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2024-06-11T18:30:22.392630099+08:00 stderr F 
2024-06-11T18:30:22.392643053+08:00 stderr F During handling of the above exception, another exception occurred:
2024-06-11T18:30:22.392655356+08:00 stderr F 
2024-06-11T18:30:22.392667903+08:00 stderr F Traceback (most recent call last):
2024-06-11T18:30:22.392680596+08:00 stderr F   File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
2024-06-11T18:30:22.39297746+08:00 stderr F     self.run()
2024-06-11T18:30:22.392999276+08:00 stderr F   File "/usr/local/lib/python3.10/threading.py", line 953, in run
2024-06-11T18:30:22.39341552+08:00 stderr F     self._target(*self._args, **self._kwargs)
2024-06-11T18:30:22.393433408+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/fed/_private/message_queue.py", line 51, in _loop
2024-06-11T18:30:22.393591242+08:00 stderr F     res = self._msg_handler(message)
2024-06-11T18:30:22.39360833+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/fed/cleanup.py", line 47, in <lambda>
2024-06-11T18:30:22.393729328+08:00 stderr F     lambda msg: self._process_data_sending_task_return(msg),
2024-06-11T18:30:22.393745566+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/fed/cleanup.py", line 166, in _process_data_sending_task_return
2024-06-11T18:30:22.394030904+08:00 stderr F     send(
2024-06-11T18:30:22.394048257+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/fed/proxy/barriers.py", line 502, in send
2024-06-11T18:30:22.394313962+08:00 stderr F     get_global_context().get_cleanup_manager().push_to_sending(
2024-06-11T18:30:22.394848841+08:00 stderr F AttributeError: 'NoneType' object has no attribute 'get_cleanup_manager'
2024-06-11T18:30:23.027373223+08:00 stderr F Traceback (most recent call last):
2024-06-11T18:30:23.027412424+08:00 stderr F   File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2024-06-11T18:30:23.028257897+08:00 stderr F     return _run_code(code, main_globals, None,
2024-06-11T18:30:23.028292061+08:00 stderr F   File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
2024-06-11T18:30:23.028710432+08:00 stderr F     exec(code, run_globals)
2024-06-11T18:30:23.028732686+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/kuscia/entry.py", line 547, in <module>
2024-06-11T18:30:23.029492636+08:00 stderr F     main()
2024-06-11T18:30:23.029516684+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
2024-06-11T18:30:23.030401661+08:00 stderr F     return self.main(*args, **kwargs)
2024-06-11T18:30:23.030426092+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
2024-06-11T18:30:23.031112705+08:00 stderr F     rv = self.invoke(ctx)
2024-06-11T18:30:23.031163446+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
2024-06-11T18:30:23.03206022+08:00 stderr F     return ctx.invoke(self.callback, **ctx.params)
2024-06-11T18:30:23.032131042+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
2024-06-11T18:30:23.034130743+08:00 stderr F     return __callback(*args, **kwargs)
2024-06-11T18:30:23.034152994+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/kuscia/entry.py", line 527, in main
2024-06-11T18:30:23.034167818+08:00 stderr F     res = comp_eval(sf_node_eval_param, storage_config, sf_cluster_config)
2024-06-11T18:30:23.034183254+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/component/entry.py", line 166, in comp_eval
2024-06-11T18:30:23.034197909+08:00 stderr F     res = comp.eval(
2024-06-11T18:30:23.034211892+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/component/component.py", line 1131, in eval
2024-06-11T18:30:23.034225956+08:00 stderr F     raise e from None
2024-06-11T18:30:23.034239893+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/component/component.py", line 1126, in eval
2024-06-11T18:30:23.035161218+08:00 stderr F     ret = self.__eval_callback(ctx=ctx, **kwargs)
2024-06-11T18:30:23.035203472+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/component/preprocessing/data_prep/psi.py", line 371, in two_party_balanced_psi_eval_fn
2024-06-11T18:30:23.035813893+08:00 stderr F     report = spu.psi(
2024-06-11T18:30:23.035861924+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/spu.py", line 2097, in psi
2024-06-11T18:30:23.037136402+08:00 stderr F     return dispatch(
2024-06-11T18:30:23.0371656+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/register.py", line 111, in dispatch
2024-06-11T18:30:23.037611992+08:00 stderr F     return _registrar.dispatch(self.device_type, name, self, *args, **kwargs)
2024-06-11T18:30:23.037657786+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/register.py", line 80, in dispatch
2024-06-11T18:30:23.038128176+08:00 stderr F     return self._ops[device_type][name](*args, **kwargs)
2024-06-11T18:30:23.038152447+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/device/kernels/spu.py", line 615, in psi
2024-06-11T18:30:23.03889247+08:00 stderr F     return sfd.get(res)
2024-06-11T18:30:23.038916451+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/secretflow/distributed/primitive.py", line 156, in get
2024-06-11T18:30:23.03952265+08:00 stderr F     return fed.get(object_refs)
2024-06-11T18:30:23.04026802+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/fed/api.py", line 621, in get
2024-06-11T18:30:23.040323982+08:00 stderr F     values = ray.get(ray_refs)
2024-06-11T18:30:23.040341076+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
2024-06-11T18:30:23.040490836+08:00 stderr F     return fn(*args, **kwargs)
2024-06-11T18:30:23.040540665+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
2024-06-11T18:30:23.041472187+08:00 stderr F     return func(*args, **kwargs)
2024-06-11T18:30:23.041498258+08:00 stderr F   File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 2626, in get
2024-06-11T18:30:23.042555552+08:00 stderr F     raise value
2024-06-11T18:30:23.042608214+08:00 stderr F ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
2024-06-11T18:30:23.042626101+08:00 stderr F 	class_name: SPURuntime
2024-06-11T18:30:23.042640465+08:00 stderr F 	actor_id: 6b8ff9d344e3fb1584fa53a701000000
2024-06-11T18:30:23.042654505+08:00 stderr F 	pid: 728
2024-06-11T18:30:23.042804129+08:00 stderr F 	namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80
2024-06-11T18:30:23.042828124+08:00 stderr F 	ip: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:23.042866871+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
2024-06-11T18:30:23.044350245+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.391] [info] [launch.cc:119] PSI config: {"protocol_config":{"protocol":"PROTOCOL_KKRT","role":"ROLE_RECEIVER","broadcast_result":true},"input_config":{"type":"IO_TYPE_FILE_CSV","path":"/home/kuscia/var/storage/data/breast_new2_590923962.csv"},"output_config":{"type":"IO_TYPE_FILE_CSV","path":"/home/kuscia/var/storage/data/mgey-jjhodnxr-node-35-output-0"},"keys":["id"],"skip_duplicates_check":true,"left_side":"ROLE_RECEIVER"}
2024-06-11T18:30:23.044378596+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.391] [info] [receiver.cc:37] [KkrtPsiReceiver::Init] start
2024-06-11T18:30:23.044393529+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.391] [info] [interface.cc:78] [AbstractPsiParty::Init] start
2024-06-11T18:30:23.044407927+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.397] [info] [interface.cc:136] [AbstractPsiParty::Init][Check csv pre-process] start
2024-06-11T18:30:23.04442234+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.401] [info] [interface.cc:145] [AbstractPsiParty::Init][Check csv pre-process] end
2024-06-11T18:30:23.044464964+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.407] [info] [interface.cc:183] [AbstractPsiParty::Init] end
2024-06-11T18:30:23.044480279+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.407] [info] [receiver.cc:42] [KkrtPsiReceiver::Init] end
2024-06-11T18:30:23.044494139+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.407] [info] [receiver.cc:47] [KkrtPsiReceiver::PreProcess] start
2024-06-11T18:30:23.044509146+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.408] [info] [bucket_psi.cc:514] psi protocol=2, rank=0 item_size=569
2024-06-11T18:30:23.04452343+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.408] [info] [bucket_psi.cc:514] psi protocol=2, rank=1 item_size=569
2024-06-11T18:30:23.044538246+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.412] [info] [arrow_csv_batch_provider.cc:75] Reach the end of csv file /home/kuscia/var/storage/data/breast_new2_590923962.csv.
2024-06-11T18:30:23.044552107+08:00 stdout F [36m(SPURuntime(device_id=None, party=alice) pid=728)[0m [2024-06-11 10:30:21.412] [info] [arrow_csv_batch_provider.cc:75] Reach the end of csv file /home/kuscia/var/storage/data/breast_new2_590923962.csv.
2024-06-11T18:30:23.044567371+08:00 stdout F [33m(raylet)[0m A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff6b8ff9d344e3fb1584fa53a701000000 Worker ID: b81e69dc5c23a765d1f5cc5d06e2c59a9cedc33be29ebb1466e66f27 Node ID: c62d983dc71ae67ccb66a649c2e65e1f30b1c3196521987b6ce325b6 Worker IP address: mgey-jjhodnxr-node-35-0-global.alice.svc Worker port: 10014 Worker PID: 728 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.
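
Note on reading the log above: Ray's generic "potential root causes" text mentions the OOM killer, but the crash banner earlier in the same log reports SIGILL (illegal instruction), not SIGKILL. A minimal sketch for pulling the signal name out of such a log follows. The helper name `fatal_signals` and the shortened sample lines are illustrative assumptions, not part of secretflow or Ray.

```python
import re

# Matches the crash banner format seen in the log above, e.g.
# "*** SIGILL received at time=1718101821 on cpu 3 ***"
SIGNAL_RE = re.compile(r"\*\*\* (SIG[A-Z]+) received at time=\d+")


def fatal_signals(log_text):
    """Return the distinct signal names found in crash banners, in first-seen order."""
    seen = []
    for match in SIGNAL_RE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Shortened, hypothetical excerpt in the same shape as the log above.
sample = (
    "stderr F (SPURuntime pid=728) *** SIGILL received at time=1718101821 on cpu 3 ***\n"
    "stderr F (SPURuntime pid=728) PC: @ 0x7f2373fdfa32 (unknown) yacl::AvxTranspose128()\n"
    "stderr F (SPURuntime pid=728) logging.cc:361: *** SIGILL received at time=1718101821 on cpu 3 ***\n"
)

print(fatal_signals(sample))
```

If the only signal reported is SIGILL, raising the container memory limit is unlikely to help; an OOM kill would normally show up as SIGKILL with no stack trace from the worker itself.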

gxcuit avatar Jun 12 '24 04:06 gxcuit

Hi @gxcuit, could you share the data files after desensitizing them?

lq0404510 avatar Jun 12 '24 10:06 lq0404510

Hi @gxcuit, could you share the data files after desensitizing them?

Hi, @lq0404510 Thanks for your reply

breast_new1.csv breast_new2.csv

Could the problem be that my machine is too old?

Host: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, hypervisor: VMware ESXi, 6.7.0, 8169922

2024-06-11T18:30:21.439034746+08:00 stderr F �[33m(raylet)�[0m [2024-06-11 10:30:17,901 I 728 728] logging.cc:230: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1
2024-06-11T18:30:21.553161584+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.553204936+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m PC: @     0x7f2373fdfa32  (unknown)  yacl::AvxTranspose128()
2024-06-11T18:30:21.553220273+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f25dc012ce0  (unknown)  (unknown)
2024-06-11T18:30:21.553266017+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f2373f399ad       9536  yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.666765888+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f2373f34029        464  psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.666827943+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f2373d60ab0       1536  psi::RunPsi()
2024-06-11T18:30:21.666851651+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f2373d56205        384  psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.666873831+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f2373d56483        176  pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.666925956+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @     0x7f2373d3843d        736  pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66694947+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @           0x4fc697  (unknown)  cfunction_call
2024-06-11T18:30:21.666971214+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m     @ ... and at least 1 more frames
2024-06-11T18:30:21.666991882+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.667012095+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: PC: @     0x7f2373fdfa32  (unknown)  yacl::AvxTranspose128()
2024-06-11T18:30:21.667052926+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,644 E 728 728] logging.cc:361:     @     0x7f25dc012ce0  (unknown)  (unknown)
2024-06-11T18:30:21.667076384+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373f399ad       9536  yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.66709696+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373f34029        464  psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.667151046+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d60ab0       1536  psi::RunPsi()
2024-06-11T18:30:21.667214121+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d56205        384  psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.667341457+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d56483        176  pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.667410986+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d3843d        736  pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66743628+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @           0x4fc697  (unknown)  cfunction_call
2024-06-11T18:30:21.667457034+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @ ... and at least 1 more frames
2024-06-11T18:30:21.667478061+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m Fatal Python error: Illegal instruction
2024-06-11T18:30:21.667499926+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m 
2024-06-11T18:30:21.667520369+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m Stack (most recent call first):
2024-06-11T18:30:21.667573281+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/spu/psi.py", line 118 in psi
2024-06-11T18:30:21.667596458+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/spu.py", line 1379 in psi
2024-06-11T18:30:21.667616909+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 467 in _resume_span
2024-06-11T18:30:21.667636415+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/function_manager.py", line 726 in actor_method_executor
2024-06-11T18:30:21.667676736+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 847 in main_loop
2024-06-11T18:30:21.667698507+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/workers/default_worker.py", line 282 in <module>
2024-06-11T18:30:21.667746345+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m 
2024-06-11T18:30:21.668545347+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m Extension modules: msgpack._cmsgpack, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, setproctitle, yaml._yaml, charset_normalizer.md, requests.packages.charset_normalizer.md, requests.packages.chardet.md, ray._raylet, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, jaxlib.cpu_feature_guard, grpc._cython.cygrpc, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pyarrow.lib, pyarrow._hdfsio, pandas._libs.ops, pyarrow._compute, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.tslibs.strptime, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, 
scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.__check_build._check_build, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, 
scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, pyarrow._json (total: 182)
2024-06-11T18:30:22.373640364+08:00 stderr F 2024-06-11 10:30:22.373 ERROR component.py:1129 [alice] -- [Anonymous_job] eval on domain: "data_prep"
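
The raylet message lists an OOM kill as only one of several possible causes, while the stderr output above reports SIGILL (illegal instruction) inside yacl::AvxTranspose128(). The following is a minimal sketch for pulling such fatal-signal lines out of a saved log so that they are not mistaken for an OOM. The function name `find_fatal_signals` and the two regular expressions are illustrative assumptions based on the line format seen in this log, not part of secretflow, Ray, or yacl.

```python
import re

# Patterns modelled on the crash lines in the log above (an assumption about
# the format; other builds may print these lines differently).
SIGNAL_RE = re.compile(r"\*\*\* (SIG[A-Z0-9]+) received at time=\d+ on cpu \d+ \*\*\*")
PC_RE = re.compile(r"PC: @\s+0x[0-9a-f]+\s+\S+\s+(.+?)\s*$")


def find_fatal_signals(log_text):
    """Return unique (signal, faulting_symbol) pairs in order of appearance.

    A pair is recorded only when a 'PC:' line follows the signal line.
    """
    results = []
    current_signal = None
    for line in log_text.splitlines():
        signal_match = SIGNAL_RE.search(line)
        if signal_match:
            current_signal = signal_match.group(1)
            continue
        pc_match = PC_RE.search(line)
        if pc_match and current_signal is not None:
            pair = (current_signal, pc_match.group(1))
            if pair not in results:  # the log prints each crash line twice
                results.append(pair)
            current_signal = None
    return results
```

Applied to the stderr text above, a call such as `find_fatal_signals(open("stderr.log").read())` (the file name is hypothetical) would surface the SIGILL together with the symbol at the faulting program counter, which points at an unsupported CPU instruction rather than memory pressure.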

gxcuit avatar Jun 13 '24 01:06 gxcuit

@gxcuit Could you run this command and check the CPU information: cat /proc/cpuinfo | grep avx

lq0404510 avatar Jun 13 '24 02:06 lq0404510

@gxcuit Could you run this command and check the CPU information: cat /proc/cpuinfo | grep avx

Hi, @lq0404510 Thanks for your reply.

I ran it, and AVX is present. However, when I used FourQ earlier, it reported "FourQ requires AVX2 instruction"; switching to a different curve worked.

cat /proc/cpuinfo | grep avx
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm pti ibrs ibpb stibp tsc_adjust arat arch_capabilities
(the identical flags line is printed once for every logical CPU)
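
Because `grep avx` matches any line containing the substring "avx", it cannot tell AVX apart from AVX2. A minimal sketch, assuming a Linux `/proc/cpuinfo` layout like the output above, that checks each flag as a whole token; the helper names `cpu_flags` and `simd_support` are hypothetical:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(cpuinfo_text, wanted=("avx", "avx2")):
    """Map each wanted flag name to True/False using exact token matching."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


# Usage on a Linux machine:
#     with open("/proc/cpuinfo") as f:
#         print(simd_support(f.read()))
```

For the flags line shown above, this check would report AVX as present and AVX2 as absent, which is consistent with the earlier "FourQ requires AVX2 instruction" message.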

gxcuit avatar Jun 13 '24 05:06 gxcuit

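Is the server running Docker a virtual machine created on Windows? If it runs in a virtual machine, first check whether the physical host supports AVX and AVX2. If it does, enable CPU virtualization in the virtual machine's configuration and then restart the virtual machine.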

lq0404510 avatar Jun 13 '24 07:06 lq0404510

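Is the server running Docker a virtual machine created on Windows? If it runs in a virtual machine, first check whether the physical host supports AVX and AVX2. If it does, enable CPU virtualization in the virtual machine's configuration and then restart the virtual machine.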

Hi, the Docker server is a Linux virtual machine running on ESXi: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz, hypervisor VMware ESXi, 6.7.0, 8169922.

The virtual machine has AVX but not AVX2.

I will try a different physical machine.

gxcuit avatar Jun 13 '24 07:06 gxcuit

After switching to a different physical machine, does it run normally now?

lq0404510 avatar Jun 14 '24 07:06 lq0404510

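After switching to a different physical machine, does it run normally now?

Sorry, I have not had a chance to try it yet. I will report back here once I have. Thanks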

gxcuit avatar Jun 14 '24 10:06 gxcuit

After switching to a different physical machine, does it run normally now?

Hi, @lq0404510

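After switching to a different physical machine, the problem is gone. It is still rather strange, though, because the previous machine does support AVX.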

gxcuit avatar Jun 17 '24 05:06 gxcuit

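Thank you for your reply. Could you share the CPU information of the physical machine you switched to (cat /proc/cpuinfo | grep avx)? We are also still analyzing your question and will keep you updated if we find anything further.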

lq0404510 avatar Jun 17 '24 08:06 lq0404510