xgboost
[JVM-packages] XGBoostModel training failed with Rabit returns with exit code 1
I encountered the following problem when using xgboost4j-spark 1.6.1, Spark 3.2.1, and Scala 2.12.
Can someone help me out?
spark parameters:
--conf spark.sql.shuffle.partitions=200 \
--conf spark.executor.instances=8 \
--conf spark.driver.memory=4g
xgboost parameters: "num_workers"->8, "nthread"->1
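For reference, a parameter map like the one above can be sketched as follows. This is only a sketch: `num_workers` and `nthread` are taken from this report, while the `objective` and `eta` entries are illustrative placeholders, not the settings actually used here.

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor

// Sketch of the parameter map described above. Only num_workers and
// nthread come from this report; the other entries are placeholders.
val xgbParam = Map(
  "objective"   -> "reg:squarederror", // placeholder
  "eta"         -> 0.1,                // placeholder
  "num_workers" -> 8, // one XGBoost worker per Spark executor in this setup
  "nthread"     -> 1  // threads per worker
)

val xgbReg = new XGBoostRegressor(xgbParam)
  .setFeaturesCol("features")
  .setLabelCol("label")
```

Distributed training needs all `num_workers` tasks to be running at the same time, so with a Rabit failure like this it is usually worth checking that `num_workers` does not exceed the task slots the cluster can actually provide at once.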
error:

2022-06-17 06:10:10 [task-result-getter-0] INFO [YarnClusterScheduler:57]: Removed TaskSet 2.0, whose tasks have all completed, from pool
2022-06-17 06:10:10 [dag-scheduler-event-loop] INFO [DAGScheduler:57]: ResultStage 2 (collect at XGBoost.scala:431) finished in 2871.246 s
2022-06-17 06:10:10 [dag-scheduler-event-loop] INFO [DAGScheduler:57]: Job 1 is finished. Cancelling potential speculative or zombie tasks for this job
2022-06-17 06:10:10 [dag-scheduler-event-loop] INFO [YarnClusterScheduler:57]: Killing all running tasks in stage 2: Stage finished
2022-06-17 06:10:10 [Driver] INFO [DAGScheduler:57]: Job 1 finished: collect at XGBoost.scala:431, took 3022.842636 s
2022-06-17 06:10:10 [Driver] INFO [RabitTracker:213]: Tracker Process ends with exit code 1
2022-06-17 06:10:10 [Driver] INFO [XGBoostSpark:433]: Rabit returns with exit code 1
2022-06-17 06:10:11 [Driver] ERROR [XGBoostSpark:455]: the job was aborted due to ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed.
    at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:435)
    at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:190)
    at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:37)
    at org.apache.spark.ml.Predictor.fit(Predictor.scala:151)
    at SparkPi$.main(SparkPi.scala:52)
    at SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
2022-06-17 06:10:11 [Driver] ERROR [ApplicationMaster:94]: User class threw exception: ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed.
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed.
    at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:435)
    at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:190)
    at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:37)
    at org.apache.spark.ml.Predictor.fit(Predictor.scala:151)
    at SparkPi$.main(SparkPi.scala:52)
    at SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
2022-06-17 06:10:11 [Driver] INFO [ApplicationMaster:57]: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed.
    at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:435)
    at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:190)
    at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:37)
    at org.apache.spark.ml.Predictor.fit(Predictor.scala:151)
    at SparkPi$.main(SparkPi.scala:52)
    at SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
Hi, could you please post the logs from the executors?
Training failed in local mode as well.
output log:
[15:50:46] [690] train-rmse:0.00520688644065732
[15:50:46] [691] train-rmse:0.00520692697203341
[15:50:46] [692] train-rmse:0.00520696681770753
[15:50:46] [693] train-rmse:0.00520700622236835
[15:50:46] [694] train-rmse:0.00520704498998811
[15:50:46] [695] train-rmse:0.00520708323702370
[15:50:47] [696] train-rmse:0.00520712108582036
[15:50:47] [697] train-rmse:0.00520715829172192
[15:50:47] [698] train-rmse:0.00520719506255524
[15:50:47] [699] train-rmse:0.00520723134969795
22/06/17 15:50:47 ERROR XGBoostSpark: the job was aborted due to
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed.
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:435)
at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:190)
at ml.dmlc.xgboost4j.scala.spark.XGBoostRegressor.train(XGBoostRegressor.scala:37)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:151)
at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.
executors log:

2022-06-17 14:04:32 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Got assigned task 92
2022-06-17 14:04:32 [Executor task launch worker for task 60.0 in stage 1.0 (TID 92)] INFO [Executor:57]: Running task 60.0 in stage 1.0 (TID 92)
2022-06-17 14:04:33 [Executor task launch worker for task 60.0 in stage 1.0 (TID 92)] INFO [FileScanRDD:57]: Reading File path: /train, range: 3976977000-4043259950, partition values: [empty row]
2022-06-17 14:04:53 [Executor task launch worker for task 60.0 in stage 1.0 (TID 92)] INFO [Executor:57]: Finished task 60.0 in stage 1.0 (TID 92). 1759 bytes result sent to driver
2022-06-17 14:04:54 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Got assigned task 102
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [Executor:57]: Running task 6.0 in stage 2.0 (TID 102)
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MapOutputTrackerWorker:57]: Updating epoch to 1 and clearing cache
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [TorrentBroadcast:57]: Started reading broadcast variable 4 with 1 pieces (estimated total size 4.0 MiB)
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [TransportClientFactory:310]: Successfully created connection to after 1 ms (0 ms spent in bootstraps)
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MemoryStore:57]: Block broadcast_4_piece0 stored as bytes in memory (estimated size 5.2 KiB, free 2.8 GiB)
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [TorrentBroadcast:57]: Reading broadcast variable 4 took 21 ms
2022-06-17 14:04:54 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MemoryStore:57]: Block broadcast_4 stored as values in memory (estimated size 9.4 KiB, free 2.8 GiB)
2022-06-17 14:04:55 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MemoryStore:57]: Block rdd_20_6 stored as values in memory (estimated size 4.8 MiB, free 2.8 GiB)
2022-06-17 14:04:56 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MapOutputTrackerWorker:57]: Don't have map outputs for shuffle 0, fetching them
2022-06-17 14:04:56 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MapOutputTrackerWorker:57]: Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://MapOutputTracker@)
2022-06-17 14:04:56 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [MapOutputTrackerWorker:57]: Got the map output locations
2022-06-17 14:04:56 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [ShuffleBlockFetcherIterator:57]: Getting 64 (255.7 MiB) non-empty blocks including 8 (32.0 MiB) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 56 (223.7 MiB) remote blocks
2022-06-17 14:04:56 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [TransportClientFactory:310]: Successfully created connection to after 11 ms (0 ms spent in bootstraps)
2022-06-17 14:04:56 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [ShuffleBlockFetcherIterator:57]: Started 5 remote fetches in 117 ms
2022-06-17 14:05:14 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [TransportClientFactory:310]: Successfully created connection to after 3 ms (0 ms spent in bootstraps)
2022-06-17 14:05:26 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [TransportClientFactory:310]: Successfully created connection to after 2 ms (0 ms spent in bootstraps)
2022-06-17 14:56:12 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [Executor:57]: 1 block locks were not released by task 6.0 in stage 2.0 (TID 102) [rdd_20_6]
2022-06-17 14:56:12 [Executor task launch worker for task 6.0 in stage 2.0 (TID 102)] INFO [Executor:57]: Finished task 6.0 in stage 2.0 (TID 102). 1177 bytes result sent to driver
2022-06-17 15:08:23 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver commanded a shutdown
2022-06-17 15:08:24 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver from disconnected during shutdown
2022-06-17 15:08:24 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver from disconnected during shutdown
2022-06-17 15:08:24 [CoarseGrainedExecutorBackend-stop-executor] INFO [MemoryStore:57]: MemoryStore cleared
2022-06-17 15:08:24 [CoarseGrainedExecutorBackend-stop-executor] INFO [BlockManager:57]: BlockManager stopped
2022-06-17 15:08:24 [CoarseGrainedExecutorBackend-stop-executor] INFO [TSDBReporter:222]: stop sending
2022-06-17 15:08:24 [Thread-2] INFO [ShutdownHookManager:57]: Shutdown hook called
Simplified code:

```scala
val xgbReg = new XGBoostRegressor(xgbParam)
  .setFeaturesCol("features")
  .setLabelCol("label")
val xgbInput = spark.read.format("libsvm").load("adata")
val xgbModel = xgbReg.fit(xgbInput)
val results = xgbModel.transform(xgbInput)
xgbModel.nativeBooster.saveModel(fs_out)
```
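One sanity check that may help with a setup like this (a sketch only; `xgbInput` and `xgbReg` are assumed from the snippet above, and `numWorkers = 8` mirrors the reported `num_workers`): distributed training requires all `num_workers` tasks to be scheduled simultaneously, so confirming the input partitioning and the available task slots before calling `fit` can narrow down Rabit failures.

```scala
// Sketch: pre-fit sanity checks, reusing xgbInput / xgbReg from the
// snippet above. num_workers must not exceed the task slots the
// cluster can run at the same time, or workers never all come up.
val numWorkers = 8

// How the input is currently laid out.
println(s"input partitions: ${xgbInput.rdd.getNumPartitions}")

// Align the data layout with the number of XGBoost workers up front
// (xgboost4j-spark will otherwise repartition internally).
val repartitioned = xgbInput.repartition(numWorkers)
val xgbModel = xgbReg.fit(repartitioned)
```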
@trivialfis could you please help me out with this problem?
@wbo4958 Could you please take a look when you are available?
Sorry for the late response, @Jasonzjj. From the executor log, it looks like the driver sent the shutdown:
Driver commanded a shutdown
2022-06-17 15:08:24 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver from disconnected during shutdown
Could you please double-check? If that executor log is the correct one, please post the executor log that contains the exceptions.
@wbo4958 Thanks for the response. I found the executor log with exceptions like this:
2022-06-28 15:21:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [Executor:57]: Running task 4.0 in stage 2.0 (TID 90)
2022-06-28 15:21:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MapOutputTrackerWorker:57]: Updating epoch to 1 and clearing cache
2022-06-28 15:21:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [TorrentBroadcast:57]: Started reading broadcast variable 4 with 1 pieces (estimated total size 4.0 MiB)
2022-06-28 15:21:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MemoryStore:57]: Block broadcast_4_piece0 stored as bytes in memory (estimated size 5.2 KiB, free 2.8 GiB)
2022-06-28 15:21:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [TorrentBroadcast:57]: Reading broadcast variable 4 took 39 ms
2022-06-28 15:21:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MemoryStore:57]: Block broadcast_4 stored as values in memory (estimated size 9.4 KiB, free 2.8 GiB)
2022-06-28 15:21:03 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MemoryStore:57]: Block rdd_20_4 stored as values in memory (estimated size 4.3 MiB, free 2.8 GiB)
2022-06-28 15:21:03 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MapOutputTrackerWorker:57]: Don't have map outputs for shuffle 0, fetching them
2022-06-28 15:21:03 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MapOutputTrackerWorker:57]: Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://MapOutputTracker@)
2022-06-28 15:21:03 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [MapOutputTrackerWorker:57]: Got the map output locations
2022-06-28 15:21:03 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [ShuffleBlockFetcherIterator:57]: Getting 56 (246.5 MiB) non-empty blocks including 7 (30.8 MiB) local and 0 (0.0 B) host-local and 0 (0.0 B) push-merged-local and 49 (215.7 MiB) remote blocks
2022-06-28 15:21:04 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [ShuffleBlockFetcherIterator:57]: Started 3 remote fetches in 197 ms
2022-06-28 16:14:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [Executor:57]: 1 block locks were not released by task 4.0 in stage 2.0 (TID 90) [rdd_20_4]
2022-06-28 16:14:01 [Executor task launch worker for task 4.0 in stage 2.0 (TID 90)] INFO [Executor:57]: Finished task 4.0 in stage 2.0 (TID 90). 1177 bytes result sent to driver
2022-06-28 16:19:46 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver commanded a shutdown
2022-06-28 16:19:47 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver from disconnected during shutdown
2022-06-28 16:19:47 [dispatcher-Executor] INFO [YarnCoarseGrainedExecutorBackend:57]: Driver from disconnected during shutdown
2022-06-28 16:19:53 [netty-rpc-connection-1] INFO [TransportClientFactory:206]: Found inactive connection to , creating a new one.
2022-06-28 16:19:53 [executor-heartbeater] WARN [Executor:90]: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:103)
    at org.apache.spark.executor.Executor.reportHeartBeat(Executor.scala:1037)
    at org.apache.spark.executor.Executor.$anonfun$heartbeater$1(Executor.scala:212)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2048)
    at org.apache.spark.Heartbeater$$anon$1.run(Heartbeater.scala:46)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Failed to connect to xxx.com
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:288)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:204)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:202)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:198)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: xxx.com
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
2022-06-28 16:19:53 [CoarseGrainedExecutorBackend-stop-executor] WARN [TSDBSender:129]: encounter exception in TSDBSender:
java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:706)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
    at inf.TSDBSender$1.run(TSDBSender.java:89)
    at inf.TSDBSender$1.run(TSDBSender.java:80)
    at inf.TSDBSender$TSDBSenderAction.runWithCheck(TSDBSender.java:117)
    at inf.TSDBSender$TSDBSenderAction.runWithRetries(TSDBSender.java:125)
    at inf.TSDBSender.send(TSDBSender.java:98)
    at inf.TSDBReporter.sendToActualDataView(TSDBReporter.java:149)
    at inf.TSDBReporter.report(TSDBReporter.java:78)
    at com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:253)
    at org.apache.spark.metrics.sink.TSDBSink.report(TSDBSink.scala:134)
    at org.apache.spark.metrics.MetricsSystem.$anonfun$report$1(MetricsSystem.scala:118)
    at org.apache.spark.metrics.MetricsSystem.$anonfun$report$1$adapted(MetricsSystem.scala:118)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at org.apache.spark.metrics.MetricsSystem.report(MetricsSystem.scala:118)
    at org.apache.spark.executor.Executor.stop(Executor.scala:320)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1$$anon$1.run(CoarseGrainedExecutorBackend.scala:213)
2022-06-28 16:19:53 [CoarseGrainedExecutorBackend-stop-executor] INFO [TSDBSender:131]: Retrying operation in TSDBSender. Retry no.1
2022-06-28 16:19:57 [CoarseGrainedExecutorBackend-stop-executor] INFO [MemoryStore:57]: MemoryStore cleared
2022-06-28 16:19:57 [CoarseGrainedExecutorBackend-stop-executor] INFO [BlockManager:57]: BlockManager stopped
2022-06-28 16:20:01 [CoarseGrainedExecutorBackend-stop-executor] INFO [TSDBReporter:222]: stop sending
2022-06-28 16:20:01 [Thread-2] INFO [ShutdownHookManager:57]: Shutdown hook called
In the end, did you solve the problem?
Got the same error running on EMR, but it runs fine locally on a Mac.
same error
same error:
sparkVersion = '3.2.1'
implementation('ml.dmlc:xgboost4j_2.12:1.6.2')
implementation('ml.dmlc:xgboost4j-spark_2.12:1.6.2')
implementation('org.scala-lang:scala-library:2.12.15')
ml.dmlc.xgboost4j.java.XGBoostError: XGBoostModel training failed.
at ml.dmlc.xgboost4j.scala.spark.XGBoost$.trainDistributed(XGBoost.scala:435)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:196)
at ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier.train(XGBoostClassifier.scala:35)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:151)
at org.apache.spark.ml.Predictor.fit(Predictor.scala:115)
at org.apache.spark.ml.Pipeline.$anonfun$fit$5(Pipeline.scala:151)
at org.apache.spark.ml.MLEvents.withFitEvent(events.scala:130)
at org.apache.spark.ml.MLEvents.withFitEvent$(events.scala:123)
at org.apache.spark.ml.util.Instrumentation.withFitEvent(Instrumentation.scala:42)
at org.apache.spark.ml.Pipeline.$anonfun$fit$4(Pipeline.scala:151)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at org.apache.spark.ml.Pipeline.$anonfun$fit$2(Pipeline.scala:147)
at org.apache.spark.ml.MLEvents.withFitEvent(events.scala:130)
at org.apache.spark.ml.MLEvents.withFitEvent$(events.scala:123)
at org.apache.spark.ml.util.Instrumentation.withFitEvent(Instrumentation.scala:42)
at org.apache.spark.ml.Pipeline.$anonfun$fit$1(Pipeline.scala:133)
at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191)
at org.apache.spark.ml.Pipeline.fit(Pipeline.scala:133)
at org.apache.spark.ml.Pipeline.fit(Pipeline.scala:93)
at org.apache.spark.ml.Estimator.fit(Estimator.scala:59)
at org.apache.spark.ml.tuning.CrossValidator.$anonfun$fit$7(CrossValidator.scala:174)
at scala.runtime.java8.JFunction0$mcD$sp.apply(JFunction0$mcD$sp.java:23)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
at scala.concurrent.impl.ExecutionContextImpl$$anon$4.execute(ExecutionContextImpl.scala:138)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
at scala.concurrent.impl.Promise$KeptPromise$Kept.onComplete(Promise.scala:372)
at scala.concurrent.impl.Promise$KeptPromise$Kept.onComplete$(Promise.scala:371)
at scala.concurrent.impl.Promise$KeptPromise$Successful.onComplete(Promise.scala:379)
at scala.concurrent.impl.Promise.transform(Promise.scala:33)
at scala.concurrent.impl.Promise.transform$(Promise.scala:31)
at scala.concurrent.impl.Promise$KeptPromise$Successful.transform(Promise.scala:379)
at scala.concurrent.Future.map(Future.scala:292)
at scala.concurrent.Future.map$(Future.scala:292)
at scala.concurrent.impl.Promise$KeptPromise$Successful.map(Promise.scala:379)
at scala.concurrent.Future$.apply(Future.scala:659)
at org.apache.spark.ml.tuning.CrossValidator.$anonfun$fit$6(CrossValidator.scala:182)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
at org.apache.spark.ml.tuning.CrossValidator.$anonfun$fit$4(CrossValidator.scala:172)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
at org.apache.spark.ml.tuning.CrossValidator.$anonfun$fit$1(CrossValidator.scala:166)
at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191)
at org.apache.spark.ml.tuning.CrossValidator.fit(CrossValidator.scala:137)
Could you help check the Python tracker log? The Python tracker requires Python 3.8+.