
dbcUpload/Deploy fails with NoSuchElementException: cannot find node with id <id>

justinmills opened this issue on May 1, 2017 · 5 comments

I think this is the same issue reported in #35, but I may have narrowed down when it happens.

If I attach a jar to a cluster and then delete that jar, the cluster still has the jar attached, but in a "deleted pending restart" state. If you then attempt to upload or deploy the jar, you get the following error:

org.apache.http.client.HttpResponseException: NoSuchElementException: cannot find node with id 377539161864868
	at sbtdatabricks.DatabricksHttp.handleResponse(DatabricksHttp.scala:80)
	at sbtdatabricks.DatabricksHttp.fetchLibraries(DatabricksHttp.scala:132)
	at sbtdatabricks.DatabricksPlugin$$anonfun$dbcFetchLibraries$1.apply(DatabricksPlugin.scala:74)
	at sbtdatabricks.DatabricksPlugin$$anonfun$dbcFetchLibraries$1.apply(DatabricksPlugin.scala:73)
	at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
	at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
	at sbt.std.Transform$$anon$4.work(System.scala:63)
	at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
	at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
	at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
	at sbt.Execute.work(Execute.scala:235)
	at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
	at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
	at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
	at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

It can be fixed by restarting the cluster. I suspect what's going on is that the plugin looks for an existing version of the jar and finds one, but that copy is marked as deleted and only exists because a cluster still has it loaded. Once the last cluster using the jar is restarted, the jar is actually removed and the plugin no longer finds an existing copy.
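The suspected lookup behavior can be sketched as follows. This is illustrative Python, not the plugin's actual Scala; the field names (`name`, `id`, `pending_delete`) are assumptions modeling the "deleted pending restart" state described above:

```python
# Illustrative model of the suspected bug: a jar deleted from the
# workspace can still appear in the library list while a cluster holds
# it in a "deleted pending restart" state, so a naive name match
# returns a stale entry whose id no longer resolves server-side.

def find_existing_jar(libraries, name):
    """Naive lookup: matches any library with the right name,
    including ones that only linger until the cluster restarts."""
    return next((lib for lib in libraries if lib["name"] == name), None)

def find_active_jar(libraries, name):
    """Safer lookup: skip entries pending deletion, so a stale id is
    never handed to the subsequent delete/upload calls."""
    return next(
        (lib for lib in libraries
         if lib["name"] == name and not lib.get("pending_delete", False)),
        None,
    )

libs = [{"name": "test_2.11-1.0.5-SNAPSHOT.jar",
         "id": 377539161864868,
         "pending_delete": True}]

stale = find_existing_jar(libs, "test_2.11-1.0.5-SNAPSHOT.jar")
active = find_active_jar(libs, "test_2.11-1.0.5-SNAPSHOT.jar")
```

With the naive lookup, `stale` is the lingering entry whose id triggers the `NoSuchElementException`; with the filtered lookup, `active` is `None`, which would route the plugin to a fresh upload instead.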

— justinmills · May 1, 2017

When we restart a cluster and then deploy again, we get the following error:

sbt dbcDeploy ...
[info] Cluster found. Starting deploy process...
Deleting older version of test_2.11-1.0.5-SNAPSHOT.jar
Uploading test_2.11-1.0.5-SNAPSHOT.jar
org.apache.http.client.HttpResponseException: Exception: The directory already contains an element called 'test_2.11-1.0.5-SNAPSHOT.jar'
	at sbtdatabricks.DatabricksHttp.handleResponse(DatabricksHttp.scala:80)
	at sbtdatabricks.DatabricksHttp.uploadJar(DatabricksHttp.scala:108)
	at sbtdatabricks.DatabricksPlugin$$anonfun$8.apply(DatabricksPlugin.scala:151)
	at sbtdatabricks.DatabricksPlugin$$anonfun$8.apply(DatabricksPlugin.scala:150)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
	at scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:153)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
	at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:47)
	at scala.collection.SetLike$class.map(SetLike.scala:93)
	at scala.collection.AbstractSet.map(Set.scala:47)
	at sbtdatabricks.DatabricksPlugin$.sbtdatabricks$DatabricksPlugin$$uploadImpl1(DatabricksPlugin.scala:150)
	at sbtdatabricks.DatabricksPlugin$$anonfun$deployImpl$1$$anonfun$apply$6$$anonfun$apply$7.apply(DatabricksPlugin.scala:194)
	at sbtdatabricks.DatabricksPlugin$$anonfun$deployImpl$1$$anonfun$apply$6$$anonfun$apply$7.apply(DatabricksPlugin.scala:192)
	at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
	at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
	at sbt.std.Transform$$anon$4.work(System.scala:63)
	at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
	at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:228)
	at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
	at sbt.Execute.work(Execute.scala:237)
	at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
	at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:228)
	at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
	at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
[error] (*:dbcDeploy) org.apache.http.client.HttpResponseException: Exception: The directory already contains an element called 'test_2.11-1.0.5-SNAPSHOT.jar'
[error] Total time: 15 s, completed May 1, 2017 4:50:02 PM

— arushijain · May 2, 2017

@arushijain Does your dbcLibraryPath end with a / by any chance? If so, could you try without it?
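The trailing-slash question makes sense if the plugin builds the target path by simple string concatenation. Here is a hypothetical sketch of that failure mode (the join rule is an assumption, not the plugin's verified behavior): with a trailing `/` in `dbcLibraryPath`, the path checked for an existing jar can differ from the path the upload actually created, so the duplicate goes undetected.

```python
# Hypothetical: naive concatenation of dbcLibraryPath and the jar name.
# A trailing "/" produces a double slash, so two spellings of the same
# setting can yield two different-looking paths.

def library_target(dbc_library_path: str, jar_name: str) -> str:
    return dbc_library_path + "/" + jar_name

def library_target_normalized(dbc_library_path: str, jar_name: str) -> str:
    # Stripping any trailing slash first makes both spellings agree.
    return dbc_library_path.rstrip("/") + "/" + jar_name

naive = library_target("/Shared/Libraries/", "test_2.11-1.0.5-SNAPSHOT.jar")
fixed = library_target_normalized("/Shared/Libraries/", "test_2.11-1.0.5-SNAPSHOT.jar")
```

`naive` ends up with a `//` in the middle, while `fixed` matches the path produced when the setting has no trailing slash.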

— brkyvz · May 2, 2017

@brkyvz no, it doesn't. It is just /Shared/Libraries

— arushijain · May 2, 2017

@arushijain I have not seen that error, even after restarting the cluster. Did you get that error after getting the error I mentioned above?

Also, our dbcLibraryPath does not have a trailing /.

— justinmills · May 2, 2017

@justinmills So when I deploy to a cluster that already has a jar attached, it fails while deleting the previous jar. When I go and check which libraries are attached in Databricks, I see the following.

(screenshot of the Databricks library list, taken May 2, 2017 at 12:48 PM)

Then I restart the cluster, the jar is finally deleted, and at that point I can deploy again. Highly unsustainable.
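The manual workaround boils down to restarting the cluster so libraries stuck in the "deleted pending restart" state are cleared. A minimal sketch of the REST call behind that step, composed but not sent; the endpoint path assumes the current Databricks REST API 2.0 (the plugin in this thread targeted an older API), and the host and cluster id are placeholders:

```python
import json

def restart_request(host: str, cluster_id: str) -> dict:
    """Compose (but do not send) the cluster-restart request that
    clears libraries pending deletion. Endpoint name assumes the
    Databricks REST API 2.0 Clusters API."""
    return {
        "method": "POST",
        "url": f"{host}/api/2.0/clusters/restart",
        "body": json.dumps({"cluster_id": cluster_id}),
    }

# Placeholder values for illustration only.
req = restart_request("https://example.cloud.databricks.com",
                      "0501-123456-abcd123")
```

Sending `req` with any HTTP client (plus an authorization header) would perform the restart; after the cluster comes back, the stale jar entry is gone and `dbcDeploy` can run cleanly.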

— arushijain · May 2, 2017