lakeFS
LakeFS Java SDK compatibility with Spark
The lakeFS Java SDK depends on okhttp 4.x, but the Apache Spark Docker image ships with okhttp 3.x. I've seen this specifically with the Spark 3.5.2 image, using lakeFS SDK 1.48.0.
In this environment, branch creation calls fail with the following error:
Exception in thread "main" java.lang.NoSuchMethodError: 'okhttp3.RequestBody okhttp3.RequestBody.create(java.lang.String, okhttp3.MediaType)'
    at io.lakefs.clients.sdk.ApiClient.serialize(ApiClient.java:955)
    at io.lakefs.clients.sdk.ApiClient.buildRequest(ApiClient.java:1211)
    at io.lakefs.clients.sdk.ApiClient.buildCall(ApiClient.java:1160)
    at io.lakefs.clients.sdk.BranchesApi.createBranchCall(BranchesApi.java:332)
    at io.lakefs.clients.sdk.BranchesApi.createBranchValidateBeforeCall(BranchesApi.java:347)
    at io.lakefs.clients.sdk.BranchesApi.createBranchWithHttpInfo(BranchesApi.java:353)
    at io.lakefs.clients.sdk.BranchesApi.access$500(BranchesApi.java:46)
    at io.lakefs.clients.sdk.BranchesApi$APIcreateBranchRequest.execute(BranchesApi.java:415)
The lakeFS Hadoop FS addresses this problem by shading the okhttp dependency. The SDK should take the same approach to avoid this incompatibility.
Hi @peter-mcclonski, sorry to hear you're having difficulty with Spark dependencies. As you probably know, this is a common issue with Spark. As you point out, users of Spark generally have to shade dependencies when building their libraries.
IIUC you would like an assembled lakefs-sdk package with all its ok* dependencies shaded? Would instructions on how to do this in your pom.xml suffice? I ask because shading is sometimes nuanced, with different users requiring different shading. For an SDK in particular, an assembled and shaded version will cause difficulties if any internal object is ever moved across library boundaries. So, somewhat paradoxically, instructions on how to do this yourself, while more hassle, can lead to simpler ops down the line.
Hi there, I'd be interested in seeing these instructions. Thanks!
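For reference, here is a minimal sketch of what such pom.xml instructions might look like. It is not taken from the lakeFS docs. It assumes a Maven build that packages the application together with the lakeFS SDK into a single jar using maven-shade-plugin. The `myapp.shaded` prefix and the plugin version are placeholders, and the set of packages to relocate (here `okhttp3` and `okio`) is an assumption that may need adjusting for your dependency tree.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Placeholder version: use whichever release your build standardizes on. -->
      <version>3.5.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <!-- Move the okhttp 4.x classes bundled with the SDK into a private
                   package so they cannot clash with Spark's okhttp 3.x. -->
              <relocation>
                <pattern>okhttp3</pattern>
                <shadedPattern>myapp.shaded.okhttp3</shadedPattern>
              </relocation>
              <!-- okio is okhttp's I/O dependency; relocating it too is an
                   assumption that the two should stay in step. -->
              <relocation>
                <pattern>okio</pattern>
                <shadedPattern>myapp.shaded.okio</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With relocation like this, the SDK's bytecode inside the shaded jar refers to `myapp.shaded.okhttp3.RequestBody` rather than `okhttp3.RequestBody`, so the okhttp 3.x classes on Spark's classpath are no longer picked up for those calls. The caveat from the earlier comment still applies: any okhttp object passed between your shaded jar and unshaded code will no longer have a matching type.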