Re-add Support for Accumulo and HBase

Open jbouffard opened this issue 6 years ago • 1 comments

Overview

Although the docs claim otherwise, GPS does not currently support Accumulo or HBase, because we removed those dependencies from the backend.

Background

Originally, the GPS backend depended on both geotrellis-accumulo and geotrellis-hbase in order to support their respective backends. However, at some point we removed those dependencies, as we thought they weren't actually needed to interact with a given backend. We now know that this is not the case, as anyone trying to access Accumulo or HBase will receive the following error message:

Py4JJavaError: An error occurred while calling None.geopyspark.geotrellis.io.AttributeStoreWrapper.
: java.lang.RuntimeException: Unable to find AttributeStoreProvider for accumulo://user:password@zoo-keeper:2181/instance
	at geotrellis.spark.io.AttributeStore$$anonfun$apply$3.apply(AttributeStore.scala:102)
	at geotrellis.spark.io.AttributeStore$$anonfun$apply$3.apply(AttributeStore.scala:102)
	at scala.Option.getOrElse(Option.scala:121)
	at geotrellis.spark.io.AttributeStore$.apply(AttributeStore.scala:102)
	at geotrellis.spark.io.AttributeStore$.apply(AttributeStore.scala:106)
	at geopyspark.geotrellis.io.AttributeStoreWrapper.<init>(AttributeStoreWrapper.scala:25)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:238)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
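
The trace suggests that the store is resolved from the URI, and that the lookup fails because nothing on the classpath claims the accumulo:// scheme. As a rough illustration only (this is not GeoTrellis code, and the registry contents and provider names below are hypothetical), the failure mode can be sketched in Python as:

```python
from urllib.parse import urlparse

# Hypothetical registry of URI scheme -> provider name. It stands in for
# whatever providers the backend jar actually ships; these names are made up.
BUNDLED_PROVIDERS = {
    "file": "FileProvider",
    "s3": "S3Provider",
    "hdfs": "HadoopProvider",
}


def find_provider(uri, providers=BUNDLED_PROVIDERS):
    """Return the provider registered for the URI's scheme, or raise."""
    scheme = urlparse(uri).scheme
    if scheme not in providers:
        # Mirrors the wording of the error message in the trace above.
        raise RuntimeError(f"Unable to find AttributeStoreProvider for {uri}")
    return providers[scheme]
```

Under this reading, the URI is well-formed and the error goes away only once a provider for the scheme is present in the jar, which is what both solutions below aim at.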

Solutions

There are a few ways we can resolve this issue.

Solution 1: Re-Add the Dependencies

The most straightforward and easiest way to solve this problem would be to re-add the geotrellis-accumulo and geotrellis-hbase dependencies to the backend. While this is the easiest and most surefire way to get support back, it also creates a needlessly large fat jar which is cumbersome and will contain features users don't need/want. This should not be our first choice.

Solution 2: Add Additional Jars that Contain These Dependencies

Another solution would be to create a set of jars for the user to pick from. These jars will have different levels of support: Base GPS, Base GPS + Accumulo, Base GPS + HBase, and Base GPS + Accumulo + HBase. These jars could be downloaded via the GPS CLI:

geopyspark install-base-jar // GPS with no Accumulo/HBase support
geopyspark install-jar // GPS + Accumulo/HBase support
geopyspark install-accumulo-jar // GPS + Accumulo support
geopyspark install-hbase-jar // GPS + HBase support

We can also add new make commands for building the jars:

make build-base // GPS with no Accumulo/HBase support
make build // GPS with Accumulo/HBase support
make build-base-with-accumulo // GPS with Accumulo support
make build-base-with-hbase // GPS with HBase support

The only issue with this solution would be maintaining the separate jars. However, it may be worth the cost, as users get to choose what they want in a straightforward way.
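
The four variants amount to two independent yes/no choices (Accumulo and HBase). A minimal sketch of that mapping, using the make target names proposed above (they are proposals from this issue, not existing targets, and the function name is hypothetical):

```python
# Proposed make targets keyed by (accumulo, hbase) support flags.
# The target names come from the proposal in this issue and may not exist.
MAKE_TARGETS = {
    (False, False): "build-base",
    (True, False): "build-base-with-accumulo",
    (False, True): "build-base-with-hbase",
    (True, True): "build",
}


def make_target(accumulo=False, hbase=False):
    """Pick the proposed make target for the requested backend support."""
    return MAKE_TARGETS[(bool(accumulo), bool(hbase))]
```

For example, make_target(accumulo=True) selects build-base-with-accumulo. Each additional optional backend would double the number of jars to maintain, which is the maintenance cost mentioned above.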

Other Solutions

The two methods above are just two of the ways we can resolve this issue. We should take the time to discuss other possible solutions and their pros/cons here.

jbouffard, Oct 18 '18 13:10

Hello, the current version of geopyspark is 0.4.3. This error still occurs after running geopyspark install-jar. Is there a better solution?

javyxu, Apr 12 '19 02:04