fluo
Avoid dependency hell
The two scenarios below really bother me.
- User writes an observer that has conflicting dependencies with the Fluo worker that's running the observer.
- User writes two observers that have conflicting dependencies with each other. For example one Observer uses LIB A which depends on guava 18 and another Observer uses LIB C which depends on guava 13.
I am thinking it would be nice to isolate Observer dependencies from each other and from the Fluo worker framework. OSGI may be a way to do this, but I am not sure because I know very little about OSGI. I am going to explore using Apache Felix in the worker to accomplish this dependency isolation. Any other suggestions/solutions would be welcome.
A "dumb" solution would be to encourage a shaded jar with relocated classes. It is very inelegant compared to a solution like OSGI or Felix, and it requires more work to build observers to run in Fluo, but it should work w/o any additional dependencies.
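For anyone who has not used relocation before, here is a rough sketch of what it could look like in an observer's pom.xml using the maven-shade-plugin. The `myobserver.shaded` prefix is made up for illustration, and the plugin version is only an example, so adjust both to your project. Only Guava is relocated here; other conflicting libraries would each need their own `<relocation>` entry.

```xml
<!-- Sketch only: goes under <build><plugins> in the observer's pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Example version; use whatever is current for your build. -->
  <version>3.5.1</version>
  <executions>
    <execution>
      <!-- Build the shaded jar as part of "mvn package". -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Rewrite Guava's packages (and references to them) in the shaded jar. -->
            <pattern>com.google.common</pattern>
            <!-- Hypothetical prefix; pick one unique to the observer. -->
            <shadedPattern>myobserver.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With something like this, the observer jar carries its own renamed copy of Guava, so it should no longer clash with whatever Guava version the worker or another observer puts on the classpath.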
A "dumb" solution would be to encourage a shaded jar with relocated classes.
That's a very interesting solution. I have never used that before. Thanks for pointing that out, that may be a good way to go.
Thanks for pointing that out, that may be a good way to go.
If nothing else, it's not bad to isolate the Fluo dependencies that way. Forcing users to do it themselves is a little heinous, IMO. Hopefully, you can rely on a group of people using a Fluo instance to have some consistency among themselves. You just focus the effort on preventing your own dependencies from causing problems or similar. I'm not sure how to correctly handle that in imports either. I'm sure Google knows if you ask it.
If nothing else, it's not bad to isolate the Fluo dependencies that way.
I suppose that would require building a fat jar for the Fluo worker with relocated impl classes. I am going to research this option to understand it better.
Forcing users to do it themselves is a little heinous, IMO.
How does that compare to forcing users to use OSGI? I am not familiar with either, so I am not sure which is more burdensome. One possibility I was thinking about is that Fluo takes care of it with isolation based on a certain directory structure (i.e. app/observer1/lib/... and app/observer2/lib/..). However, I am not in favor of this option and would rather leverage standards if possible, after my experiences with Accumulo start.
How does that compare to forcing users to use OSGI? I am not familiar with either, so I am not sure which is more burdensome.
I'm not sure either. I just know that people often scoff at learning how to do something in Maven beyond "build a jar" or "build an assembly" :)
While working on a generic export recipe for Fluo, I thought of the following use case that would be nice to support:
- Fluo instance is using an Accumulo 1.8.0 instance.
- A Fluo observer is exporting to a table in an Accumulo 1.6.4 instance.
In this case the Fluo framework needs the Accumulo 1.8.0 client libs and deps, while the observer needs the Accumulo 1.6.4 client libs and deps.
The shaded jar was finished, is this still relevant?
The shaded jar was finished, is this still relevant?
Yeah, it's still relevant. The shading work that was done was only for Thrift. I think we should make it easy to build a shaded version of Fluo from source. I don't think the project should release a shaded version of Fluo to Maven Central. We should just document how a user could do this if they are having problems with dependencies. We can also test the shaded version of Fluo and ensure it works.
One possible way to do this is to add a Maven profile and document how to use it. The user-facing docs would look something like the following.
- Obtain source tar ball from ....
- tar xf ....
- cd fluo-x.y.z
- mvn package -PshadeAll -Daccumulo.version=$AV -Dhadoop.version=$HV -Dzookeeper.version=$ZV
- cp <path to shaded tar> <your des>
We could also look into making the Docker build file use this.
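To make the profile idea a bit more concrete, here is a rough sketch of what a `shadeAll` profile might look like in Fluo's pom.xml. Nothing like this exists yet: the `fluo.shaded` prefix, the choice of which packages to relocate, and the plugin version are all assumptions for illustration. The `-Daccumulo.version`, `-Dhadoop.version` and `-Dzookeeper.version` flags from the steps above would override the matching version properties, assuming the POM defines them under those names.

```xml
<!-- Sketch only: goes under <profiles> in the pom.xml; enabled with -PshadeAll. -->
<profile>
  <id>shadeAll</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <!-- Example version; use whatever the project standardizes on. -->
        <version>3.5.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <!-- Keep the original jar and attach the shaded one under a classifier. -->
              <shadedArtifactAttached>true</shadedArtifactAttached>
              <shadedClassifierName>shaded</shadedClassifierName>
              <relocations>
                <!-- One entry per dependency that should be hidden from observers. -->
                <relocation>
                  <pattern>com.google.common</pattern>
                  <!-- Hypothetical prefix for relocated classes. -->
                  <shadedPattern>fluo.shaded.com.google.common</shadedPattern>
                </relocation>
              </relocations>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

Attaching the shaded jar under a classifier keeps the normal build output unchanged, which fits the goal of not releasing a shaded Fluo to Maven Central while still letting users build one from source.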