accumulo icon indicating copy to clipboard operation
accumulo copied to clipboard

Refactor FATE

Open milleruntime opened this issue 3 years ago • 12 comments

I was looking at the warnings we suppress in ZooStore and they all seem to stem from the serialization of the generics. I was wondering if it would be better to refactor out the generic types so we don't have to arbitrarily serialize objects and blindly cast them. This would be a bit of work but it might not be too bad. I think if we create an interface like FateEnvironment to pass to the operations we wouldn't have to pass the Manager around.

milleruntime avatar Feb 08 '22 23:02 milleruntime

Maybe this should consider https://github.com/apache/accumulo/issues/1249

EdColeman avatar Feb 08 '22 23:02 EdColeman

If this was for 3.0, we could probably drop the old Bulk Import. That would help clean up a lot of code.

milleruntime avatar Feb 08 '22 23:02 milleruntime

One idea I had was to create an Environment interface to pass between FATE and the Manager. The tricky part is that some of the FATE code is in core (the interface, framework and some implementation) while most of it is in the same package as the Manger. I thought we could do this:

public interface RepoEnv {
  public ServerContext getContext();
}

But that won't work because ServerContext is in server and Fate is in core.

milleruntime avatar Feb 09 '22 00:02 milleruntime

On serialization, Java 17 has the new record keyword. It should be safer to serialize records than general Java objects, as we've done in the past. I have not yet used records, and we aren't using Java 17 yet... but could for 3.0 development.

ctubbsii avatar Feb 09 '22 01:02 ctubbsii

It should be safer to serialize records than general Java objects

Looking around for what we could do now to make things safer, found the following. Maybe we could use the new ObjectInputFilter added in Java 9 to explicitly limit what classes FATE can deserialize.

https://www.oracle.com/java/technologies/javase/seccodeguide.html#8 https://docs.oracle.com/javase/9/docs/api/java/io/ObjectInputFilter.html

keith-turner avatar Feb 09 '22 14:02 keith-turner

Another improvement that could be included in this change: Convert FateTxId to a stronger type to prevent coding errors like forgetting to call the method to display the value in HEX. The code would probably be 10x better if we had a FateTxId type that we used instead of long. Its toString method could consistently format for logging maybe. That could possible encapsulate or extend UUID, but I think we would benefit form a more specific type than UUID or long.

milleruntime avatar Feb 14 '22 19:02 milleruntime

For serialization of FATE repos, we could use JSON. There is already this in the FateCommand: https://github.com/apache/accumulo/blob/0fbf9cda0d1e16d67bf714970cb22dfa2907e50f/shell/src/main/java/org/apache/accumulo/shell/commands/fate/FateCommand.java#L85-L90

milleruntime avatar Feb 23 '22 19:02 milleruntime

@ctubbsii @keith-turner I have been experimenting with JSON serialization in FATE. There are already some classes to serialize the data for displaying with the Fate command. What do you guys think of trying to change the serialization to JSON in 2.1?

milleruntime avatar Feb 28 '22 14:02 milleruntime

@ctubbsii @keith-turner I have been experimenting with JSON serialization in FATE. There are already some classes to serialize the data for displaying with the Fate command. What do you guys think of trying to change the serialization to JSON in 2.1?

I think that would probably be too big and too disruptive for 2.1 as we're trying to wrap up major features for 2.1. But, I'm not opposed to the idea, generally. However, I still think it's worth looking into newer Java serialization for record objects in 3.0.

ctubbsii avatar Feb 28 '22 19:02 ctubbsii

@ctubbsii @keith-turner I have been experimenting with JSON serialization in FATE. There are already some classes to serialize the data for displaying with the Fate command. What do you guys think of trying to change the serialization to JSON in 2.1?

I think that would probably be too big and too disruptive for 2.1

After some experimentation, that was my feeling as well. The generics combined with the complexity of the different interfaces really make it difficult to make any small changes.

milleruntime avatar Feb 28 '22 19:02 milleruntime