fury icon indicating copy to clipboard operation
fury copied to clipboard

[Java] remove guava dependency

Open chaokunyang opened this issue 1 year ago • 12 comments

Is your feature request related to a problem? Please describe.

Fury use guava for generic reflection, cache. But guava is used so widely, it's very likely to version conflict with users code.

Describe the solution you'd like

Remove guava dependency, implement all releated methods by fury self.

Here are the things which may need to be finished to make guava optional:

  • Implement a TypeRef to replace TypeToken in guava, which is work that @nandakumar131 doing before.
  • Remove guava Cache which Fury used for weak key and soft values map
  • Remove guava FinalizableWeakReference which fury used to build a multi-key weak map
  • Remove guava ListeningExecutorService which fury used to compile classes and register callback
  • Remove guava com.google.common.collect usage in the src/main/java dir except org.apache.fury.serializer.collection.GuavaCollectionSerializers
  • Make org.apache.fury.serializer.collection.GuavaCollectionSerializers load stub classes, so even guava jar are not in classpath, teh fury init won't throw exception, like we did in org.apache.fury.serializer.collection.ImmutableCollectionSerializers
  • Add a script to ci to check com.google. are only used in GuavaCollectionSerializers and test classes.

Additional context

Shade guava is an option too, but introduce complexities for fury GuavaSerializers and guava CVE

chaokunyang avatar Nov 20 '23 06:11 chaokunyang

@chaokunyang If you haven't started working on the guava removal part 2, I can take it up and post a PR.

nandakumar131 avatar Dec 22 '23 16:12 nandakumar131

Hi @nandakumar131, thanks for the willingness to contribute to Fury. I haven't start part2 yet, you can take up this part work.

FYI, Currently Fury uses guava TypeToken for generics inspection and use guava cache for some static data cache. These are the biggest dependency on guava

chaokunyang avatar Dec 22 '23 16:12 chaokunyang

Thanks @chaokunyang! If the change becomes huge, I will split it into two PRs to make it easy to review.

nandakumar131 avatar Dec 22 '23 16:12 nandakumar131

@chaokunyang do we still want to support GuavaCollectionSerializers?

nandakumar131 avatar Dec 22 '23 18:12 nandakumar131

@chaokunyang do we still want to support GuavaCollectionSerializers?

I prefer to still support it, but may use a way without source api dependency. For example, we use methodhandle to get all methods for invocation

chaokunyang avatar Dec 22 '23 21:12 chaokunyang

This also helps in reducing JAR file size. Let me know if I can help you here.

Munoon avatar Apr 05 '24 11:04 Munoon

This also helps in reducing JAR file size. Let me know if I can help you here.

@Munoon That would be really great. Actually I am on the work of first release of Fury under asf and don't have enough time for this feature

chaokunyang avatar Apr 05 '24 13:04 chaokunyang

@Munoon That would be really great. Actually I am on the work of first release of Fury under asf and don't have enough time for this feature

Can you please specify which exactly help you need? To complete @nandakumar131 PR or to do some further steps?

Munoon avatar Apr 05 '24 13:04 Munoon

@Munoon That would be really great. Actually I am on the work of first release of Fury under asf and don't have enough time for this feature

Can you please specify which exactly help you need? To complete @nandakumar131 PR or to do some further steps?

Here are the things which may need to be finished to make guava optional:

  • Implement a TypeRef to replace TypeToken in guava, which is work that @nandakumar131 doing before.
  • Remove guava Cache which Fury used for weak key and soft values map
  • Remove guava FinalizableWeakReference which fury used to build a multi-key weak map
  • Remove guava ListeningExecutorService which fury used to compile classes and register callback
  • Remove guava com.google.common.collect usage in the src/main/java dir except org.apache.fury.serializer.collection.GuavaCollectionSerializers
  • Make org.apache.fury.serializer.collection.GuavaCollectionSerializers load stub classes, so even guava jar are not in classpath, teh fury init won't throw exception, like we did in org.apache.fury.serializer.collection.ImmutableCollectionSerializers
  • Add a script to ci to check com.google. are only used in GuavaCollectionSerializers and test classes.

chaokunyang avatar Apr 06 '24 07:04 chaokunyang

OK, I'll separate those by different PRs starting from TypeRef implementation, which looks like the most hard one, so it'll take some time...

Munoon avatar Apr 06 '24 09:04 Munoon

OK, I'll separate those by different PRs starting from TypeRef implementation, which looks like the most hard one, so it'll take some time...

Yep, the TypeRef is the most compilciated one, and the TypeToken in guava has some performance issue, so we did some optimization in org.apache.fury.type.TypeUtils. If we have our own implementation, such helper can be simplified a lot.

chaokunyang avatar Apr 06 '24 10:04 chaokunyang

@chaokunyang, sorry about the delay on #1273. I will not be able to make any progress on the PR in next two weeks, I can take it up after that.

@Munoon, feel free to take over #1273, if you have started working on TypeToken changes.

nandakumar131 avatar Apr 16 '24 02:04 nandakumar131