Scan classpath for EDD subclasses/dataset types
Description
This change uses reflection, specifically the ClassGraph classpath scanner, to discover concrete EDD subclasses and associated class metadata (fromXml methods and Sax handlers for XML deseralization) at runtime. The scan is performed once and stored in a static map for use in XML deserialization methods.
In addition to eliminating the need to register various EDD implementations in the ERDDAP codebase in EDD.fromXml and HandlerFactory.getHandlerFor, this change also allows loading of third party EDD implementations at runtime.
The reflection based EDD class discovery is enabled by the ERDDAP setting useEddReflection. If true, the class map built using reflection is used to discover classes/methods for XML deserialization. If false, legacy hardcoded approaches are used.
In local testing classpath scanning took about 1 second. This scan occurs only once at startup. Scanning is confined to EDD's package (gov.noaa.pfel.erddap.dataset) for performance reasons; scanning all packages took around 12 seconds. For this reason, all EDD implementations need to belong to the gov.noaa.pfel.erddap.dataset package to be detected.
This adds a dependency on classgraph, which is very small (~560k).
Type of change
- [x] New feature (non-breaking change which adds functionality)
Checklist before requesting a review
- [x] I have performed a self-review of my code
- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
Need to do some testing tomorrow to verify that third-party EDD class discovery actually works, but here's the code for review.
Ah, I missed the SaxHandler entirely. Looking at that now...
Could we use an annotation instead of looking for a static method? I think that would simplify the check for in the reflection logic. It would also give an indication in the dataset file that the class is referenced by reflection (it can be difficult to differentiate dead code and code only referenced by reflection and I'd like to have some indication the code is not dead).
Also I have a minor refactoring of the sax handlers in this pull that you might want to take a look at: https://github.com/ERDDAP/erddap/pull/205
Before seeing your message I've been working on using an annotation on the EDD classes to indicate their Sax handler, so that's good confirmation. Are you saying we should use annotations for the non-sax parser fromXml approach as well? Basically have a method level @FromXml annotation that gets set on each EDD class' appropriate fromXml method?
Yeah, I'd prefer to indicate code that's only getting called through reflection with an annotation to make that explicit in the code.
Ok, I reimplemented this, adding support for both fromXml and State sax parser discovery using reflections/annotations.
Also, I think it should be fairly straightforward to adapt this to #205, maybe just changing the usage of Class<State> to Class<BaseDatasetHandler> for clarity.
We should probably file an issue to update the GenerateDatasetsXml.java (https://github.com/ERDDAP/erddap/blob/main/WEB-INF/classes/gov/noaa/pfel/erddap/GenerateDatasetsXml.java#L187) with a similar reflection approach. I think that will be quite a bit more complicated due to the varying parameters though. Unless you wanted to tackle that here?
Good call, let's handle GenerateDatasetsXml changes in a separate issue.
@ChrisJohnNOAA One thing I want to check on. With the next release are we going to set a hard requirement on Java 21+? Or continue to support both 17 and 21?
This page strongly suggests using Java 21 but mentions that both 17 and 21 are tested and supported.
https://github.com/ERDDAP/erddap/blob/main/DEPLOY_INSTALL.md
On the changes page there's no mention of Java 21. Java 17 is currently noted as being the required version there.
https://erddap.github.io/changes.html
I ask because some of the class casting used in the new reflections code works in 21 and fails in 17. If we want to continue to support 17 I think I can make the generics unspecific.
This seems to be working well. I made the reflection based discovery configurable with a useEddDiscovery setting, and the Java 17 issues mentioned above seem to be resolved.
@ChrisJohnNOAA One thing I want to check on. With the next release are we going to set a hard requirement on Java 21+? Or continue to support both 17 and 21?
This page strongly suggests using Java 21 but mentions that both 17 and 21 are tested and supported.
https://github.com/ERDDAP/erddap/blob/main/DEPLOY_INSTALL.md
On the changes page there's no mention of Java 21. Java 17 is currently noted as being the required version there.
https://erddap.github.io/changes.html
I ask because some of the class casting used in the new reflections code works in 21 and fails in 17. If we want to continue to support 17 I think I can make the generics unspecific.
I believe last release I upgraded to 21 shortly before the release was done so I had done a lot of testing with both 17 and 21. It'd be good if ERDDAP can continue to work on 17, however in my opinion the most important thing is to be compatible with the latest LTS version (currently 21 which has been available for about a year now).