Server-side virtual data processing
Hello. I was checking Unidata's news and stumbled upon this article. It says that @matakleo implemented server-side virtual data processing in TDS as part of his summer intern project. Any idea when this feature will be available in a main release? I watched his project presentation, where he demonstrated the EnhancementProvider using a classifier. Do you think it will also be possible to create a provider capable of taking two variables (u and v) and returning two more (magnitude and direction)?
I know u/v to mag/dir is a simple calculation, but I have lost count of how many times we had some kind of problem where non-metocean people (e.g. naval architects, subsea engineers) got u/v data directly from our in-house TDS server and applied an incorrect transformation when converting to mag/dir. Thus, I believe a resource capable of offering server-side calculations directly through the APIs (opendap, ncss, wcs) would be very helpful.
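For reference, the conversion in question can be sketched in a few lines. This is a minimal illustration, not TDS or plugin code: the function name uv_to_magdir is hypothetical, and the direction is assumed to be the "going to" bearing, measured clockwise from north. A frequent mistake is calling atan2(v, u), which gives a math angle counterclockwise from east instead.

```python
import math

def uv_to_magdir(u, v):
    """Return (magnitude, direction) for eastward u and northward v.

    Direction is the compass bearing the flow is heading toward,
    in degrees clockwise from north, in [0, 360).
    """
    magnitude = math.hypot(u, v)
    # atan2(u, v), not atan2(v, u): the swapped arguments measure the
    # angle from north (the v axis) instead of from east (the u axis).
    direction = math.degrees(math.atan2(u, v)) % 360.0
    return magnitude, direction

print(uv_to_magdir(1.0, 0.0))   # purely eastward flow
print(uv_to_magdir(0.0, -1.0))  # purely southward flow
```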
Thank you and congratulations for the great work.
Hi @marceloandrioni, very cool that you are interested in using this! The way it currently works, it can apply a transformation to a single variable. So some extra work may be necessary before you could transform two variables into two others.
It is available in the current 5.6-SNAPSHOT which does require JDK 17 and some extra JVM args (see CHRONICLE_CACHE here). We are in the process of some security updates, after which we plan to make another release, and that would also contain this feature. It could be nice if you could start to test with the 5.6-SNAPSHOT, because then we can make adjustments to the EnhancementProvider if you run into any issues.
Hi @tdrwenski, sorry for the late reply. I am glad to know this option is already available in the snapshot. I will try to set up the 5.6-SNAPSHOT + JDK 17 on my side to run some tests and get back to you. Thank you.
Hi @marceloandrioni - I have this implemented now here: https://github.com/haileyajohnson/vectorize-thredds-plugin I'm not sure what the performance is like because I've only tested it on test data, but it's a start at least!
also pinging @matakleo - if you wanna see your project in use :)
Thanks for this @haileyajohnson. I am out of the office at the moment, but I will try this as soon as possible, probably with some ERA5 wind data. I imagine an extra argument will be needed to indicate if the vector direction calculated from U/V should be "reversed" to indicate "coming from", like the wind and wave conventions. Thank you!
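The extra argument suggested here could look something like the sketch below. This is only an assumption about a possible interface, not what the vectorize plugin exposes: the function name direction_from_uv and the coming_from flag are hypothetical.

```python
import math

def direction_from_uv(u, v, coming_from=False):
    """Bearing in degrees clockwise from north, in [0, 360).

    coming_from=False: direction the flow is heading toward (currents).
    coming_from=True: direction the flow originates from (wind, waves).
    """
    going_to = math.degrees(math.atan2(u, v)) % 360.0
    if coming_from:
        # Reverse the bearing by half a turn.
        return (going_to + 180.0) % 360.0
    return going_to

# A wind blowing toward the east (u > 0, v = 0) is a westerly wind.
print(direction_from_uv(5.0, 0.0))                    # going-to bearing
print(direction_from_uv(5.0, 0.0, coming_from=True))  # coming-from bearing
```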
Hi everyone! I've been following this, and I think it's amazing to see it already in use. It's great to know that my contribution can benefit others. Long live TDS and netCDF!
Hello @haileyajohnson, I managed to get a TDS running with the following versions:
- Linux 5.4.0-155-generic
- OpenJDK17U-jdk_x64_linux_hotspot_17.0.13_11
- apache-tomcat-10.1.31
- THREDDS Data Server 5.6 2024-10-16 (beta)
I ran "mvn package" for the vectorize plugin and moved the resulting vectorize-tds-plugin-1.0-SNAPSHOT.jar file to /usr/local/tds/tomcat/webapps/thredds##5.6/WEB-INF/lib.
Then I ran some tests using this netcdf with dims:
- time = 3 ;
- depth = 3 ;
- latitude = 121 ;
- longitude = 169 ;
In the thredds catalog.xml I added the following definitions:
<dataset name="cmems_uv_only"
         ID="cmems_uv_only"
         urlPath="datasets/cmems/cmems_uv_only"
         dataType="Grid">
  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
          location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc"/>
</dataset>
<dataset name="cmems_uv_and_magdir"
         ID="cmems_uv_and_magdir"
         urlPath="datasets/cmems/cmems_uv_and_magdir"
         dataType="Grid">
  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
          location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc">
    <variable name="cspd" shape="time depth latitude longitude" type="float">
      <attribute name="vectorize_mag" value="uo/vo" />
      <attribute name="long_name" value="current speed" />
      <attribute name="units" value="m/s" />
    </variable>
    <variable name="cdir" shape="time depth latitude longitude" type="float">
      <attribute name="vectorize_dir" value="uo/vo" />
      <attribute name="long_name" value="current direction" />
      <attribute name="units" value="degrees" />
    </variable>
  </netcdf>
</dataset>
When I tried to access cmems_uv_and_magdir the first time I got some errors:
Throwable exception handled : jakarta.servlet.ServletException: Handler dispatch failed: java.lang.UnsupportedClassVersionError: org/example/VectorMagnitude$Provider has been compiled by a more recent version of the Java Runtime (class file version 63.0), this version of the Java Runtime only recognizes class file versions up to 61.0 (unable to load class [org.example.VectorMagnitude$Provider])
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1104)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:903)
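The version numbers in this error map directly onto Java releases: for Java 2 and later, the class file major version is the release number plus 44, so 63 is Java 19 and 61 is Java 17. The sketch below reads that number from the header of a compiled class. The helper names are hypothetical, and the header bytes are hand-built for illustration rather than taken from the plugin jar.

```python
import struct

def class_file_major(header: bytes) -> int:
    """Return the major version from the first 8 bytes of a .class file."""
    # Layout: 4-byte magic, 2-byte minor version, 2-byte major version,
    # all big-endian.
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major

def java_release(major: int) -> int:
    """Map a class file major version to a Java release (Java 2 and later)."""
    return major - 44

# Hand-built header: magic, minor version 0, major version 63.
header = struct.pack(">IHH", 0xCAFEBABE, 0, 63)
print(java_release(class_file_major(header)))  # 19
print(java_release(61))                        # 17
```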
I then went back to the vectorize plugin and replaced version 19 with 17 in the source and target in the pom.xml:
<properties>
  <maven.compiler.source>17</maven.compiler.source>
  <maven.compiler.target>17</maven.compiler.target>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
and again moved the resulting vectorize-tds-plugin-1.0-SNAPSHOT.jar file to /usr/local/tds/tomcat/webapps/thredds##5.6/WEB-INF/lib.
With this new file the magnitude and direction appeared in the TDS interface:
But when I tried to get the magnitude and direction values everything showed as zero, despite valid values of u/v.
I am not sure if I am missing some steps. Is it enough to just put the plugin jar file in the WEB-INF/lib folder, or do I need to also declare it as an ioServiceProvider in the threddsConfig.xml config file?
<!--
Configuring the CDM (netcdf-java library)
see https://www.unidata.ucar.edu/software/netcdf-java/reference/RuntimeLoading.html
<nj22Config>
<ioServiceProvider class="edu.univ.ny.stuff.FooFiles"/>
<coordSysBuilder convention="foo" class="test.Foo"/>
<coordTransBuilder name="atmos_ln_sigma_coordinates" type="vertical" class="my.stuff.atmosSigmaLog"/>
<typedDatasetFactory datatype="Point" class="gov.noaa.obscure.file.Flabulate"/>
</nj22Config>
-->
Thank you!
Cool! Thanks for trying it out! I think you need to put values in your magnitude and direction variables. They need to contain just the index of the corresponding u and v (so just 0 to u/v.length).
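One way to picture this: the NcML variable holds only flat positions, and the enhancement replaces each position with a value computed from u and v at that position. The following is a toy model of that idea in plain Python, not the plugin's actual code; the function name and the sample data are made up.

```python
import math

def magnitude_from_indices(indices, u, v):
    """Replace each placeholder index i with hypot(u[i], v[i])."""
    return [math.hypot(u[i], v[i]) for i in indices]

# Hypothetical flattened u and v components.
u = [3.0, 0.0, -6.0]
v = [4.0, 2.0, 8.0]

# The placeholder variable simply counts 0, 1, 2, ... over the data.
indices = list(range(len(u)))
print(magnitude_from_indices(indices, u, v))
```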
Hi @haileyajohnson. I am not sure I got this right. I included the values definition in the variables:
<dataset name="cmems_uv_and_magdir"
         ID="cmems_uv_and_magdir"
         urlPath="datasets/cmems/cmems_uv_and_magdir"
         dataType="Grid">
  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
          location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc">
    <variable name="cspd" shape="time depth latitude longitude" type="float">
      <attribute name="vectorize_mag" value="uo/vo" />
      <values start="0" increment="1" />
      <attribute name="long_name" value="current speed" />
      <attribute name="units" value="m/s" />
    </variable>
    <variable name="cdir" shape="time depth latitude longitude" type="float">
      <attribute name="vectorize_dir" value="uo/vo" />
      <values start="0" increment="1" />
      <attribute name="long_name" value="current direction" />
      <attribute name="units" value="degrees" />
    </variable>
  </netcdf>
</dataset>
But now, after downloading the file using NCSS, the largest value for magnitude and direction shows as 184040, that is, the size of my dataset minus one (3 x 121 x 3 x 169 = 184041).
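A quick check of the arithmetic, using the dimension sizes listed earlier: the grid holds 184041 values, so a counter starting at 0 ends at 184040. That matches the reported maximum, which suggests the values coming back are the raw counter rather than computed magnitudes.

```python
import math

dims = {"time": 3, "depth": 3, "latitude": 121, "longitude": 169}

total = math.prod(dims.values())
print(total)      # 184041 values in each variable
print(total - 1)  # 184040, the last value of a 0-based counter
```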
hmm that looks like it's getting the min/max values from the un-converted values, which would be a bug.... do the values themselves look right?
Running ncdump -v cspd cmems_uv_and_magdir.nc shows the index values instead of the magnitude.
The funny thing is that I also tried a direct access using xarray and then the values were all zero.
Doesn't look like it's working then haha. I'll take a look at it this afternoon, but we should maybe move this discussion to an issue on my repo and let unidata close this one.
No problem. Should I close this and open a new one on vectorize?
Hello. With the vectorize plugin it is now possible to calculate magnitude and direction directly on the server using u and v components. Would it be possible to include this functionality in the main TDS codebase? I believe this option would be really useful to a lot of users. Also, the mag/dir calculation is working fine for data retrieved using NCSS and WMS, but when the data is accessed using the opendap protocol, everything is returned as zero.
Thank you!
I agree, this would be useful to have shipped with the TDS. I am working towards a release of the TDS this week - @haileyajohnson, how big of a lift would it be to get the code in the vectorize plugin repo into a PR?
Should be as easy as copy-pasting into the filters package. I have a few docs contributions sitting in a branch somewhere, I'll take a look at getting it all into a PR this evening.
It does not look like the TDS has its own filters package - is that right, @haileyajohnson? Since I just pushed out a netCDF-Java release, we won't be able to sneak it in there, but I could put it in the TDS, use the SPI to load it, and move it to netCDF-Java for the next release. If it is a simple copy/paste/change package statements job (plus the SPI resource file), I can take care of that.
Also, the mag/dir calculation is working fine for data retrieved using NCSS and WMS, but when the data is accessed using the opendap protocol, everything is returned as zero.
@marceloandrioni I seem to recall that we do not apply any netCDF-Java "enhancements" to datasets served via OPeNDAP by default. We do some things in the code to map from the CDM to the OPeNDAP data model, but that's done outside of applying NetcdfDataset enhancements (like scale/offset, add coordinate systems, etc.). That said, you might be able to modify the NcML to do this - can you try adding the attribute enhance="all" to the netcdf element of the NcML? Something like:
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
enhance="all"
location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc">
Hi @lesserwhirls, I did a test with
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
enhance="all"
location="file:/usr/local/tds/datasets/cmems/cmems_forecast_20210101.nc">
The results are different, but not correct. Without enhance="all", the server-side variables are returned as zero; with enhance="all", they are returned as NaN.
Without enhance="all"
With enhance="all"
Thank you.
Ok, well, dang. It looks like maybe enhancements via NcML are being...weird...I will need to dig into that. Whatever the issue is, it is likely on the netCDF-Java side, so it won't make this release of the TDS (if it turns out to be something I can address in the TDS release, then we're in business and I'll get it in).
I've been digging around and so far what I have found is that when opening the dataset through the opendap service, the values explicitly set on the NcML defined variables cspd and cdir are not being read correctly. These values are being read as 0 or NaN (depending on the value of the enhance attribute). The intended values (0, 1, 2, ...) are set in the VariableDS cache, but caching isn't enabled on the Variable that is ultimately read by the opendap service code, so the read call returns an empty array - it skips checking the cache and tries to read from the underlying netCDF file, which does not have these variables. Since those values (0, 1, 2, 3...) are intended to be used to index into uo and vo to compute the magnitude and direction, and we instead get 0 or NaN, the reads using those values for indexing into the uo and vo variables come back as 0 or NaN, and thus the calculations come back the same.
All of that said, I think there is a bug in the netCDF-Java side using the particular read path that the opendap server is using, so we'll need to track it down over there. Unfortunately that means this won't end up in the upcoming release of the TDS, but I think we'll get it in the SNAPSHOT pretty soon.
Hi @lesserwhirls , thank you for the detailed explanation and for looking into this issue. It’s not a problem at all having the bug fix in the snapshot version. I appreciate your efforts to track it down and address it on the netCDF-Java side. Let me know if there’s anything else I can help with! Maybe some testing after the TDS snapshot release.