Oxen
Custom datasets
Hi there,
I just learned about Oxen, and it looks very promising. I have a bit of an unusual use case and I'm trying to figure out whether Oxen could be the appropriate solution here.
I deal with planetary ephemeris files which store, in binary format, the trajectory of planets and spacecraft for possibly hundreds of years. This format was originally created by JPL in the 1980s -- the specs are here: https://naif.jpl.nasa.gov/pub/naif/toolkit_docs/C/req/daf.html . The main library that reads these files is by NASA itself (through the NAIF division) and is called SPICE ... but I've rewritten it in full in Rust (because the original code is FORTRAN transliterated into C and absolutely not thread safe). This rewrite is called ANISE -- https://github.com/nyx-space/anise.
Every year, NASA releases a new and improved prediction of where the planets will be in the future -- https://naif.jpl.nasa.gov/pub/naif/generic_kernels/spk/planets/. Every day, NASA also releases the Earth orientation parameters, which specify how the Earth is actually aligned with respect to the stars (crazily enough, we can't predict it very well) -- https://naif.jpl.nasa.gov/pub/naif/generic_kernels/pck/ (specifically the earth_latest_high_prec.bpc file). These files aren't typically big (ranging from single-digit MB to a few hundred MB).
In spacecraft operations, we need to ensure that the whole team of flight dynamics engineers uses either the latest data (for some computations), or a specific agreed-upon version of these data. The way I've solved this in ANISE is by having a "MetaFile" structure which is pretty simple and stores the URL to the file and optionally its CRC32, so that it can be redownloaded if the CRC is unspecified or if the CRC does not match the local copy (config file example: https://github.com/nyx-space/anise/blob/master/data/latest.dhall ; basic docs: https://docs.rs/anise/latest/anise/almanac/metaload/struct.MetaFile.html ). Another related use case is that we need to publish new datasets, namely a new ephemeris file whenever we compute a new trajectory. In this case, we have consumers of this data in other teams who need to be sure that they're using the latest version of all our data prior to whatever work they're up to. At the moment, we use Kedro to organize our workflows but also to version our data on AWS S3. It works perfectly fine, but Oxen's visualization of datasets is very appealing (especially as most people in this industry are not tech savvy).
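To make the MetaFile check concrete, here is a minimal, std-only sketch of the decision described above: redownload when there is no local copy, when no CRC is specified, or when the CRC does not match. `MetaFileSketch`, `needs_download`, and the hand-rolled `crc32` are hypothetical names for illustration only; this is not ANISE's actual code, and I'm assuming the standard IEEE CRC-32 variant here.

```rust
/// Hypothetical stand-in for the "MetaFile" idea: a URL plus an optional CRC32.
/// This is NOT the real ANISE struct, only an illustration of the check.
pub struct MetaFileSketch {
    pub uri: String,
    pub crc32: Option<u32>,
}

/// Bitwise CRC-32 (IEEE variant, reflected polynomial 0xEDB88320).
/// Assumption: a real implementation would use a table-driven or SIMD crate.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            crc = if crc & 1 == 1 {
                (crc >> 1) ^ 0xEDB8_8320
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

impl MetaFileSketch {
    /// Returns true when the file should be (re)downloaded:
    /// no local copy, no CRC specified, or a CRC mismatch.
    pub fn needs_download(&self, local_copy: Option<&[u8]>) -> bool {
        match (self.crc32, local_copy) {
            // Pinned CRC and a local copy: download only on mismatch.
            (Some(expected), Some(bytes)) => crc32(bytes) != expected,
            // Unspecified CRC ("always latest") or missing file: download.
            _ => true,
        }
    }
}
```

With this shape, a "latest" entry simply leaves `crc32` as `None` and is always refetched, while a pinned entry is fetched once and then reused as long as the local bytes still hash to the agreed-upon value.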
In other words, versioning of datasets is crucial. Oxen solves the versioning of datasets and provides visualizations. But Oxen doesn't support NASA's DAF format (and it's such a niche case that it probably should not). Hence my questions:
- Is it possible to have extensions to Oxen so that I could upload and visualize in the Oxen web UI the difference between two ephemeris files (even better if I could plot specific things with them, but that's probably a huge stretch)?
- ~Is it possible to upload arbitrary blobs of data on Oxen?~ (This already works!) If so, I could use ANISE to build a "companion" delivery for new ephemeris files that we deliver and users could diff the companion version. Interestingly, Oxen tries to parse this as text.
- ~Is it possible to download these datasets with generic URLs that include the version, e.g. similar to how AWS S3 can have a unique link? In my experience, it's hard enough to convince the IT teams in my industry to install updates to Python, so I can't imagine the trials and tribulations to convince them to install a new binary on operational machines, but if all that's needed is curl/wget with a token, that would be fine (especially if it runs a local deployment of Oxen (which you could/should charge for in my view)).~ (This already works too!)
- Would it be possible to separate the Oxen crate into a workspace so that the CLI and lib don't depend on actix since that's a server req? (I'd be more than happy to work on that myself).
That's all my questions for now, and again, this is a very exciting project, so I'll be keeping a close eye on it regardless.
Thanks