intake-esm

Add write support for parquet

Open charles-turner-1 opened this issue 4 months ago • 4 comments

Change Summary

  • In #728, I added support for reading .parquet files, but completely forgot to add support for serialising catalogs back to them. This PR fixes that.
  • Cleans up an issue with available compression associated with that PR.
  • Tests that we can deserialise and reserialise catalogs (from {.csv, .parquet} => {.csv, .parquet}).
  • Adds write_kwargs, à la read_kwargs in #728, plus a soft deprecation of the to_csv_kwargs keyword argument when serialising.
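
For readers unfamiliar with the pattern, a soft deprecation like the one described for to_csv_kwargs might look roughly like the sketch below. This is not the code from this PR: `resolve_write_kwargs` is a hypothetical helper name, and the choice to raise when both arguments are given is an assumption made for illustration.

```python
import warnings


def resolve_write_kwargs(write_kwargs=None, to_csv_kwargs=None):
    """Hypothetical helper: fold the deprecated ``to_csv_kwargs`` into ``write_kwargs``."""
    if to_csv_kwargs is not None:
        if write_kwargs is not None:
            # Assumption: passing both is treated as ambiguous and rejected.
            raise ValueError("Pass only one of write_kwargs and to_csv_kwargs")
        # Soft deprecation: warn, but keep honouring the old argument.
        warnings.warn(
            "to_csv_kwargs is deprecated; use write_kwargs instead",
            DeprecationWarning,
            stacklevel=2,
        )
        write_kwargs = to_csv_kwargs
    # Return a copy so callers cannot mutate the caller's dict by accident.
    return dict(write_kwargs or {})
```

Existing callers that still pass to_csv_kwargs keep working and see a DeprecationWarning, while new callers pass write_kwargs and see none.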

N.B.: The deserialisation and reserialisation tests round-trip twice, because it looks like the defaults for how empty catalog field options (in catalog.json) are handled have changed since that catalog was created.
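
To illustrate why a second round trip helps, here is a minimal stdlib-only sketch. It is not the test from this PR: the field names and defaults are made up, and `reserialise` is a hypothetical stand-in for loading and re-saving a catalog. The first pass rewrites an old file using the current defaults, so only the second pass can be compared for exact equality.

```python
import json

# Hypothetical defaults; the real catalog.json fields and defaults may differ.
DEFAULTS = {"field_a": [], "field_b": []}


def reserialise(text):
    """Load a JSON document, fill empty or missing fields with defaults, dump it again."""
    data = json.loads(text)
    for key, default in DEFAULTS.items():
        if data.get(key) is None:
            data[key] = default
    return json.dumps(data, sort_keys=True)


# An "old" file written under different conventions for empty fields.
old = json.dumps({"id": "example", "field_a": None})

first = reserialise(old)     # normalises the old file to the current defaults
second = reserialise(first)  # should now be a fixed point
```

Here `first` differs from `old`, but `second` equals `first`, which is the comparison a round-trip test can safely assert on.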

Checklist

  • [x] Unit tests for the changes exist
  • [x] Tests pass on CI
  • [ ] Documentation reflects the changes where applicable
    • [ ] Will need to update ecgtools first to allow for quickstart guide update.
    • [x] Relevant API reference bits updated.

I will no longer make pull requests when jetlagged.

charles-turner-1 • Sep 01 '25 10:09