Refactor ParticleData internals to use a xr.Dataset

Open VeckoTheGecko opened this issue 1 year ago • 2 comments

Conceptually, the internals of ParticleData (i.e., a dictionary of 1D arrays, which all need to be the same length) are very similar to those of a dataframe. Migrating to one would be a boost to maintainability, and likely also to performance.

Pandas is already a dependency of xarray, hence there is no cost to adding it as an explicit dependency.

Although, looking at the code, ParticleData appears to be a container for 1D arrays, conceptually that doesn't make much sense (particles x time = 2D arrays). Maybe it's for single particles? Maybe for a single snapshot? Still working out the responsibility of ParticleData and how that interfaces with the rest of ParticleSet...
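
To make the comparison concrete, here is a rough sketch of how a dictionary of equal-length 1D arrays could map onto an `xr.Dataset` with a single shared dimension. This is an illustration only, not the actual ParticleData code: the variable names (`lon`, `lat`, `depth`), the `particle` dimension name, and the single-snapshot interpretation are all assumptions.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the current layout: a dict of 1D arrays
# that must all have the same length (one entry per particle).
n_particles = 4
data = {
    "lon": np.linspace(0.0, 3.0, n_particles),
    "lat": np.zeros(n_particles),
    "depth": np.full(n_particles, 10.0),
}

# The same data as an xr.Dataset. Every variable shares the (assumed)
# "particle" dimension, so the equal-length rule is enforced by xarray.
ds = xr.Dataset(
    {name: ("particle", values) for name, values in data.items()},
    coords={"particle": np.arange(n_particles)},
)

# A mismatched length is rejected at construction time.
try:
    xr.Dataset(
        {
            "lon": ("particle", np.zeros(3)),
            "lat": ("particle", np.zeros(4)),
        }
    )
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

# Selecting particles applies to all variables at once, instead of
# indexing each array in the dict separately.
subset = ds.where(ds["lon"] > 1.0, drop=True)
```

If ParticleData turns out to hold more than one snapshot, the same structure would extend to variables with two dimensions (for example `("particle", "time")`), which a plain dict of 1D arrays cannot express directly.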

VeckoTheGecko avatar Jan 10 '25 11:01 VeckoTheGecko

@erikvansebille this is the tracking issue for implementing the ParticleData class as an xarray Dataset (now that I've grown familiar with the codebase, I think an xarray Dataset is the one to go for in this case, not a pandas DataFrame).

Feel free to self-assign if you want.

VeckoTheGecko avatar Jul 01 '25 13:07 VeckoTheGecko

Thanks, I've assigned myself. I'll probably wait another week until the fieldset dev in v4 is a bit more robust before I start, so that I can test more easily. But I look forward to working on this!

erikvansebille avatar Jul 02 '25 06:07 erikvansebille