Expose `date_format`, `time_format`, and `datetime_format` parameters to `DataFrame.write_csv`
I notice that `CsvWriter` has a field of type `SerializeOptions`. `SerializeOptions` in turn supports
- `date_format`
- `time_format`
- `datetime_format`
I'd like to expose these parameters on the Python side in DataFrame.write_csv. We could additionally add support for float_format as requested in #4279
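To illustrate the idea of passing per-type format strings through to the writer, here is a minimal stdlib-only sketch. It is not the polars implementation: `write_csv_sketch` is a hypothetical helper, and it uses Python's `strftime` directives, which may differ from the format specifiers the Rust side accepts.

```python
import csv
import io
from datetime import date, datetime, time


def write_csv_sketch(rows, *, date_format=None, time_format=None, datetime_format=None):
    """Serialize rows to CSV text, applying optional strftime formats by value type.

    Hypothetical stand-in for the proposed keyword arguments; when a format
    is None the value falls back to its ISO representation.
    """

    def render(value):
        # datetime must be checked before date, because datetime subclasses date.
        if isinstance(value, datetime):
            return value.strftime(datetime_format) if datetime_format else value.isoformat()
        if isinstance(value, date):
            return value.strftime(date_format) if date_format else value.isoformat()
        if isinstance(value, time):
            return value.strftime(time_format) if time_format else value.isoformat()
        return value

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([render(v) for v in row])
    return buffer.getvalue()


rows = [(date(2022, 8, 11), time(9, 30), datetime(2022, 8, 11, 9, 30, 15))]
text = write_csv_sketch(
    rows,
    date_format="%d/%m/%Y",
    time_format="%H:%M",
    datetime_format="%Y-%m-%d %H:%M:%S",
)
print(text)
```

Each format only touches values of its own type, so a caller can override one without affecting the others.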
@ritchie46 any objections?
> @ritchie46 any objections?
Nope, that would be great!
> I'd like to expose these parameters on the Python side in `DataFrame.write_csv`. We could additionally add support for `float_format` as requested in #4279
Spooky, I was thinking about doing something very similar, with some small additions:
Default `datetime_format` (on the Rust side) should really be per-timeunit (all datetimes currently output as ns):
- ns >> "%S.%9f"
- us >> "%S.%6f"
- ms >> "%S.%3f"

Then the python-side `datetime_format` param could EITHER be a single string (in which case all datetime cols get that format, regardless of timeunit) OR a `{timeunit:format}` dict (so you could override each individually if you want).
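A rough sketch of how the string-or-dict resolution could work on the Python side. The function name `resolve_datetime_format` and the date/hour/minute prefix are assumptions made for illustration; only the per-unit seconds formats come from the list above.

```python
# Per-time-unit defaults for the seconds portion, as proposed in this comment.
DEFAULT_SECONDS_FORMAT = {"ns": "%S.%9f", "us": "%S.%6f", "ms": "%S.%3f"}

# Assumed prefix for the rest of the timestamp (not specified in the thread).
ASSUMED_PREFIX = "%Y-%m-%dT%H:%M:"


def resolve_datetime_format(time_unit, datetime_format=None):
    """Pick the format string for a datetime column with the given time unit.

    datetime_format may be None, a single string applied to every unit,
    or a {time_unit: format} dict that overrides units individually.
    """
    if time_unit not in DEFAULT_SECONDS_FORMAT:
        raise ValueError(f"unknown time unit: {time_unit!r}")
    if isinstance(datetime_format, str):
        # A single string wins regardless of the column's time unit.
        return datetime_format
    if isinstance(datetime_format, dict) and time_unit in datetime_format:
        # A dict overrides only the units it mentions.
        return datetime_format[time_unit]
    # Otherwise fall back to the per-unit default.
    return ASSUMED_PREFIX + DEFAULT_SECONDS_FORMAT[time_unit]


print(resolve_datetime_format("ms"))
print(resolve_datetime_format("us", {"ns": "%Y-%m-%d"}))
```

With a dict, any unit that is not listed keeps its default, so overriding `ns` leaves `us` and `ms` columns untouched.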
Also:
- Option of a custom string for `null_value` (defaulting to the empty string, as it is now).
- Option of a custom `empty_string` (defaulting to two double-quotes, as it is now).
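The two options above could behave roughly like this. It is a simplified sketch with hypothetical helpers (`render_field`, `render_row`) and deliberately skips quoting and escaping of ordinary values.

```python
def render_field(value, *, null_value="", empty_string='""'):
    """Render one CSV field, distinguishing missing values from empty strings."""
    if value is None:
        # Missing value: defaults to writing nothing at all.
        return null_value
    if value == "":
        # Empty string: defaults to two double-quotes so it differs from null.
        return empty_string
    return str(value)


def render_row(values, *, separator=",", **options):
    """Join rendered fields into one CSV line (no quoting/escaping in this sketch)."""
    return separator.join(render_field(v, **options) for v in values)


print(render_row([1, None, "", "a"]))
print(render_row([1, None, "", "a"], null_value="NA"))
```

Keeping the two defaults distinct is what lets a reader tell a null apart from an empty string when the file is loaded back.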
@matteosantama: I can look at doing this after you've made a patch, or would you like to incorporate some of it?
@alexander-beedie I think that's an awesome idea.
Sounds like there's a few things we want to do, which should each have its own MR:
- [ ] Expose pre-existing `datetime_format`, `date_format`, and `time_format` functionality on the Python side #4364
- [ ] Enable more sophisticated `datetime_format` specifications
- [ ] Create a new `float_format` parameter
- [ ] Options for `null_value` and `empty_string` output.
I've just opened up an MR for (1), so you can build off that for (2). And then (3) and (4) can come when those are complete.
> I've just opened up an MR for (1)
Nice one; I'm off to Istanbul tomorrow for a few days (for the first time), so will dive-in properly once I get back!
FYI: most of the CSV (and other export/write) tests are ideal candidates for parametric testing (see tests_parametric and the introduction on the original PR for inspiration). I'll definitely add some later, but have a look in the interim if you like :)
@matteosantama: well, got the custom `null_value` in, and we're now most of the way towards true per-unit datetime formatting... (will have to see about refining that further some time).
Glad I could build off your work :)