polars icon indicating copy to clipboard operation
polars copied to clipboard

read macros

Open universalmind303 opened this issue 1 year ago • 2 comments

Closes #8104

there's still quite a bit more work to do on this, but wanted to make sure this aligns with what you were thinking @ritchie46.

Since there isnt much wrapping of Dataframe/Series (other than the options & reader builder), I thought it might be overkill to add a new crate polars-api, and we could keep these macros inside the polars crate.

Checklist

  • [x] read_csv<P: Into<PathBuf>>
  • [x] read_csv<R: Read>
  • [x] scan_csv
  • [ ] read_json<P: Into<PathBuf>>
  • [ ] read_json<R: Read>
  • [ ] read_ndjson<P: Into<PathBuf>>
  • [ ] read_ndjson<R: Read>
  • [ ] scan_ndjson
  • [ ] read_parquet<P: Into<PathBuf>>
  • [ ] read_parquet<R: Read>
  • [x] scan_parquet
  • [ ] read_ipc<P: Into<PathBuf>>
  • [ ] read_ipc<R: Read>
  • [ ] scan_ipc

Some Examples

let df = polars::read_csv!("path/to/csv")?;
// The macros support python-like keyword arguments 
let df = polars::read_csv!("path/to/csv", delimiter = b',')?;
let df = polars::read_csv!("path/to/csv", has_header = true)?;
// position doesn't matter
let df = polars::read_csv!("path/to/csv", has_header = true, delimiter = b',')?;
let df = polars::read_csv!("path/to/csv", delimiter = b',', has_header = true)?;

let lf = polars::scan_csv!("path/to/csv")?;

// or from a pathbuf
let df = polars::read_csv!(&path)?;

universalmind303 avatar Apr 13 '23 21:04 universalmind303

A bit of feedback from an outsider. I don't really think, that this should be the way to go. With this you would have an API full of macros just for the sake of keyword-like/named, optional, non-positional arguments.

You could check how WGPU does it, with their descriptors: https://docs.rs/wgpu/latest/wgpu/struct.RenderPipelineDescriptor.html

Real-world example

Discovered this issue through this reddit thread.

The nice thing about descriptors here is that you can essentially also impl Default and have default parameters, by passing DescriptorX::default() or you can initialize it with the values you need and use .. Default::default() for the rest. It's much more versatile than to use macros everywhere, IMHO.

Besides that, I feel having too many (positional, optional) parameters to a function could be thought of as a code smell and a method/function doing too much. Feel free to chime in to the discussion here: https://github.com/rust-unofficial/patterns/discussions/239

simonsan avatar Jun 21 '23 11:06 simonsan

@ritchie46 I finally had some time to revisit this one. I've limited it to read_csv scan_csv and scan_parquet for now. If this looks good, I can submit some follow up PR's for the other read/scan functions

universalmind303 avatar Jul 14 '23 14:07 universalmind303

I am closing this PR as it hasn't been updated in a while and has gathered a bunch of conflicts.

Feel free to rebase & reopen. However, I do feel the same as @simonsan in that I don't think macros are the way to go. I commented on the linked issue.

stinodego avatar Feb 08 '24 13:02 stinodego