feat(rust): Trim leading and trailing whitespaces while parsing csv data

Open xgillard opened this issue 3 years ago • 4 comments

I often find myself wanting to parse CSV data that has been vertically aligned in the text file (because it makes the files prettier and thus easier to read in a terminal). However, these leading and trailing spaces get in the way of properly parsing numeric values with polars. This is why I propose adding a switch that lets us configure whether these leading and trailing spaces are dropped.

Example: parsing the following data did not work before this change, but it can be accepted with this PR.

INSTANCE    | STATUS | UB   | LB          | DURATION | PEAK-MEMORY
05items/005 | Proved | 1471 | 1470.998622 | 0        | 0.004846898839
05items/033 | Proved | 1104 | 1103.999    | 0        | 0.004846531898
05items/024 | Proved | 1308 | 1307.998815 | 0        | 0.004889146425
05items/018 | Proved | 1315 | 1314.998748 | 0        | 0.004864221439
05items/006 | Proved | 1386 | 1385.901602 | 0        | 0.004862984642
05items/038 | Proved | 1113 | 1112.999091 | 0        | 0.004888778552
05items/034 | Proved | 1108 | 1107.999018 | 0        | 0.004848460667
05items/030 | Proved |  588 | 587.9995413 | 0        | 0.004298817366
05items/010 | Proved | 1952 | 1951.998264 | 0        | 0.004889526404
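
To make the problem concrete, here is a minimal standard-library sketch of the trimming step. It is not the polars reader and not the code in this PR. The helper names `split_trimmed` and `parse_numeric` are hypothetical and only illustrate why padded fields fail to parse as numbers until the surrounding whitespace is removed.

```rust
/// Split one delimiter-separated line into fields, dropping the leading and
/// trailing whitespace that was added for vertical alignment.
/// (Hypothetical helper for illustration; not part of the polars API.)
fn split_trimmed(line: &str, delimiter: char) -> Vec<&str> {
    line.split(delimiter).map(str::trim).collect()
}

/// Parse a field as `f64` after trimming it.
/// Rust's `str::parse::<f64>` rejects surrounding whitespace, so an
/// untrimmed field such as " 588 " would fail to parse.
/// (Hypothetical helper for illustration; not part of the polars API.)
fn parse_numeric(field: &str) -> Option<f64> {
    field.trim().parse::<f64>().ok()
}
```

Applied to a row such as `05items/030 | Proved   |  588 | 587.9995413 | ...` with `'|'` as the delimiter, `split_trimmed` yields clean fields, and `parse_numeric` then succeeds on the numeric columns and returns `None` for text such as `Proved`.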

xgillard avatar May 11 '22 10:05 xgillard

@xgillard you have not updated this branch. I think you updated another one in your fork

ricglz avatar May 12 '22 22:05 ricglz

ouch... I'm so sorry for wasting your time on this one :-/

xgillard avatar May 13 '22 02:05 xgillard

Thanks for the PR. Can you expose those options also to the python api and the lazy scanner?

ritchie46 avatar May 13 '22 05:05 ritchie46

thx for your reply, I'll do it 👍

xgillard avatar May 16 '22 08:05 xgillard

We've decided that this is not a feature we want. For more details, see https://github.com/pola-rs/polars/pull/10333#issuecomment-1669291968.

orlp avatar Aug 08 '23 09:08 orlp