
Catch UnequalLengths

Open HarHarLinks opened this issue 3 years ago • 9 comments

I've come across this csv:

$ curl -s https://opendata.dwd.de/weather/local_forecasts/swsmos/swsmos_LATEST_opendata.csv.bz2 | bzcat | head
ID;Lat;Lon;YYYYMMDDHHmm;TL;TLSTA;RRL1c;RRS1c;RR6;WWL6;WWS3;RRS3c;R650;RC;TS;TD
202207031100
A006;54.88920;8.90870;202207031200;22.8;0.8;0.0;0.0;0.0;6.0;0.0;0.0;0.0;1;45.27;18.8
A006;54.88920;8.90870;202207031300;23.3;2.1;0.0;0.0;0.0;4.0;0.0;0.0;0.0;1;44.33;18.0
A006;54.88920;8.90870;202207031400;23.1;3.0;0.0;0.0;0.0;8.0;0.0;0.0;0.0;1;42.62;17.5
[...]

tv does not like it due to the second line being just an ISO-ish date string with missing data:

$ curl -s https://opendata.dwd.de/weather/local_forecasts/swsmos/swsmos_LATEST_opendata.csv.bz2 | bzcat | head | tidy-viewer -s ';'
thread 'main' panicked at 'a csv record: Error(UnequalLengths { pos: Some(Position { byte: 79, line: 2, record: 1 }), expected_len: 16, len: 1 })', src/main.rs:354:20

While the csv is clearly at fault, I expect this isn't all that unusual. I would like tv to be able to

  • at a minimum, have an option to just ignore (skip) faulty lines and continue
  • better, note the error in the line and leave it unformatted or similar, potentially highlighting it in some way, e.g. make the line red with a ⚠ symbol.

For now, I've added | awk 'NR != 2' into my pipe to skip the 2nd line explicitly.
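
The two behaviours requested above could be sketched roughly as follows. This is a minimal std-only illustration, not tv's code and not the csv crate's API: `Row`, `classify_rows`, and `skip_malformed` are hypothetical names introduced here, and the naive split on the delimiter ignores quoted fields.

```rust
/// A parsed line: either well-formed fields or the raw text of a malformed line.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
enum Row {
    Valid(Vec<String>),
    Malformed { line_no: usize, raw: String },
}

/// Split every line on `delim` and compare its field count to the header's.
/// NOTE: naive split; quoted fields containing the delimiter are not handled.
fn classify_rows(input: &str, delim: char) -> Vec<Row> {
    let mut lines = input.lines().enumerate();
    let header: Vec<String> = match lines.next() {
        Some((_, h)) => h.split(delim).map(str::to_string).collect(),
        None => return Vec::new(),
    };
    let expected = header.len();
    let mut rows = vec![Row::Valid(header)];
    for (idx, line) in lines {
        let fields: Vec<String> = line.split(delim).map(str::to_string).collect();
        if fields.len() == expected {
            rows.push(Row::Valid(fields));
        } else {
            // Keep the raw text so a printer could show it unformatted or highlighted.
            rows.push(Row::Malformed { line_no: idx + 1, raw: line.to_string() });
        }
    }
    rows
}

/// Option 1 from the list above: drop malformed lines and continue.
fn skip_malformed(rows: Vec<Row>) -> Vec<Vec<String>> {
    rows.into_iter()
        .filter_map(|r| match r {
            Row::Valid(fields) => Some(fields),
            Row::Malformed { .. } => None,
        })
        .collect()
}
```

For the second option, a printer could match on `Row::Malformed` and emit `raw` as-is with a warning marker instead of dropping it.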

HarHarLinks avatar Jul 03 '22 12:07 HarHarLinks

Thanks for the great issue. I am able to reproduce this with:

curl -s https://opendata.dwd.de/weather/local_forecasts/swsmos/swsmos_LATEST_opendata.csv.bz2 | bzcat | head | awk 'NR != 2'| tidy-viewer -s ';'

Ill-formatted CSVs have been on the to-do list for a while. It is probably time to tackle the problem.

alexhallam avatar Jul 03 '22 14:07 alexhallam

Just giving a little update.

This is where I will start the error handling

https://github.com/alexhallam/tv/blob/e8beee0cfa0ffaa8a3f5b8f5207e9e6fc7f31686/src/main.rs#L367

There will likely be additional formatting needed for broken lines, but that is the start of it.

RepEx: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=35ea71ae858972a0f7a2bcb9283cb54f

alexhallam avatar Jul 10 '22 14:07 alexhallam

Also wanted to keep #79 in the loop for this fix.

alexhallam avatar Jul 10 '22 14:07 alexhallam

Also, I could possibly resurrect #124

alexhallam avatar Jul 10 '22 14:07 alexhallam

And maybe #91

alexhallam avatar Jul 10 '22 14:07 alexhallam

leaving a bookmark https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=35ea71ae858972a0f7a2bcb9283cb54f

alexhallam avatar Jul 11 '22 01:07 alexhallam

and #137

alexhallam avatar Jul 20 '22 15:07 alexhallam

Some more progress with error handling:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=10648de5f40ce5886ca2ad4832234d59

alexhallam avatar Jul 21 '22 12:07 alexhallam

Here I put the error in as a vector. Maybe I can handle strings that look like errors in the printout.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f7bb8d643148b903f99260d40c244f05
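
One way to read "put the error in as a vector" is to turn each failed record into a single-cell row holding the error text, so the printing code still receives a `Vec<Vec<String>>`. The sketch below is an assumption about that idea, not the contents of the linked playground; `LengthError`, `parse_line`, `rows_with_errors`, and `looks_like_error` are hypothetical names, and it uses only std rather than the csv crate.

```rust
use std::fmt;

/// Hypothetical stand-in for a field-count mismatch on one line.
#[derive(Debug)]
struct LengthError {
    line: usize,
    expected_len: usize,
    len: usize,
}

impl fmt::Display for LengthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "UnequalLengths(line {}: expected {} fields, found {})",
            self.line, self.expected_len, self.len
        )
    }
}

/// Naive split on `delim` (no quote handling); errors if the field count is off.
fn parse_line(
    line: &str,
    line_no: usize,
    expected_len: usize,
    delim: char,
) -> Result<Vec<String>, LengthError> {
    let fields: Vec<String> = line.split(delim).map(str::to_string).collect();
    if fields.len() == expected_len {
        Ok(fields)
    } else {
        Err(LengthError { line: line_no, expected_len, len: fields.len() })
    }
}

/// Every line becomes a row; a failed line becomes a one-cell row with the error text.
fn rows_with_errors(input: &str, delim: char) -> Vec<Vec<String>> {
    let expected_len = input.lines().next().map_or(0, |h| h.split(delim).count());
    input
        .lines()
        .enumerate()
        .map(|(i, line)| match parse_line(line, i + 1, expected_len, delim) {
            Ok(fields) => fields,
            Err(e) => vec![e.to_string()],
        })
        .collect()
}

/// Printout-side check for "strings that look like errors".
fn looks_like_error(row: &[String]) -> bool {
    row.len() == 1 && row[0].starts_with("UnequalLengths(")
}
```

Matching on a string prefix is fragile; carrying the `Result` (or an enum) through to the printer would avoid misclassifying real data that happens to start with the same text.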

alexhallam avatar Jul 21 '22 12:07 alexhallam