pyglider icon indicating copy to clipboard operation
pyglider copied to clipboard

sea explorer white space in payload files

Open clayton33 opened this issue 11 months ago • 14 comments

We have a number of missions from year 2018 with navigation firmware 2.1.5 and payload firmware 2.1.1. In the raw payload files there are leading whitespaces in both the data and the header which resulted in an error being thrown when converting the data to Float64. This fix resolves that issue.

clayton33 avatar Jan 08 '25 13:01 clayton33

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 54.98%. Comparing base (85011e5) to head (0ca4ef7). Report is 12 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #215      +/-   ##
==========================================
+ Coverage   54.90%   54.98%   +0.07%     
==========================================
  Files           9        9              
  Lines        1692     1695       +3     
==========================================
+ Hits          929      932       +3     
  Misses        763      763              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Jan 08 '25 13:01 codecov[bot]

Can you explain the data that causes this error a little more explicitly. The added code looks potentially slow and I wonder if there is a better way to strip leading white space available in polars.

jklymak avatar Jan 08 '25 15:01 jklymak

exampleData.zip Attached is some example data.

clayton33 avatar Jan 08 '25 16:01 clayton33

Thanks, but what about this data is causing the error? Can you just extract the string that is causing the problem so we can look for solutions?

jklymak avatar Jan 08 '25 16:01 jklymak

There are leading whitespaces throughout the entire file and it's in every payload file, so I think looking at every column that is a str type makes sense. Any files after that navigation and payload firmware don't have that issue, and they'll read in as float64, so these newly added lines will ultimately do nothing in those cases since there won't be any columns of str type.

clayton33 avatar Jan 08 '25 16:01 clayton33

Can you include here a string that has a leading white space? I opened a couple of those files and I'm not seeing any leading whitespaces, but maybe we are talking about different things.

jklymak avatar Jan 08 '25 17:01 jklymak

Also, maybe this is a windows/Unix thing? I'm on a Mac and don't see any leading spaces. Perhaps we need to specify the encoding of the files when we open them. It's super unlikely that polars can't handle white space in float conversion, so I think it must be something like that.

jklymak avatar Jan 08 '25 17:01 jklymak

Sorry for the confusion. I sent data for the wrong mission (we had mission 32 in the same year for two different gliders).

exampleData.zip

clayton33 avatar Jan 08 '25 18:01 clayton33

@clayton33 can you also post the error that you get when running with our yaml?

richardsc avatar Jan 08 '25 18:01 richardsc

This is the error that I get

polars.exceptions.InvalidOperationError: conversion from `str` to `f64` failed in column ' NAV_RESOURCE' for 58 out of 58 values: [" 105", " 105", … " 117"]

clayton33 avatar Jan 08 '25 18:01 clayton33

https://github.com/pola-rs/polars/issues/10587 maybe

I'm actually pretty surprised polars has decided not to have an option for this. @callumrollo ?

jklymak avatar Jan 08 '25 18:01 jklymak

Yes, I had come across that issue. I followed the suggestion from this comment https://github.com/pola-rs/polars/issues/10587#issuecomment-2116459477

clayton33 avatar Jan 08 '25 18:01 clayton33

Checking in to see if there has been any thought into incorporating this change.

clayton33 avatar Feb 07 '25 13:02 clayton33

Its fine by me, but would be good if @callumrollo chimed in

jklymak avatar Feb 07 '25 17:02 jklymak

Hi @jklymak, @callumrollo! Any thoughts on merging this PR? We've been using it in production and it all seems good on our files (with and without white space).

richardsc avatar May 22 '25 20:05 richardsc

Thanks!

richardsc avatar May 23 '25 01:05 richardsc

Yes sorry to be slow. Always feel free to ping!

jklymak avatar May 23 '25 02:05 jklymak