Shapefile.jl icon indicating copy to clipboard operation
Shapefile.jl copied to clipboard

RFC: Catch errors when reading single features

Open meggart opened this issue 3 years ago • 3 comments

Following up on #52 since we are using shx to move from feature to feature in the file, we can in theory also recover when single entries in the file are broken or the file is truncated. This PR would put the reading of the single entries into a try-catch block moves on if an entry is broken. This is not supposed to be a final implementation, I just wanted to get some feedback on if this is something we want or not.

Since this is not really compatible with the sequential reading of entries where it is much harder to recover from bad data (though not impossible), it would make sense to split the sequential read algorithm and the shx-based one into different functions, that are called separately in different use cases. I would do this if there is general consensus that we do care about broken files....

meggart avatar Apr 27 '21 16:04 meggart

This seems like it could be helpful in giving better error messages. I'm not sure how I feel about just returning the non-broken or non-truncated features with only a print or even warn statement. Would like to hear what others think about this.

I would do this if there is general consensus that we do care about broken files....

I guess it depends. To me, if there are some variants that occur often and we can read with small modifications to the code, then definitely yes. Should we be able to read everything that shapelib accepts? Not necessarily, I'm afraid that would add a lot of code and make it harder to maintain.

visr avatar Apr 27 '21 17:04 visr

I'm not sure how I feel about just returning the non-broken or non-truncated features with only a print or even warn statement.

I had the same feeling as well. I think a good solution might be to implement the idea of optional lazy access to the different entries in their shapefile. Then we throw an error if broken entry is accessed and move on for the good ones. Then it would be up to the users to include their try/catch statements according to their needs and they would still be able to access partially broken files.

meggart avatar Apr 28 '21 07:04 meggart

Yes, something like that sounds good to me.

visr avatar Apr 28 '21 07:04 visr