las-rs icon indicating copy to clipboard operation
las-rs copied to clipboard

Test for EVLR offset is not valid for laz

Open ktmattson-alion opened this issue 3 years ago • 14 comments

This check:

if evlr.start_of_first_evlr < offset_to_end_of_points {
    return Err(Error::OffsetToEvlrsTooSmall(evlr.start_of_first_evlr).into());
} // .. //

will give a false positive on compressed files, since evlr.start_of_first_evlr refers to the file as-is, while

pub fn offset_to_end_of_points(&self) -> u64 {
        u64::from(self.offset_to_point_data)
            + u64::from(self.number_of_point_records) * u64::from(self.point_data_record_length)
    }

is the size of the uncompressed point data records.

For comparison (because I don't know nothing about nothing myself,) M Isenburg's LAStools just goes for it: stream->seek(header.start_of_first_extended_variable_length_record); ref So I think it's safe to say this check is extraneous.

ktmattson-alion avatar Jun 08 '21 21:06 ktmattson-alion

I agree. Do you have any ~real-world~ examples of this we can use for a regression test?

EDIT: Any examples are fine doesn't have to be real data :-).

gadomski avatar Jun 09 '21 13:06 gadomski

I was using this data from USGS: https://drive.google.com/file/d/105-DlPnyrS086Etx4epJFIQM0qPk9Ayx/view?usp=sharing super real world :)

ktmattson-alion avatar Jun 09 '21 18:06 ktmattson-alion

Great, thanks! Unfortunately this file's too big to include in the repo as a test case. I've tried filtering it down with PDAL, but PDAL seems to convert the EVLR(s) to VLR(s). If you are able to create a smaller (<1MB preferred) version of this file or another one, that'd help a lot. I'm going to keep trying to create a dummy laz file w/ an EVLR to use as a test case, but so far haven't had any success.

gadomski avatar Jun 09 '21 21:06 gadomski

I was afraid of that. You're creating a laz with this crate for testing?

ktmattson-alion avatar Jun 10 '21 16:06 ktmattson-alion

You're creating a laz with this crate for testing?

Trying to. Turns out it's non-trivial to construct an laz w/ EVLRs with the current tooling ecosystem.

gadomski avatar Jun 10 '21 16:06 gadomski

Yeah, 1.4 is a lot more complex than its predecessors. Our use case doesn't include creating new files, only reading them, but I'm already pretty annoyed to have to write a WKT-CRS parser because of the EVLRS change.

ktmattson-alion avatar Jun 10 '21 21:06 ktmattson-alion

centimated.zip LasTools lets you preserve every nth point. Here's the same file but 1/100 points.

ktmattson-alion avatar Jun 10 '21 21:06 ktmattson-alion

Thanks, that's perfect, except ... for some reason the centimated file doesn't trigger the expected error. Not sure why not yet, I'll dig in. But we should be able to get this to work (I checked and the centimated file did have EVLRs).

Screen Shot 2021-06-10 at 4 34 27 PM

gadomski avatar Jun 10 '21 22:06 gadomski

I took a quick look at it, it seems that the centimated file does not trigger the error because its a LAS 1.4 with the legacy point count (u32) set to 0 and the new u64 point count set to the good value.

The computation done by offset_to_end_of_points only uses the u32 value whatever the version of the file is. So in this case the computed offset is equal to the offseto to the first point which is less than the offset to the last point.

(The original file is also LAS 1.4, but with both the legacy and non legacy point count set)

tmontaigu avatar Jun 23 '21 10:06 tmontaigu

Nice catch. The spec says:

A LAS 1.4 file writer who wishes to maintain backward compatibility must maintain both the
legacy fields and the equivalent non-legacy fields in synchronization. Of course, this is not
possible if the number of points exceeds UINT32_MAX, in which case the legacy fields must be
set to zero. If a file writer is not maintaining backward compatibility, the legacy fields must always
be set to zero. 

"Gentle deprecation" is so irritating.

ktmattson-alion avatar Jun 23 '21 13:06 ktmattson-alion

but in this case, its the offset_to_end_of_points that is not 100% correct

tmontaigu avatar Jun 23 '21 13:06 tmontaigu

For sure. I'm just chafing at spec writers who leave in "if you feel like it" clauses.

ktmattson-alion avatar Jun 23 '21 14:06 ktmattson-alion

Ok, so, just to regroup here, is this a correct assessment of the bug: IF the file is compressed AND has been written by a backwards-compatible type writer (u32 point count is non-zero) THEN offset_to_end_of_points will be greater than evlr.start_of_first_evlr (Skeng)

ktmattson-alion avatar Jun 24 '21 14:06 ktmattson-alion

yes

tmontaigu avatar Jun 25 '21 09:06 tmontaigu