gac_pod reader is parsing padding bytes as scanlines
I am seeing a lot of warnings when reading GAC POD files, caused by an incorrect number of scanlines. As far as I can tell, this is because the POD reader does not distinguish between physical and logical records. According to the POD user manual, GAC files store the data in "physical" records (6440 bytes), each containing two "logical" records (3220 bytes):
- The header is written to the first physical record. As gac_pod.py is using self.offset = 3220, it will attempt to parse the header padding as an extra scanline.
- Each scanline occupies one logical record (i.e. two scans per physical record). If the dataset contains an odd number of scanlines, the final physical record will contain padding for the final logical record.
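To make the layout concrete, here is a minimal sketch of the record arithmetic. It assumes the header fills the whole first physical record and that the true scanline count is already known (e.g. from the header). `scanline_layout` and `split_scanlines` are hypothetical helpers for illustration, not pygac functions.

```python
PHYSICAL_RECORD = 6440  # bytes, as quoted from the POD user manual
LOGICAL_RECORD = PHYSICAL_RECORD // 2  # 3220 bytes, one scanline each


def scanline_layout(n_scanlines):
    """Return (data_offset, data_bytes, padding_bytes) for a GAC POD file.

    Assumption of this sketch: the header occupies the first physical
    record, so scanline data starts at byte 6440 rather than 3220.
    """
    data_offset = PHYSICAL_RECORD  # skip header plus its 3220 spare bytes
    data_bytes = n_scanlines * LOGICAL_RECORD
    # An odd scanline count leaves one unused logical record at the end.
    padding_bytes = (n_scanlines % 2) * LOGICAL_RECORD
    return data_offset, data_bytes, padding_bytes


def split_scanlines(raw, n_scanlines):
    """Slice raw file bytes into scanline records, ignoring all padding."""
    offset, nbytes, _ = scanline_layout(n_scanlines)
    body = raw[offset:offset + nbytes]
    return [body[i:i + LOGICAL_RECORD]
            for i in range(0, len(body), LOGICAL_RECORD)]
```

With three scanlines, for example, the data would start at byte 6440, span 9660 bytes, and be followed by 3220 bytes of trailing padding.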
Pygac is currently reading to the end of the file, so every POD file will produce one or two extra scanlines, though these should be removed by correct_scan_line_numbers.
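The "one or two" figure follows from the record sizes. A rough sketch of the count, assuming the file consists only of the header record plus the scanline records and that the reader parses everything from byte 3220 to the end of the file (`extra_records` is a hypothetical helper, not part of pygac):

```python
PHYSICAL_RECORD = 6440  # bytes
LOGICAL_RECORD = 3220   # bytes, half a physical record


def extra_records(n_scanlines):
    """Count the padding records parsed as scanlines when reading to EOF.

    Assumes a file of one header record followed by just enough physical
    records to hold n_scanlines logical records.
    """
    # One physical record for the header, then two scanlines per record,
    # rounding up when the scanline count is odd.
    n_physical = 1 + (n_scanlines + 1) // 2
    file_size = n_physical * PHYSICAL_RECORD
    # Reading from offset 3220 to EOF treats every remaining logical
    # record as a scanline, including the header padding.
    parsed = (file_size - LOGICAL_RECORD) // LOGICAL_RECORD
    return parsed - n_scanlines
```

Under these assumptions an even scanline count gives one extra record (the header padding), and an odd count gives two (the header padding plus the trailing padding).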
This would be easy to fix in the reader, but are there any cases where pygac is reading useful data beyond the expected end of the file?
Interesting find! So you're saying the first scanline is always invalid because it contains the spare 3220 header bytes?
I agree that it is probably removed by correct_scan_line_numbers or invalid coordinates or other checks. Nevertheless it would be nice to get rid of these warnings.
> are there any cases where pygac is reading useful data beyond the expected end of the file?
None that I'm aware of.
Yes - the first "scanline" is just the spare header bytes (and similarly for the last scanline read if there is an odd number of actual scanlines).