core-clp: Refactor `LibarchiveFileReader` to properly handle empty data blocks (fixes #389).
Description
The current LibarchiveFileReader implementation has the following problems:
- It doesn't handle the case where the data block is empty, which is the root cause of #389
- Not all data members are initialized in the default constructor
- Buffer peeking method's signature can be improved with C++20 enabled
This PR makes the following changes accordingly:
- Handle the empty data block properly. All empty blocks will be skipped and only nonempty blocks can be read
- Initialize all data members in the member declaration (which matches our latest coding guideline)
- Using
std::spanfor peeking buffer content
Validation performed
- Ensure CI passed
- Ensure the bug has been fixed in #389
A high level comment: Methodlogy wise, I feel the rest of our code tend to let caller check if the data_block is empty, rather having a function to read the next 'none-empty' data block.
That said, I don't see a nit way of doing it unless we let try_load_data_block return a new type of error code and let the caller check it.
One option I might consider is to keep the private method read_next_data_block the unchanged, but add the while loop in try_load_data_block(). This is really a nit though
A high level comment: Methodlogy wise, I feel the rest of our code tend to let caller check if the data_block is empty, rather having a function to read the next 'none-empty' data block.
That said, I don't see a nit way of doing it unless we let
try_load_data_blockreturn a new type of error code and let the caller check it.One option I might consider is to keep the private method
read_next_data_blockthe unchanged, but add the while loop intry_load_data_block(). This is really a nit though
The whole reason why we need this fix is because we don't have any high level code handling empty blocks. And I don't think the high level code should handle empty blocks in this case since the empty blocks should be hidden from the caller. This class is to provide abstraction to read data using libarchive, so the upper level caller shouldn't really care how data blocks are maintained (like there could be an empty data block) in different archives
A high level comment: Methodlogy wise, I feel the rest of our code tend to let caller check if the data_block is empty, rather having a function to read the next 'none-empty' data block. That said, I don't see a nit way of doing it unless we let
try_load_data_blockreturn a new type of error code and let the caller check it. One option I might consider is to keep the private methodread_next_data_blockthe unchanged, but add the while loop intry_load_data_block(). This is really a nit thoughThe whole reason why we need this fix is because we don't have any high level code handling empty blocks. And I don't think the high level code should handle empty blocks in this case since the empty blocks should be hidden from the caller. This class is to provide abstraction to read data using libarchive, so the upper level caller shouldn't really care how data blocks are maintained (like there could be an empty data block) in different archives
A high level comment: Methodlogy wise, I feel the rest of our code tend to let caller check if the data_block is empty, rather having a function to read the next 'none-empty' data block. That said, I don't see a nit way of doing it unless we let
try_load_data_blockreturn a new type of error code and let the caller check it. One option I might consider is to keep the private methodread_next_data_blockthe unchanged, but add the while loop intry_load_data_block(). This is really a nit thoughThe whole reason why we need this fix is because we don't have any high level code handling empty blocks. And I don't think the high level code should handle empty blocks in this case since the empty blocks should be hidden from the caller. This class is to provide abstraction to read data using libarchive, so the upper level caller shouldn't really care how data blocks are maintained (like there could be an empty data block) in different archives
Perhaps I didn't make it clear, the caller here I was suggesting was the caller of read_next_data_block, so it's still within libarchiveReader.
That said, what I was really suggesting was to hide the "non_empty" details from the caller of read_next_data_block..
Instead, the read_next_data_block can handle the empty_block with a while loop. Unless we see a future need to read an empty block, otherwise I don't think we need to expose the details all the way up to try_load_nonempty_data_block, which is called by FileCompessor.
Anyway, it's my personal preference and I will let you decide.
Anyway, it's my personal preference and I will let you decide.
Remind me, Zhihao, did we reach a conclusion on this?
Also, can you resolve the conflicts?
Anyway, it's my personal preference and I will let you decide.
Remind me, Zhihao, did we reach a conclusion on this?
Also, can you resolve the conflicts?
No, we didn't reach a conclusion. I will resolve the conflicts soon