Add end of buffer checks for invalid binary
using https://github.com/stephenberry/glaze/commit/574afa831a89db768ac6602bc0da3562824edf07
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <vector>
#include <glaze/glaze.hpp>

int main()
{
    const int write_value = 32;
    // Serialize a single int to get a known-good binary buffer.
    const std::vector<std::byte> write_buffer = [&]() {
        std::vector<std::byte> b{};
        glz::write_binary(write_value, b);
        return b;
    }();

    // Read the buffer back and compare the reported location with its size.
    auto check = [&](auto read_buffer) {
        int read_value{};
        auto ec = glz::read_binary(read_value, read_buffer);
        if (!ec && read_buffer.size() >= ec.location && write_value == read_value) {
            std::cout << "ok\n";
        }
        else {
            if (!ec && read_buffer.size() >= ec.location) {
                std::cout << "ok probably...\n";
            }
            else if (!ec && read_buffer.size() < ec.location) {
                std::cout << "bad over read\n";
            }
            else {
                std::cout << "ok error\n";
            }
        }
    };

    // Try every prefix of the valid buffer: first truncated, then zero-padded to full size.
    for (std::size_t i = 0; i <= write_buffer.size(); ++i) {
        std::vector<std::byte> read_buffer{};
        std::copy(write_buffer.begin(), write_buffer.begin() + i, std::back_inserter(read_buffer));
        std::cout << i << '\n';
        check(read_buffer);
        read_buffer.resize(write_buffer.size());
        check(read_buffer);
    }
    return 0;
}
g++ --version
g++ (GCC) 13.2.1 20230801
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ main.cpp -std=c++23 && ./a.out
0
ok error
ok error
1
bad over read
ok probably...
2
bad over read
ok
3
bad over read
ok
4
bad over read
ok
5
ok
ok
The location returned in 'ec.location' shows that the function read more bytes than the buffer contains, without reporting an error.
So, the BEVE code currently expects valid data. Your example shows how out-of-bounds reads will occur for invalid data.
I tend to think checks should be added to ensure we don't read out of bounds. However, these errors should only occur if data is being incorrectly written, corrupted, or maliciously manipulated. We don't expect data to be incorrectly written, because we don't expect humans to be writing binary. Corruption and malicious manipulation can be handled through checksums and other security mechanisms. So, adding checks everywhere for incorrect serialization hasn't been a concern.
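As a rough illustration of the checksum approach mentioned above, here is a minimal sketch that appends a 32-bit FNV-1a hash to a serialized payload and verifies it before any decoding happens. The names `fnv1a`, `seal`, and `unseal` and the framing layout are hypothetical and not part of glaze. Note that a plain hash like this only catches accidental corruption or truncation; guarding against deliberate manipulation would need a keyed MAC or signature instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// 32-bit FNV-1a hash over a byte range.
inline std::uint32_t fnv1a(const std::byte* data, std::size_t size)
{
    std::uint32_t hash = 2166136261u; // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= std::to_integer<std::uint32_t>(data[i]);
        hash *= 16777619u; // FNV prime
    }
    return hash;
}

// Hypothetical framing: payload followed by its checksum as 4 little-endian bytes.
inline std::vector<std::byte> seal(std::vector<std::byte> payload)
{
    const std::uint32_t sum = fnv1a(payload.data(), payload.size());
    for (int shift = 0; shift < 32; shift += 8) {
        payload.push_back(static_cast<std::byte>((sum >> shift) & 0xFFu));
    }
    return payload;
}

// Returns the payload only if the trailing checksum matches; otherwise nullopt.
inline std::optional<std::vector<std::byte>> unseal(const std::vector<std::byte>& framed)
{
    if (framed.size() < 4) {
        return std::nullopt; // too short to even hold a checksum
    }
    const std::size_t n = framed.size() - 4;
    std::uint32_t stored = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        stored |= std::to_integer<std::uint32_t>(framed[n + i]) << (8 * i);
    }
    if (stored != fnv1a(framed.data(), n)) {
        return std::nullopt; // corrupted or truncated
    }
    return std::vector<std::byte>(framed.begin(), framed.begin() + static_cast<std::ptrdiff_t>(n));
}

// Usage: an intact frame round-trips, a frame with a flipped byte is rejected.
inline void checksum_example()
{
    const std::vector<std::byte> payload{std::byte{1}, std::byte{2}, std::byte{3}};
    std::vector<std::byte> framed = seal(payload);
    const auto intact = unseal(framed);
    assert(intact && *intact == payload);

    framed[0] ^= std::byte{0xFF}; // simulate corruption
    assert(!unseal(framed));
}
```

In this sketch the binary reader would only ever be handed the result of `unseal`, so a truncated or damaged buffer is rejected before the decoder sees it.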
I'm curious under what condition you are either experiencing this problem or worried about it occurring?
I do think it is worth adding end-of-buffer checks because the overhead is not very high and they add another layer of protection. But it just hasn't been a problem, and thus hasn't been addressed yet.
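To make the cost of such a check concrete, here is a minimal sketch of an end-of-buffer check for reading a fixed-width value. `read_checked` is a hypothetical helper written for this illustration; it is not how glaze implements its reader, only the general shape of the check: one size comparison before each read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical helper (not part of glaze): reads a trivially copyable T from
// `buf` at `offset`, refusing to touch bytes past the end of the buffer.
template <class T>
std::optional<T> read_checked(const std::vector<std::byte>& buf, std::size_t& offset)
{
    // Compare against the remaining size so that offset + sizeof(T) cannot overflow.
    if (offset > buf.size() || buf.size() - offset < sizeof(T)) {
        return std::nullopt; // truncated input: report an error instead of over-reading
    }
    T value;
    std::memcpy(&value, buf.data() + offset, sizeof(T));
    offset += sizeof(T); // advance only after a successful read
    return value;
}

// Usage: a full buffer decodes, a truncated one is rejected and the offset stays put.
inline void read_checked_example()
{
    const int written = 32;
    std::vector<std::byte> buf(sizeof(int));
    std::memcpy(buf.data(), &written, sizeof(int));

    std::size_t offset = 0;
    const auto ok = read_checked<int>(buf, offset);
    assert(ok && *ok == 32 && offset == sizeof(int));

    buf.pop_back(); // drop the last byte to simulate truncated input
    offset = 0;
    assert(!read_checked<int>(buf, offset) && offset == 0);
}
```

With a check like this, the reported location can never exceed the buffer size on success, which is the property the test program above looks for.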
Adding more end of buffer and invalid binary checks in #945
We do need full end-of-buffer checking for open APIs, so this issue will be fully addressed in the future. In the meantime, I have added a warning about using BEVE in open contexts to the documentation (binary.md).
End of buffer checking is now incorporated with the changes that no longer require null termination.