fix: Deprecate SIMD version of SkipWhitespace to prevent BOF
Hi all, I'm Junwha Hong, from the research group S2Lab in UNIST, we found a buffer-overflow from rapidjson by our custom tool, and patched it. the detailed information is as follows.
Summary
- OS: Ubuntu 22.04
- version: f9d53419e912910fd8fa57d5705fa41425428c35
- BOF occurs during the schematest reads "additionalItems.json"
- at include/rapidjson/reader.h:396
Problem Statement
The buffer overflow arises when the reader utilizes RAPIDJSON_SIMD and a filestream which has no specification of length or end.
SkipWhitespace_SIMD(const char* p) checks whether all the char characters of 16-bytes matches the whitespaces characters. thus, it will escape the for loop if there is a null string inside a 16-bytes batch. The problem occurs here, because null string is not always placed at the end of the 16-bytes. thus, if the null string is placed at the 16-bytes aligned address, there will be loads of 15 invalid bytes.
inline const char *SkipWhitespace_SIMD(const char* p) {
// Fast return for single non-whitespace
if (*p == ' ' || *p == '\n' || *p == '\r' || *p == '\t')
++p;
else
return p;
// 16-byte align to the next boundary
const char* nextAligned = reinterpret_cast<const char*>((reinterpret_cast<size_t>(p) + 15) & static_cast<size_t>(~15));
while (p != nextAligned)
if (*p == ' ' || *p == '\n' || *p == '\r' || *p == '\t')
++p;
else
return p;
// The rest of string
#define C16(c) { c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c }
static const char whitespaces[4][16] = { C16(' '), C16('\n'), C16('\r'), C16('\t') };
#undef C16
const __m128i w0 = _mm_loadu_si128(reinterpret_cast<const __m128i *>(&whitespaces[0][0]));
const __m128i w1 = _mm_loadu_si128(reinterpret_cast<const __m128i *>(&whitespaces[1][0]));
const __m128i w2 = _mm_loadu_si128(reinterpret_cast<const __m128i *>(&whitespaces[2][0]));
const __m128i w3 = _mm_loadu_si128(reinterpret_cast<const __m128i *>(&whitespaces[3][0]));
for (;; p += 16) {
const __m128i s = _mm_load_si128(reinterpret_cast<const __m128i *>(p));
__m128i x = _mm_cmpeq_epi8(s, w0);
x = _mm_or_si128(x, _mm_cmpeq_epi8(s, w1));
x = _mm_or_si128(x, _mm_cmpeq_epi8(s, w2));
x = _mm_or_si128(x, _mm_cmpeq_epi8(s, w3));
unsigned short r = static_cast<unsigned short>(~_mm_movemask_epi8(x));
if (r != 0) { // some of characters may be non-whitespace
#ifdef _MSC_VER // Find the index of first non-whitespace
unsigned long offset;
_BitScanForward(&offset, r);
return p + offset;
#else
return p + __builtin_ffs(r) - 1;
#endif
}
}
}
Patch
The SkipWhitespace_SIMD(const char* p, const char* end) checks sanity safely by for (; p <= end - 16; p += 16). thus we need to deprecate the SIMD feature for the simple StringStream and InsituStringStream which have no end specification.
Callstack at BOF
#0 0x55ecfcc0ab5d in rapidjson::SkipWhitespace_SIMD(char const*) /home/qbit/rapidjson/include/rapidjson/reader.h:303:27
#1 0x55ecfcc0ab5d in void rapidjson::SkipWhitespace<rapidjson::GenericStringStream<rapidjson::UTF8<char>>>(rapidjson::GenericStringStream<rapidjson::UTF8<char>>&) /home/qbit/rapidjson/include/rapidjson/reader.h:511:15
#2 0x55ecfcc0ab5d in void rapidjson::GenericReader<rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator>::SkipWhitespaceAndComments<0u, rapidjson::GenericStringStream<rapidjson::UTF8<char>>>(rapidjson::GenericStringStream<rapidjson::UTF8<char>>&) /home/qbit/rapidjson/include/rapidjson/reader.h:712:9
#3 0x55ecfcc0ab5d in rapidjson::ParseResult rapidjson::GenericReader<rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator>::Parse<0u, rapidjson::GenericStringStream<rapidjson::UTF8<char>>, rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>>(rapidjson::GenericStringStream<rapidjson::UTF8<char>>&, rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>&) /home/qbit/rapidjson/include/rapidjson/reader.h:579:17
#4 0x55ecfcc0a4cd in rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>& rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>::ParseStream<0u, rapidjson::UTF8<char>, rapidjson::GenericStringStream<rapidjson::UTF8<char>>>(rapidjson::GenericStringStream<rapidjson::UTF8<char>>&) /home/qbit/rapidjson/include/rapidjson/document.h:2646:40
#5 0x55ecfcc96e96 in rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>& rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>::Parse<0u, rapidjson::UTF8<char>>(rapidjson::UTF8<char>::Ch const*) /home/qbit/rapidjson/include/rapidjson/document.h:2711:16
#6 0x55ecfcc96e96 in rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>& rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>::Parse<0u>(char const*) /home/qbit/rapidjson/include/rapidjson/document.h:2720:16
#7 0x55ecfcc96e96 in rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>, rapidjson::CrtAllocator>::Parse(char const*) /home/qbit/rapidjson/include/rapidjson/document.h:2727:16
#8 0x55ecfcc96e96 in Schema::SetUp() /home/qbit/rapidjson/test/perftest/schematest.cpp:103:15
#9 0x55ecfcda2ad5 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2418:10
#10 0x55ecfcda2ad5 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2454:14
#11 0x55ecfcd2c469 in testing::Test::Run() /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2488:3
#12 0x55ecfcd31336 in testing::TestInfo::Run() /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2668:11
#13 0x55ecfcd32c96 in testing::TestCase::Run() /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2786:28
#14 0x55ecfcd62463 in testing::internal::UnitTestImpl::RunAllTests() /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:5048:43
#15 0x55ecfcda5031 in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2418:10
#16 0x55ecfcda5031 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:2454:14
#17 0x55ecfcd613e9 in testing::UnitTest::Run() /home/qbit/rapidjson/thirdparty/gtest/googletest/src/gtest.cc:4664:10
#18 0x55ecfcbdee12 in RUN_ALL_TESTS() /home/qbit/rapidjson/thirdparty/gtest/googletest/include/gtest/gtest.h:2329:46
#19 0x55ecfcbdee12 in main /home/qbit/rapidjson/test/perftest/perftest.cpp:23:12
#20 0x7f6a64c29d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#21 0x7f6a64c29e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#22 0x55ecfcade754 in _start (/home/qbit/rapidjson/build2/bin/perftest+0x99754)
Thank you.
This pattern also occurs for ScanCopyUnescapedString(StringStream& is, StackStream<char>& os) in reader.h. _mm_load_si128 instruction should be used when the length is available.
It is the root cause of the bug fix in #2101 , but the patch only fixed the part of user side codes, and still some test cases are leading to buffer-overflow. Any user who naively uses RapidJSON's SIMD feature without specifying the length for a file which has non-16-bytes aligned end address will suffer from buffer-overflow. thus, it is needed to deprecate the SIMD API for the non-length specified use cases.
If you want to keep the way you suggested before in #1723, I will just align the end of buffers for the problematic tests. but I still recommend to provide only end-specified version for SIMD.