BAD_ALLOC exception after using parse with float value

Open clauderobi opened this issue 2 years ago • 9 comments

Description

On an ARM board (Raspberry Pi Zero), parsing a JSON string that includes a float value causes memory corruption. Any code after the parse that uses the heap throws a bad_alloc exception. The same program runs fine on x86.

Reproduction steps

Simply parse a string that contains a float value and then emplace data in an unordered_map container.

Expected vs. actual results

The emplace method call should succeed. Instead, it throws a bad_alloc exception.

Minimal code example

#include <iostream>
#include <string>
#include <unordered_map>
#include <nlohmann/json.hpp>

#define TT "{\"buildtime_bin\":1651934971.9987135}"

int main(int argc, char** argv) {
  std::string                                          manifestData = TT;
  nlohmann::json                                       manifest = nlohmann::json::parse(manifestData);
  std::cout << manifest.dump() << std::endl;

  std::unordered_map<std::string, std::string>         testMap;

  try {
    testMap.emplace(std::make_pair("b", "abcd"));
    std::cout << "Good" << std::endl;
  }
  catch(std::exception& exc) {
    std::cout << exc.what() << std::endl;
  }

}

Error messages

std::bad_alloc is output by my exception handler

Compiler and operating system

Compiler is armv6-rpi-linux-gnueabi-gcc (crosstool-NG 1.24.0) 8.3.0 and OS is Raspbian GNU/Linux 11 (bullseye)

Library version

3.10.5

clauderobi avatar May 07 '22 15:05 clauderobi

Please check if the bug occurs with the latest version from the develop branch and run the unit tests as per the instructions in the issue template.

falbrechtskirchinger avatar May 07 '22 15:05 falbrechtskirchinger

I cannot reproduce your issue compiling natively with aarch64-unknown-linux-gnu-g++-9.2.0 and running on a RaspberryPi or cross-compiling with armv7a-unknown-linux-gnueabihf-g++-11.3.0 and running with qemu-arm.

falbrechtskirchinger avatar May 07 '22 16:05 falbrechtskirchinger

Same result with the latest version. I ran the unit tests and there are a lot of failures, including SIGSEGV, exceptions being thrown, and "is NOT correct" errors.

Since I cross-compile, I did not use ctest but ran the tests manually (well, using a bash script...) and captured the output in a text file. Since there are almost 2000 lines, I am not sure I want to paste it here.

As said in my original post, it is a Raspberry Pi Zero, so the ARM variant is ARMv6.

clauderobi avatar May 07 '22 17:05 clauderobi

Forgot to mention that when I target x86 I still cross-compile. The compiler in this case is x86_64-unknown-linux-gnu-gcc (crosstool-NG 1.24.0) 8.3.0. So it is the same version as the armv6 cross-compiler (same crosstool-NG too).

clauderobi avatar May 07 '22 17:05 clauderobi

Seems like your toolchain is broken. Maybe you're not linking against the correct standard library? Maybe it's an ABI problem (mixing incompatible objects after PSABI change)? Can you try a different toolchain?

falbrechtskirchinger avatar May 07 '22 18:05 falbrechtskirchinger

Well, after changing the float to an int (which in fact is what I really needed anyway), my real program works perfectly so far (but I still have things to check, which may take a few days). The program uses threads, lots of containers and hence lots of heap usage, mutexes and atomics, fork and execv, etc., so there is quite a bit of interaction with the kernel (and the C library).

The program is statically linked, except for glibc (libc, libm and libpthread), and every library I use is compiled with the same toolchain. So what is really left is glibc itself (and companions). The toolchain is configured to use glibc 2.28. The target Raspberry Pi is running Debian bullseye, which comes with 2.31. By virtue of glibc's backward compatibility, I am under the impression that a newer glibc must support a program linked against an older glibc. But maybe I am wrong.

I doubt that there is an ABI change between 2.28 and 2.31. But how to check?

clauderobi avatar May 08 '22 22:05 clauderobi

I doubt that it is a glibc ABI issue as well. The C++ standard library isn't part of glibc, though; it ships with GCC. And I'm aware of at least one breaking ABI change in GCC 6 or 7. Assuming all parts of your program are built with the same GCC, the only thing I can think of is that you might be linking against an older C++ standard library.

Have you tried using ASAN or Valgrind to get a better idea of where the heap corruption might occur?

falbrechtskirchinger avatar May 09 '22 04:05 falbrechtskirchinger

The C++ library is linked statically using -static-libstdc++.

I tried to use valgrind and do get lots of errors. But I get almost as many errors with the program that does not throw a bad_alloc as with the one that throws (9K-something vs. 10K-something). Most if not all of the errors are reported in libc or ld, mostly uninitialized data. I also ran valgrind on ls and got over 17K errors in that case.

clauderobi avatar May 10 '22 20:05 clauderobi

@clauderobi Any update on this?

nlohmann avatar Jul 22 '22 17:07 nlohmann