yaml-cpp icon indicating copy to clipboard operation
yaml-cpp copied to clipboard

v0.7.0: invalide node error bug?

Open heshiwen1989 opened this issue 4 years ago • 2 comments

my system: ubuntu 16.04 + Qt 5.14.1 (GCC 5.3.1 20160406 (Red Hat 5.3.1-6), 64 bit) + Qt Creator 4.11.1

(1) my test yaml file content(test.yaml in code)

a:  
    b: value_b
    c: value_c      

(2) test code:

#include <iostream>
#include "yaml-cpp/yaml.h"

using namespace std;

#define RS_ERROR_1   (-1)
#define RS_SUCCESS_1 (1)
#define RS_SUCCESS_2 (2)

#define RS_TEST_MODE (-1)

int main()
{
    const std::string yamlFilePath = "test.yaml";
    YAML::Node configNode = YAML::LoadFile(yamlFilePath);

#if RS_TEST_MODE == RS_ERROR_1
    for(auto iterMap = configNode.begin(); iterMap != configNode.end(); ++iterMap)
    {
        const std::string& key = iterMap->first.as<std::string>();
        const YAML::Node & value = iterMap->second;

        std::cout << "key      = " << key   << std::endl;
        std::cout << "value    = " << value << std::endl;
        std::cout << "nodeType = " << (value.IsMap() ? ("Map") : value.IsSequence() ? ("Seq") : value.IsScalar() ? ("Scalar") : "Unknown") << std::endl;

        if(key == "a")
        {
            int var[100] = {0};
            std::cout << "print value: " << value["b"] << std::endl;  // error
            std::cout << "print value: " << value["c"] << std::endl;
        }
    }
#elif RS_TEST_MODE == RS_SUCCESS_1
    for(auto iterMap = configNode.begin(); iterMap != configNode.end(); ++iterMap)
    {
        const std::string& key = iterMap->first.as<std::string>();
        const YAML::Node & value = iterMap->second;

        std::cout << "key      = " << key   << std::endl;
        std::cout << "value    = " << value << std::endl;
        std::cout << "nodeType = " << (value.IsMap() ? ("Map") : value.IsSequence() ? ("Seq") : value.IsScalar() ? ("Scalar") : "Unknown") << std::endl;

        if(key == "a")
        {
            // int var[100] = {0};
            std::cout << "print value: " << value["b"] << std::endl; // ok
            std::cout << "print value: " << value["c"] << std::endl;
        }
    }
#elif RS_TEST_MODE == RS_SUCCESS_2
    for(auto iterMap = configNode.begin(); iterMap != configNode.end(); ++iterMap)
    {
        const std::string& key = iterMap->first.as<std::string>();
        YAML::Node  value = iterMap->second;

        std::cout << "key      = " << key   << std::endl;
        std::cout << "value    = " << value << std::endl;
        std::cout << "nodeType = " << (value.IsMap() ? ("Map") : value.IsSequence() ? ("Seq") : value.IsScalar() ? ("Scalar") : "Unknown") << std::endl;

        if(key == "a")
        {
            int var[100] = {0};
            std::cout << "print value: " << value["b"] << std::endl;  // ok
            std::cout << "print value: " << value["c"] << std::endl;
        }
    }
#endif

    return 0;
}

we analysis the error that is the const YAML::Node& value is not the reference of iterMap->second, because that the address of variable value and the address of iterMap->second is not the same address; and the local variable var[100]'s memory will cover the value's memory and lead the invalidNode error !

heshiwen1989 avatar Dec 09 '21 01:12 heshiwen1989

Wow, that's definitely a bug. I don't quite follow your reasoning, but if you can send a PR, I'll review it.

jbeder avatar Dec 10 '21 14:12 jbeder

It's basically an invalid memory access when using a reference to the iterator elements.

The smallest reproducer I could come up quickly with is:

$ cat test_yaml.cpp 
#include <yaml-cpp/yaml.h>
#include <iostream>

const auto TEXT = R"(
a:  
    b: c
)";

int
main(int argc, char* argv[])
{
    const auto yaml = YAML::Load(TEXT);

    for (auto it = yaml["a"].begin(); it != yaml["a"].end(); ++it) {
        const auto& key = it->first;
        key.as<std::string>();
    }
}

Build with ASAN:

$ clang++ -g -fsanitize=address -o /tmp/test_yaml test_yaml.cpp -lyaml-cpp

And see the invalid access to the stack:

$ /tmp/test_yaml 
=================================================================
==4159442==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffdebaf78e0 at pc 0x0000004c95e2 bp 0x7ffdebaf7610 sp 0x7ffdebaf7608
READ of size 1 at 0x7ffdebaf78e0 thread T0
    #0 0x4c95e1 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > YAML::Node::as<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >() const /usr/include/yaml-cpp/node/impl.h:144:8
    #1 0x4c8155 in main /scratch/src/workspace/stuff/misc/test_yaml.cpp:16:13
    #2 0x7fc5b7e1a0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2)
    #3 0x41d5cd in _start (/tmp/test_yaml+0x41d5cd)

Address 0x7ffdebaf78e0 is located in stack of thread T0 at offset 416 in frame
    #0 0x4c7d6f in main /scratch/src/workspace/stuff/misc/test_yaml.cpp:11

  This frame has 7 object(s):
    [32, 64) 'yaml' (line 12)
    [96, 144) 'it' (line 14)
    [176, 208) 'ref.tmp' (line 14)
    [240, 288) 'ref.tmp3' (line 14)
    [320, 352) 'ref.tmp4' (line 14)
    [384, 480) 'ref.tmp15' (line 15) <== Memory access at offset 416 is inside this variable
    [512, 544) 'agg.tmp.ensured'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope /usr/include/yaml-cpp/node/impl.h:144:8 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > YAML::Node::as<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >() const
Shadow bytes around the buggy address:
  0x10003d756ec0: 00 00 00 00 f1 f1 f1 f1 f8 f3 f3 f3 00 00 00 00
  0x10003d756ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003d756ee0: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
  0x10003d756ef0: f2 f2 f2 f2 00 00 00 00 00 00 f2 f2 f2 f2 f8 f8
  0x10003d756f00: f8 f8 f2 f2 f2 f2 f8 f8 f8 f8 f8 f8 f2 f2 f2 f2
=>0x10003d756f10: f8 f8 f8 f8 f2 f2 f2 f2 f8 f8 f8 f8[f8]f8 f8 f8
  0x10003d756f20: f8 f8 f8 f8 f2 f2 f2 f2 00 00 00 00 f3 f3 f3 f3
  0x10003d756f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003d756f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003d756f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10003d756f60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==4159442==ABORTING

This doesn't happen when making a copy of the iterator members, e.g.:

        const auto key = it->first;

I haven't looked at the code in detail but I assume that it must be due to the use of some sort of proxy class or something? https://github.com/jbeder/yaml-cpp/blob/a6bbe0e50ac4074f0b9b44188c28cf00caf1a723/include/yaml-cpp/node/detail/iterator.h

It's quite annoying because this used to work with older versions - e.g. this code is fine with 0.5.2 but breaks badly with 0.6.2.

cf-natali avatar May 30 '22 11:05 cf-natali