rapidyaml

Core dump occurs when parsing YAML with invalid syntax

Open serbayozkan opened this issue 6 months ago • 6 comments

Hello,

In some cases where the input YAML has syntax errors, parsing causes a core dump. As expected, client code cannot catch this. The parser does print the error before it crashes, so it would be great to avoid the core dump and instead throw an exception or trigger an error callback to inform the client about the failure. That way, the client could tell what went wrong instead of ending up with a crashed application.

Below are some failing examples; I would like to hear your feedback on this issue.

Version: 0.8.0

char yaml[] = R"(
data: [, 18 34]
)";

ryml::Tree tree = ryml::parse_in_place(yaml);

Output:
2:8: (8B): ERROR: parse error
2:8: data: [, 18 34]  (size=15)
            ^~~~~~~~  (cols 8-16)

Aborted (core dumped)

char yaml[] = R"(
data1: data1
    
	# This line includes tab.

data2: data2
)";

ryml::Tree tree = ryml::parse_in_place(yaml);

Output:
4:2: (20B): ERROR: could not find ':' colon after key
4:2:    # This line includes tab.  (size=26)
      ^~~~~~~~~~~~~~~~~~~~~~~~~  (cols 2-27)

Aborted (core dumped)

serbayozkan, Jul 08 '25 11:07

Did you set the error handler?

The error message you see comes from the error handling code, which will subsequently result in a call to the error handler.

The default error handler calls abort; if you want a different behaviour you will need to opt in.
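
To illustrate the opt-in described here, the following is a minimal, self-contained sketch of the general pattern: a default handler that prints and aborts, and a replacement handler that throws so the caller can catch the failure. `ErrorFn`, `ToyParser`, `default_error`, `throwing_error` and `try_parse` are hypothetical stand-ins invented for this sketch; they are not ryml's types or signatures.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only; not ryml's actual API.
using ErrorFn = void (*)(const char* msg, std::size_t len, void* user_data);

// Default behaviour: print the message and abort (this is what yields a core dump).
inline void default_error(const char* msg, std::size_t len, void*) {
    std::fprintf(stderr, "%.*s\n", static_cast<int>(len), msg);
    std::abort();
}

// Opt-in behaviour: turn the error into an exception the caller can catch.
inline void throwing_error(const char* msg, std::size_t len, void*) {
    throw std::runtime_error(std::string(msg, len));
}

struct ToyParser {
    ErrorFn on_error = &default_error;

    // Reports an error for any input containing "[,"; a stand-in for a real syntax check.
    void parse(const std::string& src) {
        if (src.find("[,") != std::string::npos) {
            const std::string msg = "parse error";
            on_error(msg.data(), msg.size(), nullptr);
        }
    }
};

// Returns the error message, or an empty string if parsing succeeded.
inline std::string try_parse(const std::string& src) {
    ToyParser p;
    p.on_error = &throwing_error;  // opt in; without this line the toy parser aborts
    try {
        p.parse(src);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return {};
}
```

With the throwing handler installed, `try_parse("data: [, 18 34]")` returns the error message instead of terminating the process, and well-formed input returns an empty string.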

biojppm, Jul 10 '25 17:07

Yes, I set it. Let me share the full application code. Is there a problem with how the error handler is set in this code?

void onErrorFunction(
    const char* /*msg*/, size_t /*msg_len*/, ryml::Location /*loc*/, void* /*user_data*/)
{
  throw std::runtime_error{"YAML parser error"};

  __builtin_unreachable();
}

char yaml[] = R"(
data: [, 18 34]
)";

ryml::EventHandlerTree evtHandler;
ryml::Tree tree{ryml::Callbacks{nullptr, nullptr, nullptr, &onErrorFunction}};

ryml::ParserOptions opt;
opt.locations(true);
ryml::Parser parser = ryml::Parser{&evtHandler, std::move(opt)};

parse_in_place(&parser, ryml::to_substr(yaml), &tree);

// Output
2:8: (8B): ERROR: parse error
2:8: data: [, 18 34]  (size=15)
            ^~~~~~~~  (cols 8-16)

Aborted (core dumped)

serbayozkan, Jul 14 '25 18:07

I was able to reproduce the SIGABRT; hold on while I investigate.

biojppm, Jul 14 '25 21:07

Thanks for reporting!

There seems to be a problem: while parsing, the callbacks from the tree are overwritten with those from the event handler. It will be some time until I can look at this situation and fix it.

In the meantime, you have a workaround -- make sure that the event handler is constructed with the callbacks:

ryml::Callbacks callbacks{nullptr, nullptr, nullptr, &onErrorFunction};
ryml::EventHandlerTree evtHandler(callbacks); // pass callbacks here.

ryml::ParserOptions opts = ryml::ParserOptions{}.locations(true);
ryml::Parser parser = ryml::Parser{&evtHandler, opts};

ryml::Tree tree;
parse_in_place(&parser, ryml::to_substr(yaml), &tree); // tree will receive callbacks
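
Why the workaround helps can be modeled with a small self-contained sketch. It is based only on the description above (the tree's callbacks being replaced by the event handler's during parsing); ryml's real internals may differ. `ToyTree`, `ToyEventHandler`, `reset` and the other names are hypothetical, and the default handler throws a marker exception here instead of calling abort so that the outcome can be observed.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; names and layout are illustrative, not ryml's.
using ErrorFn = void (*)(const std::string& msg);

// Marker exception standing in for the default handler's call to abort().
struct AbortLikeError : std::logic_error {
    using std::logic_error::logic_error;
};

inline void default_error(const std::string& msg) { throw AbortLikeError(msg); }
inline void user_error(const std::string& msg) { throw std::runtime_error(msg); }

struct ToyTree {
    ErrorFn on_error = &default_error;
};

struct ToyEventHandler {
    ErrorFn on_error = &default_error;
    ToyTree* tree = nullptr;

    // Assumed behaviour: attaching the tree copies the handler's callback onto it,
    // discarding whatever callback the tree was constructed with.
    void reset(ToyTree* t) {
        tree = t;
        tree->on_error = on_error;
    }
};

// Simulates parsing invalid input and reports which handler ended up being called.
inline std::string parse_bad_input(ToyEventHandler& handler, ToyTree& tree) {
    handler.reset(&tree);
    try {
        tree.on_error("parse error");
    } catch (const AbortLikeError&) {
        return "default";
    } catch (const std::runtime_error&) {
        return "user";
    }
    return "none";
}

// Callback set on the tree only: it is lost when the handler attaches the tree.
inline std::string callbacks_on_tree_only() {
    ToyTree tree;
    tree.on_error = &user_error;
    ToyEventHandler handler;
    return parse_bad_input(handler, tree);
}

// Callback set on the event handler: it propagates to the tree.
inline std::string callbacks_on_handler() {
    ToyTree tree;
    ToyEventHandler handler;
    handler.on_error = &user_error;
    return parse_bad_input(handler, tree);
}
```

In this model, `callbacks_on_tree_only()` ends up in the default handler, while `callbacks_on_handler()` reaches the user's handler, which mirrors the behaviour reported in this thread.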

biojppm, Jul 14 '25 21:07

Constructing the event handler with the callbacks solves the problem. Thanks for the workaround. Should I close this, or do you prefer to keep it open for tracking?

serbayozkan, Jul 15 '25 11:07

Please leave this open.

biojppm, Jul 15 '25 15:07