sentry-native
sentry-native copied to clipboard
Invalid characters in breadcrumbs or event messages lead to silently discarded events
Description
If hand-crafted stack traces contain some specific invalid characters, sentry-native will not complain and events will disappear into the ether during CURL transfer.
Another similar error happened with breadcrumbs, where I accidentally wrote completely invalid characters into the string.
I suggest improving write_json_str function inside sentry_json.c to handle all potential invalid characters. For example, replace '&' with "&", and replace invalid utf8 characters with debug hex. This would help identify more serious data corruption bugs which are currently not visible.
The compiler also complains of a buffer overflow regarding snprintf(buf, sizeof(buf), "\\u%04x", *ptr); inside write_json_str function: note: 'snprintf' output 11 bytes into a destination of size 10. It should probably be snprintf(buf, sizeof(buf), "\\u%02x", (int)*ptr & 0xff );
When does the problem happen
- [x] During build
- [ X] During run-time
- [ X] When capturing a hard crash
Environment
- OS: CentOS Linux 7
- Compiler: GCC 7.5.0, 64-bit
- CMake version and config: 3.19.5, SENTRY_BACKEND=breakpad
Steps To Reproduce
Invalid character in breadcrumbs:
std::string invalid = "crumb";
invalid.back() = (char)0xff; // forcefully corrupt the string with an invalid char
sentry_value_t crumb = sentry_value_new_breadcrumb("info", invalid.c_str());
sentry_add_breadcrumb(crumb);
// this will never make it to sentry backend, it silently disappears because of the invalid char
sentry_capture_event(sentry_value_new_message_event(SENTRY_LEVEL_INFO, "custom", "event");
Invalid character in custom stacktrace:
void addStackTraceToEvent(sentry_value_t evt, void* callstack, int num_frames)
{
sentry_value_t frames = sentry_value_new_list();
for (int i = num_frames - 1; i >= 0; --i)
{
StackFrame sf = symbolize_frame(callstack[i]);
char addr[64]; snprintf(addr, sizeof(addr), "0x%llx", (unsigned long long)sf.addr);
// forcefully corrupt the stacktrace:
sf.function[0] = '&'; // adding an ampersand is enough to make this event disappear
sentry_value_t frame = sentry_value_new_object();
sentry_value_set_by_key(frame, "function", sentry_value_new_string(sf.function));
sentry_value_set_by_key(frame, "filename", sentry_value_new_string(sf.filename));
sentry_value_set_by_key(frame, "lineno", sentry_value_new_int32(sf.lineno));
sentry_value_set_by_key(frame, "instruction_addr", sentry_value_new_string(addr));
sentry_value_append(frames, frame);
}
sentry_value_t stacktrace = sentry_value_new_object();
sentry_value_set_by_key(stacktrace, "frames", frames);
sentry_value_t thread = sentry_value_new_object();
sentry_value_set_by_key(thread, "stacktrace", stacktrace);
sentry_value_t values = sentry_value_new_list();
sentry_value_append(values, thread);
sentry_value_t threads = sentry_value_new_object();
sentry_value_set_by_key(threads, "values", values);
sentry_value_set_by_key(evt, "threads", threads);
}
sentry_value_t beforeSendEvent(sentry_value_t event, void* always_null, void* data)
{
void* callstack[128];
int num_frame = stack_trace(callstack, 128, 1);
addStackTraceToEvent(event, stacktrace, num_frames);
return event;
}
Log output
No error messages in the log. CURL POST is all OK. Event never appears in the server. Cannot share further details, since codebase is proprietary and running a self-hosted instance.
I have added my own workaround to sanitize all the strings passed into sentry, but this should definitely be fixed inside sentry to save other developers the hassle. It took me 2 weeks to figure out why the event reports were failing randomly.
This is my current workaround for reference:
static size_t string_copy(char* buffer, size_t offset, size_t max, const char* source)
{
for (const char* src = source; *src && offset < max; ++src)
buffer[offset++] = *src;
return offset;
}
// replaces all invalid characters to prevent sentry events from disappearing
static char* sanitized_string(const char* string, char* buffer, size_t max, bool allow_ampersand)
{
size_t out_len = 0;
char ch = 0;
for (const char* ptr = string; (ch = *ptr) != 0 && out_len < max; ++ptr)
{
if (!allow_ampersand && ch == '&' && strcmp(ptr, "&") != 0)
{
out_len = string_copy(buffer, out_len, max, "&");
}
else if ((unsigned char)ch > 127) // non-ascii character, disregard utf8 for now
{
char buf[12]; snprintf(buf, sizeof(buf), "\\u00%02x", ch & 0xff); // EDIT: using \\u00%02x instead
out_len = string_copy(buffer, out_len, max, buf);
}
else
{
buffer[out_len++] = ch;
}
}
buffer[out_len] = '\0';
return buffer;
}
static sentry_value_t sanitized_value(const char* s)
{
char buffer[1024];
return sentry_value_new_string(sanitized_string(s, buffer, sizeof(buffer), false));
}
This is interesting, however I don’t really understand what might be failing here.
- JSON does not mandate that you html-escape ampersands.
- However, JSON mandates that unicode escapes have 4 digits, your method only uses 2 digits.
Related, you use an explicit & 0xff on an unsigned char. I wonder why that would be needed? In general, we assume strings to be valid JSON.
In any case, you mentioned that you are running a self-hosted sentry. Which version is that?
Can you maybe provide an example envelope that I can try to reproduce this?
So I ran some tests with SaaS Sentry: Ampersands in values are supported just fine (tested with tags)

And Sentry does accept 2-digit unicode escapes, which means it is more accepting than for example firefox:

Thanks for pointing that out, I will use "\\u00%02x" instead :)
We are running Sentry 21.3.1 7e5ca7e self-hosted
Can you test invalid character 0xff inside breadcrumbs and ampersand inside stacktrace function name please? Those were the two instances where I noticed the failures, perhaps it does not happen with tags.
I will post an example envelope as soon as possible.
Here is an example envelope with an invalid character in breadcrumb and also an ampersand in the stack trace. It does not appear in Sentry Issues. When I sanitize the strings, the issue appears correctly in Sentry Issues.
This is the code I used to generate this error (I also had to disable my internal sanitization):
void reproduceInvalidReport(const std::string& ampersand_for_stacktrace)
{
using namespace mr::logging;
reportBreadcrumb(Level::Info, "breadcrumb with invalid char\xff");
reportEvent(Level::Error, "Error Event with Stack Trace");
}
This is my current workaround for reference:
static size_t string_copy(char* buffer, size_t offset, size_t max, const char* source) { for (const char* src = source; *src && offset < max; ++src) buffer[offset++] = *src; return offset; } // replaces all invalid characters to prevent sentry events from disappearing static char* sanitized_string(const char* string, char* buffer, size_t max, bool allow_ampersand) { size_t out_len = 0; char ch = 0; for (const char* ptr = string; (ch = *ptr) != 0 && out_len < max; ++ptr) { if (!allow_ampersand && ch == '&' && strcmp(ptr, "&") != 0) { out_len = string_copy(buffer, out_len, max, "&"); } else if ((unsigned char)ch > 127) // non-ascii character, disregard utf8 for now { char buf[12]; snprintf(buf, sizeof(buf), "\\u00%02x", ch & 0xff); // EDIT: using \\u00%02x instead out_len = string_copy(buffer, out_len, max, buf); } else { buffer[out_len++] = ch; } } buffer[out_len] = '\0'; return buffer; } static sentry_value_t sanitized_value(const char* s) { char buffer[1024]; return sentry_value_new_string(sanitized_string(s, buffer, sizeof(buffer), false)); }
This fix worked for me in a codebase of mine recently. Thank you!
Very unfortunate that Sentry doesn't handle this itslef.
This issue is stale. Closing. Please reopen if necessary.