core icon indicating copy to clipboard operation
core copied to clipboard

Improve Memory Safety

Open viferga opened this issue 3 years ago • 4 comments

🚀 Feature

The main idea of this is to produce a stable way to avoid and maintain memory leaks as low as possible. Even if we use options like Rust, most of our dependencies are C/C++ and this forces us to have a standard way of tracking, solving and maintaining memory leaks and undefined behaviors to the minimum.

Is your feature request related to a problem?

Yes, recently I have suffered with Python Loader in order to track some memory leaks in the Garbage Collector when refactoring to the new import system. Introducing new code is hard and error prone. Memory leaks are very hard to track, specially when related to Garbage Collectors or similar, where there is much more noise that hides the real problem. For example, a bad reference counter in Python can lead to a segmentation fault. Or a circular dependency can lead to memory leaks and eventually (specially in intensive contexts like FaaS) can generate a Out-Of-Memory Kill of the process.

Describe the solution you'd like

There are some solutions that I came up with, they are not mutually exclusive, and it is just an approximation, we can find others or improve these ones.

  1. Prepare a Docker image or some environment (maybe Guix?) building all runtimes with sanitizer and debug flags (for example Python allows to configure with debug and sanitizer options before build, https://devguide.python.org/clang/ or https://github.com/python/cpython/blob/d1b81574edd75e33ae85c525ac988ce772675a07/configure.ac#L3011 ), NetCore also needs to compile with those flags otherwise address sanitizer doesn't work well ( https://github.com/dotnet/runtime/blob/01b7e73cd378145264a7cb7a09365b41ed42b240/src/coreclr/enablesanitizers.sh ). Valgrind and Asan are mutually exclusive, so we should have this kind of tests duplicated, once to run with static analysis and another for running with dynamic analysis. Some Valgrind parts will need of suppression files to avoid false positives (specially with free lock data structures and atomics). This approach can be automated with very intensive testing (even fuzzers), which may be so hard to run (in terms of time, cpu and memory consumption) but very effective to detect a lot of errors.

  2. Move to Rust the implementation of the loaders, for example using PyO3 ( https://github.com/PyO3/pyo3 ). Rust can avoid many memory leaks and improve security. But it is not a silver bullet neither. Most projects like PyO3, Neon ( https://github.com/neon-bindings/neon ) and Rusty_v8 ( https://github.com/denoland/rusty_v8 ) assume that the unsafe calls to C/C++ are already safe because they trust the developers of Python, NodeJS and V8. This assumption can introduce bugs that escape to the borrow checker. Another drawback is that Rust does not interface well with C/C++, instead of that, rusty_v8 has to generate wrappers around V8 that expose it as C API, similar to NodeJS NAPI. This is acceptable but implies to maintain the interface. Probably there are other automated options nowadays ( https://rust-lang.github.io/rust-bindgen/cpp.html ) but it is something that must be taken into account.

  3. Follow the pattern of NodeJS Loader ( https://github.com/metacall/core/blob/a5b97437655d7186c87a1b7d535c423e3625b80e/source/loaders/node_loader/bootstrap/lib/bootstrap.js ), C# Loader ( https://github.com/metacall/core/tree/develop/source/loaders/cs_loader/netcore/source ) or TypeScript Loader ( https://github.com/metacall/core/blob/a5b97437655d7186c87a1b7d535c423e3625b80e/source/loaders/ts_loader/bootstrap/lib/bootstrap.ts ) and move all the code we can from C/C++ to Python, and bootstrap with higher level languages the loaders (this can still lead to leaks but it's usually much less than in C/C++). Some loaders can be completely written in the language itself, for example Crystal Loader (which is not implemented yet) can be done almost completely in Crystal ( https://github.com/metacall/core/blob/develop/source/loaders/cr_loader/crystal/cr_loader_impl.cr ).

viferga avatar May 13 '21 10:05 viferga

Here's few commits that prevent memory bugs derived from user input:

  • https://github.com/metacall/core/commit/717086d91d74425a14c4e6d7b38fd32037a96157
  • https://github.com/metacall/core/commit/a28f50bd89f35ad48014951e36cf804b25a9381f
  • https://github.com/metacall/core/commit/e7cd9feeb8b184acd4f1a90f0c8814a84e41be7c

The last one contains information for when we implement our own memory pool, in order to detect user after free safely, without accessing the memory already freed to detect it.

viferga avatar Jan 19 '22 08:01 viferga

I have improved a bit the point 1. It is a base, cs_loader has a lot of leaks but they are difficult to track properly unless we build NetCore with sanitizer flags.

I have added ./docker-compose.sh test which builds with sanitizer options and (soon) with the all the languages at once.

It reports at the end the number of leaks originated and the summary of the failed tests.

SUMMARY: AddressSanitizer: 832 byte(s) leaked in 12 allocation(s).
SUMMARY: AddressSanitizer: 208 byte(s) leaked in 3 allocation(s).
SUMMARY: AddressSanitizer: 208 byte(s) leaked in 3 allocation(s).
SUMMARY: AddressSanitizer: 791100 byte(s) leaked in 473 allocation(s).
SUMMARY: AddressSanitizer: 270 byte(s) leaked in 10 allocation(s).
SUMMARY: AddressSanitizer: 39624 byte(s) leaked in 67 allocation(s).
SUMMARY: AddressSanitizer: 524 byte(s) leaked in 10 allocation(s).
SUMMARY: AddressSanitizer: 10784 byte(s) leaked in 117 allocation(s).
SUMMARY: AddressSanitizer: 50664738 byte(s) leaked in 431 allocation(s).
SUMMARY: AddressSanitizer: 5198 byte(s) leaked in 94 allocation(s).
SUMMARY: AddressSanitizer: 40 byte(s) leaked in 1 allocation(s).
SUMMARY: AddressSanitizer: 49387 byte(s) leaked in 814 allocation(s).
SUMMARY: AddressSanitizer: 208 byte(s) leaked in 3 allocation(s).
SUMMARY: AddressSanitizer: 432 byte(s) leaked in 16 allocation(s).
SUMMARY: AddressSanitizer: 240 byte(s) leaked in 2 allocation(s).
SUMMARY: AddressSanitizer: 50575670 byte(s) leaked in 426 allocation(s).
SUMMARY: AddressSanitizer: 50659222 byte(s) leaked in 421 allocation(s).
SUMMARY: AddressSanitizer: 200 byte(s) leaked in 4 allocation(s).
SUMMARY: AddressSanitizer: 416 byte(s) leaked in 6 allocation(s).
SUMMARY: AddressSanitizer: 208 byte(s) leaked in 3 allocation(s).
SUMMARY: AddressSanitizer: 10784 byte(s) leaked in 117 allocation(s).
SUMMARY: AddressSanitizer: 10784 byte(s) leaked in 117 allocation(s).
SUMMARY: AddressSanitizer: 10784 byte(s) leaked in 117 allocation(s).
SUMMARY: AddressSanitizer: 7273 byte(s) leaked in 56 allocation(s).
SUMMARY: AddressSanitizer: 7190 byte(s) leaked in 54 allocation(s).
SUMMARY: AddressSanitizer: 2206 byte(s) leaked in 35 allocation(s).
SUMMARY: AddressSanitizer: 16978 byte(s) leaked in 89 allocation(s).
SUMMARY: AddressSanitizer: 74773310 byte(s) leaked in 582 allocation(s).
Number of leaks detected: 4083

Here's the complete log: output_sanitizer.txt

We should add this as a must for each commit (maybe with CI, but it will waste the free tier easily...) and all users should be using this to verify the behavior of their changes.

An addition we should do is provide also tests with valgrind as a complement to sanitizer.

viferga avatar Feb 11 '22 13:02 viferga

I have added leak suppression for DotNet Core ( https://github.com/metacall/core/commit/262ecd1f9d76f8d8e18422c8faa8b8b5d2dd43af ), the current amount of leaks is the following:

SUMMARY: AddressSanitizer: 791100 byte(s) leaked in 473 allocation(s).
SUMMARY: AddressSanitizer: 10856 byte(s) leaked in 122 allocation(s).
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/local/metacall/source/loader/source/loader_impl.c:420 in loader_impl_get
SUMMARY: AddressSanitizer: 5198 byte(s) leaked in 94 allocation(s).
SUMMARY: AddressSanitizer: 1698 byte(s) leaked in 30 allocation(s).
SUMMARY: AddressSanitizer: 48884 byte(s) leaked in 805 allocation(s).
SUMMARY: AddressSanitizer: 200 byte(s) leaked in 4 allocation(s).
SUMMARY: AddressSanitizer: 96 byte(s) leaked in 6 allocation(s).
SUMMARY: AddressSanitizer: 10856 byte(s) leaked in 122 allocation(s).
SUMMARY: AddressSanitizer: 10856 byte(s) leaked in 122 allocation(s).
SUMMARY: AddressSanitizer: 10856 byte(s) leaked in 122 allocation(s).
SUMMARY: AddressSanitizer: 7345 byte(s) leaked in 61 allocation(s).
SUMMARY: AddressSanitizer: 2270 byte(s) leaked in 40 allocation(s).
SUMMARY: AddressSanitizer: 7262 byte(s) leaked in 59 allocation(s).
SUMMARY: AddressSanitizer: 2270 byte(s) leaked in 40 allocation(s).
Number of leaks detected: 2100

viferga avatar Feb 23 '22 14:02 viferga

Almost there: https://github.com/metacall/core/commit/69343205896f32fc41b492fc3c503c01d6109d74 https://github.com/metacall/core/commit/1ac6afd6d0d7bd6bb992cb115b0d59fe5d07de65

SUMMARY: AddressSanitizer: heap-use-after-free /usr/local/metacall/source/loader/source/loader_impl.c:1002 in loader_impl_handle_validate
SUMMARY: AddressSanitizer: 791100 byte(s) leaked in 473 allocation(s).
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/local/metacall/source/loader/source/loader_impl.c:463 in loader_impl_get
SUMMARY: AddressSanitizer: 5198 byte(s) leaked in 94 allocation(s).
SUMMARY: AddressSanitizer: 200 byte(s) leaked in 4 allocation(s).
Number of leaks detected: 571

viferga avatar Mar 02 '22 14:03 viferga