ASAN leaks in test codes related to mock_utils::patch_and_return
Bug report
Required Info:
- Operating System:
- Ubuntu 20.04
- Installation type:
- source
- Version or commit hash:
- c83119c826c75fbaf6e741658927a953318832ad
- DDS implementation:
-
rmw_cyclone_dds
-
- Client library (if applicable):
- rcl
Steps to reproduce issue
Build with ASAN and run test_publisher__rmw_cyclonedds_cpp .
I got the following error.
24: ==16400==ERROR: LeakSanitizer: detected memory leaks
24:
24: Direct leak of 72 byte(s) in 1 object(s) allocated from:
24: #0 0x7f384f9f0bc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
24: #1 0x7f384f7b72e6 in __default_allocate /src/ros2_rolling_asan/src/ros2/rcutils/src/allocator.c:37
24: #2 0x7f384f7d0fa7 in rcutils_string_map_init /src/ros2_rolling_asan/src/ros2/rcutils/src/string_map.c:62
24: #3 0x7f384f8832fd in rcl_resolve_name /src/ros2_rolling_asan/src/ros2/rcl/rcl/src/rcl/node_resolve_name.c:47
24: #4 0x7f384f883eeb in rcl_node_resolve_name /src/ros2_rolling_asan/src/ros2/rcl/rcl/src/rcl/node_resolve_name.c:152
24: #5 0x7f384f87d42a in rcl_publisher_init /src/ros2_rolling_asan/src/ros2/rcl/rcl/src/rcl/publisher.c:81
24: #6 0x55a3171caf58 in TestPublisherFixture__rmw_cyclonedds_cpp_test_mocks_fail_publisher_init_Test::TestBody() /src/ros2_rolling_asan/src/ros2/rcl/rcl/test/rcl/test_publisher.cpp:787
Expected behavior
no leak
Actual behavior
(may be) leak.
As this test case looks an error case test, so I'm sorry if this leak is an expected behavior.
Additional information
This happens at the following block.
// test_publisher.cpp
TEST_F(CLASSNAME(TestPublisherFixture, RMW_IMPLEMENTATION), test_mocks_fail_publisher_init) {
// snip
{
// Internal failure when fini rcutils_string_map returns error, targets substitution_map fini
auto mock = mocking_utils::patch_and_return(
"lib:rcl", rcutils_string_map_fini, RCUTILS_RET_ERROR);
ret = rcl_publisher_init(&publisher, this->node_ptr, ts, topic_name, &publisher_options);
EXPECT_EQ(RCL_RET_ERROR, ret) << rcl_get_error_string().str;
rcl_reset_error();
}
// snip
}
It looks:
-
rcutils_string_map_finiis patched and only returnsRCUTILS_RET_ERROR. - so, memories allocated by
rcutils_string_map_initare not deallocated.
By using inject_on_return this error message disappeared. Additionally, I think we can check the behavior when rcutils_string_map_fini returns an error.
auto mock = mocking_utils::inject_on_return(
"lib:rcl", rcutils_string_map_fini, RCUTILS_RET_ERROR);
I want to note that there are some leaks related to mock test not only in test_publisher but also in other tests.
If this fix is correct, I'm glad to fix all tihs kind of leaks and send PR.
By using
inject_on_returnthis error message disappeared. Additionally, I think we can check the behavior whenrcutils_string_map_finireturns an error.
For the most part, this would be the right way to go. There is one big problem with using inject_on_return; it is not supported on RHEL/CentOS. While that isn't currently one of our supported platforms, it is a platform we are trying to get working.
I'm not quite sure of the way forward. One thing we could try is to figure out how to skip these type of tests specifically on RHEL/CentOS. Then we could switch these to use inject_on_return, as they would just be skipped there. What do you think?
Thank you for comment.
Hmm, I rather want to run inject_on_return on RHEL/CentOS 😅
BTW, does inject_on_return work correctly on MacOS or Windows?
Hmm, I rather want to run
inject_on_returnon RHEL/CentOS sweat_smile
The other way forward is to dig into mimick and try and figure out a way to make inject_on_return work on RHEL. I don't know all of the details, but @hidmic does and may be able to provide you some information.
BTW, does
inject_on_returnwork correctly on MacOS or Windows?
As far as I know, yes, with some caveats. But again, @hidmic may be able to give you better information.
+1 👍 for using inject_on_return
As @clalancette, inject_on_return works everywhere but on RHEL/CentOS. For reasons that remain obscure to me, unlike other Linux distributions, binaries in CentOS yield .plt trampoline addresses in lieu of that of functions and somehow manage to not break ODR. An optimization? A global .got.plt table? Documentation down there is very scarce, so I wasn't able to find a way.
I was thinking a way of fixing inject_or_return, which currently does: stores the original function pointer, poisons the trampoline table with a function that calls the original function and then returns something else.
That doesn't work if the function pointer is pointing to the trampoline (you get infinite recursion).
On the other hand, that function could do one of the following:
- Read the current trampoline and get a pointer to the original function from it (I'm not completely sure if this is possible).
- Read the current trampoline, poison the table with a new function that: restores the old trampoline, calls the old function (now that the trampoline was restored), poisons the trampolines table again.
Note: this one is a problem if the function being mocked is recursive, but that's not usually the case for
inject_on_return(used mainly for *_fini functions). This implementation could be renamedinject_on_return_once, just in case.
Both ideas are tricky and I don't know much about mimick internals, but I think they should work.
I don't know much about it, so I'm sorry if I'm wrong.
A global .got.plt table?
I have heard it in the security context.
Is it related with RELRO(RELocation ReadOnly)?
According to the literature, .got.plt table is used when RELRO is not Full RELRO.
(But sorry, I can only give Japanese article: https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_uri&item_id=187361&file_id=1&file_no=1)
As I don't use CentOS, so I build rolling on Fedora 31(https://docs.ros.org/en/rolling/Installation/Fedora-Development-Setup.html). We can see security setting by https://github.com/slimm609/checksec.sh.
Here is the result.
# ubuntu
$ ../test/checksec.sh/checksec --file=build-asan/rcl/test/test_publisher__rmw_cyclonedds_cpp
RELRO STACK CANARY NX PIE RPATH RUNPATH Symbols FORTIFY Fortified Fortifiable FILE
Full RELRO Canary found NX disabled PIE enabled No RPATH RW-RUNPATH 5872) Symbols No 0 10 build-asan/rcl/test/test_publisher__rmw_cyclonedds_cpp
# Fedora
# ./checksec.sh/checksec --file=build/rcl/test/test_publisher__rmw_cyclonedds_cpp
RELRO STACK CANARY NX PIE RPATH RUNPATH Symbols FORTIFY Fortified Fortifiable FILE
Partial RELRO No canary found NX disabled No PIE RW-RPATH No RUNPATH 5803) Symbols No 0 10 build/rcl/test/test_publisher__rmw_cyclonedds_cpp
In Fedora, Partial RELRO is used.
It looks we can use full RELRO by gcc -Wl,-z,norelr for example.
I hope we can run inject_on_return on Fedora 31 according to this gist.
This is the summary:
Summary
- By building the execution file as
Full RELROandPIE, the warningmocking_utils::inject_on_return() cannot forward calllooks disapear. UT runs and the behavior changes.- Ubuntu and CentOS have different default settings about these.
- As I don't know mimick or
mocking_utilswell, so I want someone to judge this change is right or not. - Additionally, if right, I'm happy if someone writes PR (I'm sorry, but we need to prioritize other tasks).
Thank you.
Thank you for the investigation @y-okumura-isp. I'll try to come back to this later on. If you're right, perhaps we can export these flags from mimick targets directly.