rt-thread icon indicating copy to clipboard operation
rt-thread copied to clipboard

[Bug] Recent object name change can cause object shadowing and logic issues 最近的对象名称变更可能导致物体阴影和逻辑问题

Open h0bbl3s opened this issue 1 month ago • 3 comments

RT-Thread Version

5.2.2

Hardware Type/Architectures

raspberry-pico

Develop Toolchain

GCC

Describe the bug

This is in reference to #10943


Description

The recent merge replaced RT_ASSERT and rt_memcpy with rt_strncpy in object.c to handle name overflows. While this prevents buffer overflows and boot loops, it introduces a critical logical vulnerability in the kernel: Object Shadowing.

Because the RT-Thread Object Container uses a LIFO (Last-In, First-Out) insertion policy, silently truncating a name causes new objects to "shadow" (hide) previously created objects with the same prefix. This leads to a situation where looking up an object by name returns the newest object, not necessarily the intended one.

A scan of possible name collisions on the repository finds a number of bsp and sdk files contain long object names that might have latent issues the new rt_strncpy method could obscure. There are also several UTEST tests that will now behave strangely if the RT_NAME_MAX setting is not high enough. Previously any misconfiguration would fail on the RT_ASSERT and be easy to debug, now the effects could be hard to predict or reproduce.

Testing

I wrote a small application that creates two semaphores with long names and then attempts to reach them by name. This was then tested on a raspberry pi pico 2 using the raspberry-pico/RP2350 bsp against the old code with RT_ASSERT and the recently merged code using rt_strncpy.

Test Code

#define TEST_NAME_1 "motor_controller_left"  
#define TEST_NAME_2 "motor_controller_right" 

void bug_demonstration(void)
{
    rt_kprintf("\n--- STARTING TRUNCATION TEST ---\n");

    /* 1. Create the "Left" Motor Semaphore */
    rt_sem_t sem_left = rt_sem_create(TEST_NAME_1, 0, RT_IPC_FLAG_FIFO);
    if (!sem_left) {
        rt_kprintf("Failed to create Left semaphore\n");
        return;
    }

    /* Cast to (struct rt_object *) to reach the 'name' member */
    rt_kprintf("Created: %s (Handle: %p)\n", ((struct rt_object *)sem_left)->name, sem_left);

    /* 2. Create the "Right" Motor Semaphore */
    rt_sem_t sem_right = rt_sem_create(TEST_NAME_2, 0, RT_IPC_FLAG_FIFO);
    if (!sem_right) {
        rt_kprintf("Failed to create Right semaphore\n");
        return;
    }
    rt_kprintf("Created: %s (Handle: %p)\n", ((struct rt_object *)sem_right)->name, sem_right);

    /* 3. THE BUG: Try to find the "Left" motor (The first one created) */
    rt_kprintf("\n--- SEARCHING FOR LEFT MOTOR ---\n");
    rt_object_t found_obj = rt_object_find(TEST_NAME_1, RT_Object_Class_Semaphore);
    
    if (found_obj == RT_NULL) {
        rt_kprintf("[!] Error: Could not find object by name.\n");
    } 
    else if (found_obj == (rt_object_t)sem_left) {
        rt_kprintf("[SUCCESS] System correctly identified the LEFT Motor.\n");
    } 
    else if (found_obj == (rt_object_t)sem_right) {
        rt_kprintf("\n[!!! CRITICAL FAILURE !!!]\n");
        rt_kprintf("Object Shadowing Detected!\n");
        rt_kprintf("We asked for: %s\n", TEST_NAME_1);
        rt_kprintf("We received Handle: %p (This is the RIGHT motor!)\n", found_obj);
        rt_kprintf("The Left Motor is now unreachable by name.\n");
    }
}

/* This macro MUST be outside of any function (Global Scope) */
MSH_CMD_EXPORT(bug_demonstration, Test object name truncation);

Results

With the old code the system crashed on the RT_ASSERT as expected, which is predictable and easy to debug. I could simply inspect my memory needs and increase the RT_NAME_MAX if safe or adjust my names to correct this.

Before Merge

msh >bug_demonstration

--- STARTING TRUNCATION TEST ---
[E/kernel.obj] Object name motor_controller_left exceeds RT_NAME_MAX=12, consider increasing RT_NAME_MAX.
(obj_name_len <= RT_NAME_MAX - 1) assertion failed at function:rt_object_allocate, line number:518 
[W/kernel.service] rt_hw_backtrace_frame_unwind is not implemented
please use: addr2line -e rtthread.elf -a -f
 0x1001ac98[W/kernel.service] rt_hw_backtrace_frame_unwind is not implemented

With the new object.c code, the test completes, but verifies that one of the objects is shadowed by the other. The first object is no longer reachable by name. Listing semaphores confirms there are two with identical names.

After Merge

msh >bug_demonstration

--- STARTING TRUNCATION TEST ---
[E/kernel.obj] Object name motor_controller_left exceeds RT_NAME_MAX=12, consider increasing RT_NAME_MAX.
Created: motor_contr (Handle: 0x20010aa8)
[E/kernel.obj] Object name motor_controller_right exceeds RT_NAME_MAX=12, consider increasing RT_NAME_MAX.
Created: motor_contr (Handle: 0x20010ae8)

--- SEARCHING FOR LEFT MOTOR ---

[!!! CRITICAL FAILURE !!!]
Object Shadowing Detected!
We asked for: motor_controller_left
We received Handle: 0x20010ae8 (This is the RIGHT motor!)
The Left Motor is now unreachable by name.

msh >list sem
semaphore    v   suspend thread
------------ --- --------------
motor_contr  000 0
motor_contr  000 0
shrx         000 0
stimer       000 1:timer
msh >

Conclusions

The related merge should be reverted until further discussion takes place. I have read through the conversations regarding this issue, and there were several good ideas, but they would all require a deal of work, thought, and testing. The original RT_ASSERT behavior, while annoying in production, is the only safe default until then. It forces the developer to acknowledge and fix the configuration mismatch (increasing RT_NAME_MAX or shortening the name) rather than letting the Kernel silently make unsafe guesses about object identity.

Why a Revert is Necessary

You might argue that code breaking "in the wild" is unlikely because existing code already passed the assertions. However, this change creates a problem waiting to happen for future development:

  • Loss of Determinism: An RTOS must be deterministic. If I request an object named "A", I must get "A" or an error. Getting "B" (because "A" was shadowed) violates the fundamental contract of the OS.

  • Invisible Failure: If a developer mistakenly uses a long name, the new code hides the error. The system appears to work but is internally corrupted (Object Shadowing). This turns a simple syntax error into a complex logic issue that may not manifest until the device is in production.

h0bbl3s avatar Nov 21 '25 04:11 h0bbl3s

This issue may require adding a new flag to bind with the object, enabling differentiation of objects exceeding RT_NAME_MAX.

Rbb666 avatar Nov 23 '25 03:11 Rbb666

Since the kernel does not check whether there are two objects sharing a same name, issues like this one cannot be solved properly in all scenarios even if there is an assertion.

My suggestion is do not access objects by name only if you can make sure the object's name is unique. Even if you initialized or created an object with a name that is unique as far as you know, how do you know there is not another object sharing that name in other libraries? Yes you can read the code but it's too troublesome if there are a lot of libraries and in some cases we don't have the source code.

Put an assertion there could be useful in you scenario but it break the compatibility for old codes so it's not what we truly need. Maybe we need more flags or APIs to tell the kernel to make sure which object's name shall be unique or which object can be anonymous. But it seems that it's a challenge to solve this issue perfectly and cause no damage to the compatibilities.

milo-9 avatar Nov 26 '25 03:11 milo-9

Since the kernel does not check whether there are two objects sharing a same name

That is a good observation. I was looking through device.c and noticed that rt_device_register already enforces uniqueness by calling rt_device_find and returning an error if a device with the name exists instead of calling rt_object_init. However this check can incorrectly fail to return an error now because truncation of an overly long name happens after this check occurs.

Perhaps for now we could add a Kconfig option such as RT_USING_STRICT_NAME_CHECKS that would enable assertion checks in rt_object_init against existing names for uniqueness. This could also include a check using RT_NAME_MAX to ensure that even with truncation, the resulting name will not shadow an existing object. This way we can ensure future code can meet this standard without breaking any current code.

My thinking for a long term solution is that rt_object_init return type should be changed from void type to rt_err_t type, and have built-in error checking for already existing names or other issues. This would require changes to several other core kernel files and most likely need to be done during a major version number change.

h0bbl3s avatar Dec 02 '25 19:12 h0bbl3s