Fix span lifecycle with smart pointers to prevent use-after-free in a…
What problem does this PR solve?
Issue Number: resolve #3068
Problem Summary:
Span lifecycle management defect in a distributed storage system:
Problem Scenario:
- Server maintains a parent span for client Write requests
- Two child spans are created for append_entries RPCs to followers
- When the first follower responds, the parent span is prematurely destroyed along with all child spans
- When the second follower responds, it attempts to access the already-freed child span, causing use-after-free
Root Cause:
- Premature deallocation: the parent span is destroyed while its child spans are still in use.
- Dangling pointer: the response callback accesses span objects that have already been freed.
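The sequence of events can be replayed in a small, self-contained sketch. It is not brpc code: `DemoSpan`, `OnFollowerResponse`, and `ReplayScenario` are hypothetical names introduced only for illustration. The sketch assumes the response callback holds a `std::weak_ptr` to its child span, so a late response observes an expired handle instead of dereferencing freed memory.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a trace span; NOT brpc's actual Span class.
struct DemoSpan {
    std::string name;
    std::vector<std::shared_ptr<DemoSpan>> children;  // parent owns its children
    std::vector<std::string> annotations;
};

// Hypothetical response callback holding only a weak handle to its child span.
// Returns true if the span was still alive and was annotated.
bool OnFollowerResponse(const std::weak_ptr<DemoSpan>& weak_child,
                        const std::string& note) {
    std::shared_ptr<DemoSpan> child = weak_child.lock();
    if (!child) {
        return false;  // span already destroyed: skip instead of dereferencing
    }
    child->annotations.push_back(note);
    return true;
}

// Replays the scenario and returns {first_response_ok, second_response_ok}.
std::pair<bool, bool> ReplayScenario() {
    auto parent = std::make_shared<DemoSpan>();
    parent->name = "Write";

    auto child1 = std::make_shared<DemoSpan>();
    child1->name = "append_entries to follower 1";
    auto child2 = std::make_shared<DemoSpan>();
    child2->name = "append_entries to follower 2";
    parent->children.push_back(child1);
    parent->children.push_back(child2);

    std::weak_ptr<DemoSpan> weak1 = child1;
    std::weak_ptr<DemoSpan> weak2 = child2;
    child1.reset();  // from here on, only the parent owns the children
    child2.reset();

    bool first = OnFollowerResponse(weak1, "follower 1 replied");
    parent.reset();  // models the premature destruction after the first response
    bool second = OnFollowerResponse(weak2, "follower 2 replied");
    return {first, second};
}
```

With a raw pointer in place of `weak2`, the second call would touch freed memory. With the weak handle, the callback can detect that the span is gone and return early.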
Reference counting relationship
- The parent span holds strong references to its child spans, while each child span holds a weak reference to the parent span.
- RPC Done holds a strong reference to the parent span (i.e., the server span), so that reference is released only when RPC Done executes, after trace recording has completed.
- SpanContainer holds a strong reference to the parent span, so that reference is released only after the background collector thread finishes dumping the trace to the database.
- Controller holds weak references to the parent span and child spans. These do not affect the lifetime of the spans, and they avoid access through dangling pointers.
graph LR
subgraph "Parent Span (WriteChunk)"
PS["Parent Span<br/>ref_count = 2<br/>trace_id: 12345<br/>span_id: 67890"]
end
subgraph "Strong reference holders (shared_ptr)"
RD["RPC Done Callback<br/>SendRpcResponse<br/>shared_ptr<Span>"]
SC["SpanContainer<br/>background collector<br/>shared_ptr<Span>"]
end
subgraph "Weak reference holders (weak_ptr)"
PC["Controller<br/>_span<br/>weak_ptr<Span>"]
CS1["Child Span 1<br/>_local_parent<br/>weak_ptr<Span>"]
CS2["Child Span 2<br/>_local_parent<br/>weak_ptr<Span>"]
end
subgraph "Child Spans"
CS1_DETAIL["Child Span 1<br/>ref_count = 1<br/>append_entries to Follower1<br/>trace_id: 12345<br/>span_id: 67891"]
CS2_DETAIL["Child Span 2<br/>ref_count = 1<br/>append_entries to Follower2<br/>trace_id: 12345<br/>span_id: 67892"]
end
%% Strong references to the Parent Span
RD -->|"+1 ref_count"| PS
SC -->|"+1 ref_count"| PS
%% Weak references to the Parent Span
PC -.->|"does not increase ref_count"| PS
CS1 -.->|"does not increase ref_count"| PS
CS2 -.->|"does not increase ref_count"| PS
%% Parent Span holds the Child Spans
PS -->|"_client_list<br/>shared_ptr<br/>+1 ref_count"| CS1_DETAIL
PS -->|"_client_list<br/>shared_ptr<br/>+1 ref_count"| CS2_DETAIL
%% Weak references held by the Child Spans
CS1_DETAIL -.->|"_local_parent<br/>weak_ptr"| PS
CS2_DETAIL -.->|"_local_parent<br/>weak_ptr"| PS
%% Controller weak references to the Child Spans
C1["Controller 1<br/>weak_ptr"] -.-> CS1_DETAIL
C2["Controller 2<br/>weak_ptr"] -.-> CS2_DETAIL
%% Styles
classDef parentSpan fill:#e1f5fe,stroke:#01579b,stroke-width:4px
classDef childSpan fill:#f3e5f5,stroke:#4a148c,stroke-width:3px
classDef strongRef fill:#e8f5e8,stroke:#2e7d32,stroke-width:3px
classDef weakRef fill:#fff3e0,stroke:#e65100,stroke-width:2px
class PS parentSpan
class CS1_DETAIL,CS2_DETAIL childSpan
class RD,SC strongRef
class PC,C1,C2 weakRef
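The reference counts in the diagram can be checked with a minimal model built on `std::shared_ptr` and `std::weak_ptr`. This is a sketch under assumptions, not the PR's implementation: `SpanNode`, `OwnershipDemo`, and `BuildGraph` are hypothetical names, and only the member names `local_parent` and `client_list` echo the `_local_parent` and `_client_list` labels above.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical model of the ownership graph; NOT brpc's actual Span class.
struct SpanNode {
    std::weak_ptr<SpanNode> local_parent;                // child -> parent: weak
    std::vector<std::shared_ptr<SpanNode>> client_list;  // parent -> children: strong
};

// Hypothetical bundle of the handles that outlive span creation.
struct OwnershipDemo {
    std::shared_ptr<SpanNode> rpc_done_ref;        // strong: RPC Done callback
    std::shared_ptr<SpanNode> span_container_ref;  // strong: SpanContainer
    std::weak_ptr<SpanNode> controller_ref;        // weak: Controller -> parent span
    std::weak_ptr<SpanNode> child_observer;        // weak: Controller -> child span 1
};

OwnershipDemo BuildGraph() {
    auto parent = std::make_shared<SpanNode>();
    for (int i = 0; i < 2; ++i) {
        auto child = std::make_shared<SpanNode>();
        child->local_parent = parent;          // weak, does not bump the count
        parent->client_list.push_back(child);  // strong, child count becomes 1
    }
    OwnershipDemo demo;
    demo.rpc_done_ref = parent;
    demo.span_container_ref = parent;
    demo.controller_ref = parent;
    demo.child_observer = parent->client_list[0];
    return demo;  // the local `parent` handle is dropped here
}
```

In this model the parent's `use_count()` is 2 while both strong holders are alive, and each child's is 1. Releasing `rpc_done_ref` alone keeps the whole graph alive. Only after `span_container_ref` is also released do the parent and its children expire, at which point `controller_ref.lock()` returns an empty pointer rather than a dangling one.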
What is changed and the side effects?
Changed:
Side effects: NO
-
Performance effects: NO
-
Breaking backward compatibility: NO
Check List:
- Please make sure your changes are compilable.
- When providing us with a new feature, it is best to add related tests.
- Please follow Contributor Covenant Code of Conduct.