xmtp-ios icon indicating copy to clipboard operation
xmtp-ios copied to clipboard

Fix retain cycle in Client

Open yewreeka opened this issue 4 months ago β€’ 2 comments

Introduction πŸ“Ÿ

Client kept a strong reference to Conversations, PrivatePreferences, and XMTPDebugInformation, all which kept a strong reference to Client. I also noticed that XmtpApiClient objects hang around because of ApiClientCache, but not sure if that's really an issue worth worrying about.

Purpose ℹ️

Fix retain cycle in Client

Testing πŸ§ͺ

Ran tests and checked the memory graph in the example app

yewreeka avatar Aug 14 '25 00:08 yewreeka

Break the Client retain cycle by making Conversations.client and PrivatePreferences.client unowned and by updating Client.debugInformation to initialize XMTPDebugInformation with environment.getHistorySyncUrl() in Client.swift

Switch Conversations.client and PrivatePreferences.client to unowned, and change XMTPDebugInformation to store a historySyncUrl and accept it via init(historySyncUrl:ffiClient:), with Client.debugInformation passing environment.getHistorySyncUrl(); whitespace-only edits in ApiClientCache.

πŸ“Where to Start

Start with the XMTPDebugInformation.init(historySyncUrl:ffiClient:) changes and usage in Client.debugInformation in Sources/XMTPiOS/Client.swift, then review Conversations.client in Sources/XMTPiOS/Conversations.swift and PrivatePreferences.client in Sources/XMTPiOS/PrivatePreferences.swift.

Changes since #550 opened

  • Changed Conversations class to use a weak reference to Client instead of unowned, added requireClient() helper method that throws ClientError.clientDeallocated when client is nil, and updated all methods including findGroup, findConversation, findConversationByTopic, findDmByInboxId, findDmByIdentity, listGroups, listDms, list, stream, findOrCreateDmWithIdentity, findOrCreateDm, newGroupInternalWithIdentities, newGroupInternal, newGroupOptimistic, and fromWelcome to call requireClient() and use the local client reference [a1d84d2]
  • Removed client references from PrivatePreferences actor by removing the unowned client property, changing the initializer to accept only ffiClient parameter, and updated streamConsent and streamPreferenceUpdates methods to construct callbacks without passing client [a1d84d2]
  • Removed client references from ConsentCallback and PreferenceCallback classes by removing the client stored property and client parameter from their initializers [a1d84d2]
  • Added clientDeallocated case to ClientError enum with description returning a specific error message, and updated Client.preferences property to call PrivatePreferences.init(ffiClient:) without passing client: self argument [a1d84d2]

Macroscope summarized 061222b.

macroscopeapp[bot] avatar Aug 14 '25 00:08 macroscopeapp[bot]

Claude finished @yewreeka's task β€”β€” View job


Code Review Complete

I've thoroughly analyzed the retain cycle fix in this PR. Here's my comprehensive feedback:


βœ… Overall Assessment

This PR successfully fixes the retain cycles between Client and its child objects. The approach is sound and the implementation is clean. There are a few minor concerns worth addressing, but the core fix is solid.


πŸ“‹ What Changed

1. XMTPDebugInformation - Removed Client Reference (XMTPDebugInformation.swift:11-16)

  • Before: Held a strong reference to Client
  • After: Only stores historySyncUrl: String and ffiClient
  • Impact: βœ… Clean solution, breaks cycle without compromising functionality
  • The uploadDebugInformation method at line 35 correctly uses the stored URL as a fallback

2. PrivatePreferences - Removed Client Reference (PrivatePreferences.swift:50-54)

  • Before: Had var client: Client creating a strong reference cycle
  • After: Removed client entirely, only keeps ffiClient
  • Impact: βœ… Excellent - the actor never actually used client, so removal is the cleanest solution
  • Updated initializer at Client.swift:196-198

3. Conversations - Changed to Weak Reference (Conversations.swift:126)

  • Before: Strong reference to Client (causing cycle)
  • After: private weak var client: Client?
  • Impact: βœ… Proper weak reference with safety checks
  • Added requireClient() helper method (lines 139-144) that throws ClientError.clientDeallocated
  • Made the client property private to prevent external misuse

4. ApiClientCache - Fixed NSMapTable Bug (Client.swift:158-177)

  • Before: Used NSMapTable.weakToWeakObjects() which had a critical bug (keys deallocated immediately)
  • After: Changed to regular Swift dictionary [String: XmtpApiClient]
  • Impact: βœ… Fixes the bug identified in macroscopeapp's review comment
  • Note: This means API clients are now cached with strong references (likely intentional for performance)

🎯 Strengths

1. Appropriate Use of Different Strategies The PR uses the right approach for each case:

  • Removed reference where not needed (PrivatePreferences, XMTPDebugInformation)
  • Weak reference where needed (Conversations)
  • Fixed caching bug (ApiClientCache)
  • Shows good architectural judgment

2. Proper Error Handling (Conversations.swift:139-144)

private func requireClient() throws -> Client {
    guard let client = client else {
        throw ClientError.clientDeallocated
    }
    return client
}

Excellent pattern - provides clear error feedback when client is deallocated.

3. Added clientDeallocated Error Case (Client.swift:10, 21-22) New error case with helpful description makes debugging easier.

4. Privacy Improvement (Conversations.swift:126) Making client private reduces the API surface and prevents external code from accessing a potentially nil weak reference.

5. Fixed Critical NSMapTable Bug The previous weakToWeakObjects() approach would cause cache misses on every lookup because the bridged NSString keys had no strong references and would immediately deallocate. Good catch!


⚠️ Concerns & Recommendations

1. Incomplete Client Validation in Conversations (Medium Priority)

Several methods in Conversations access ffiClient or ffiConversations without calling requireClient():

  • findGroup (line 158)
  • findMessage (line 240)
  • findEnrichedMessage (line 252)
  • deleteMessageLocally (line 263)
  • sync (line 267)
  • Many others...

Issue: Since client is now weak, theoretically these could execute after Client is deallocated. The ffiClient and ffiConversations are stored directly (not weak), so they might outlive the parent Client.

Recommendation:

  • Option A (Safest): Add try requireClient() at the start of all public methods
  • Option B: Document that ffiClient/ffiConversations have the same lifetime as Client and are safe to use
  • Option C: Store both as weak references as well

Example fix for Option A:

public func findGroup(groupId: String) throws -> Group? {
    let client = try requireClient()  // Add this
    do {
        return try Group(
            ffiGroup: ffiClient.conversation(conversationId: groupId.hexToData),
            client: client
        )
    } catch {
        return nil
    }
}

2. Stream Methods Need Weak Reference Handling (Medium Priority)

In Conversations.stream() (lines 385-461):

  • Line 391 checks if client is nil initially βœ…
  • But the callback closure captures client at lines 409, 416
  • Since client is weak, it could become nil during stream execution

Current code (line 396-424):

let conversationCallback = ConversationStreamCallback {
    conversation in
    Task {
        // ... 
        continuation.yield(
            Conversation.dm(
                conversation.dmFromFFI(client: client)  // ⚠️ client might be nil here
            )
        )
    }
}

Recommendation: Add nil check inside the callback:

let conversationCallback = ConversationStreamCallback {
    conversation in
    Task {
        guard let client = self.client else {
            // Client deallocated during stream - silently skip or log
            return
        }
        // ... rest of processing
    }
}

Same issue exists in streamAllMessages() at line 713-769.

3. ApiClientCache Memory Implications (Low Priority - Design Question)

The change from NSMapTable.weakToWeakObjects() to [String: XmtpApiClient] means:

  • Before: Attempted weak caching (but was broken)
  • After: Strong caching (API clients stay in memory)

Question: Is it intentional that XmtpApiClient instances now persist in the cache indefinitely?

  • This may be desirable for performance (connection reuse)
  • But it means they won't be deallocated even when no Client references them

If weak caching was intended, consider using NSMapTable.strongToWeakObjects() instead (as suggested in macroscopeapp's original review comment).


πŸ§ͺ Testing Recommendations

The PR mentions testing with the example app and memory graph analysis βœ…. Consider adding:

Unit Tests:

  1. Test that ClientError.clientDeallocated is thrown when accessing Conversations after Client deallocation
  2. Test stream behavior when Client is deallocated mid-stream
  3. Verify no retain cycles with a leak detector in tests

Example test structure:

func testClientDeallocation() async throws {
    var client: Client? = try await createTestClient()
    let conversations = client!.conversations
    
    // Deallocate client
    client = nil
    
    // Should throw clientDeallocated
    XCTAssertThrowsError(try conversations.sync()) { error in
        XCTAssertEqual(error as? ClientError, .clientDeallocated)
    }
}

πŸ“Š Summary

Aspect Rating Notes
Retain Cycle Fix βœ… Excellent All cycles properly broken
Architecture βœ… Good Right strategy for each case
Memory Safety ⚠️ Mostly Safe Needs weak checks in streams
Error Handling βœ… Good requireClient() pattern is clean
Code Quality βœ… High Clean, readable changes
Bug Fixes βœ… Excellent Fixed NSMapTable bug
Test Coverage ⚠️ Could Improve Add deallocation edge case tests

🎯 Action Items

Recommended before merge:

  1. ⚠️ Add nil checks in stream callback closures (stream(), streamAllMessages())
  2. ⚠️ Add requireClient() calls to public methods that use ffiClient/ffiConversations, or document why it's safe to skip

Nice to have: 3. πŸ’‘ Clarify/document whether strong caching of XmtpApiClient is intentional 4. πŸ’‘ Add unit tests for client deallocation scenarios 5. πŸ’‘ Resolve/respond to macroscopeapp's review comments (the NSMapTable one is addressed, the unowned one is outdated)


✨ Conclusion

This is a well-thought-out fix that appropriately addresses the retain cycle issue. The choice to:

  • Remove the Client reference from PrivatePreferences entirely (cleanest!)
  • Remove it from XMTPDebugInformation (also clean!)
  • Use weak with proper error handling for Conversations (necessary and well-done!)
  • Fix the NSMapTable bug in ApiClientCache (great catch!)

...shows excellent architectural judgment.

The main concerns are around ensuring weak reference handling is complete in all code paths, particularly in streaming scenarios. With those minor improvements, this PR will be production-ready.

Great work on identifying and fixing these cycles! πŸŽ‰


claude[bot] avatar Aug 18 '25 22:08 claude[bot]

thanks for finding this @jarodl . I added small change to remove the client reference from debuginformation. I also made the client reference in Conversations private, to minimize the chance of someone trying to acceess it when it's not available.

There is a small chance an integrator uses the SDK in an untended way holding a reference to a client.conversations() and then trying to use it when the client is not in memory. But hopefully suggested use cases in our docs will make that rare enough, so still think fixing memory leak possibilities here is the better way to go.

If you can suggest a follow up at some point to pass in some context provider instead of the client so we dont have the risk of someone trying to access the unowned client reference when its not available, that would be awesome. But this is a great improvement πŸ‘

I updated this to use a weak reference to avoid crashing in case it is used improperly, and to throw an error if the client has been deallocated. Let me know what you think! In Convos we're only keeping a max number of clients active at a time now, so this retain cycle is much more obvious now. Thank you!

yewreeka avatar Dec 17 '25 05:12 yewreeka