jetfire icon indicating copy to clipboard operation
jetfire copied to clipboard

Thread-Safety Issues

Open adamkaplan opened this issue 9 years ago • 9 comments

The Jetfire API is not thread-safe, and it is not explicitly declared as such. Either add a note that instances of JFRWebSocket should not be accessed by multiple threads, or make the API thread-safe. My preference is for the later – putting the responsibility on the library, not the client.

Delegate Callbacks

Most of these occur on the private connection queue, not the specific delegate queue: https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L330-331 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L452-453 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L459-460 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L465-466 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L492-493 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L531-532 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L539-540 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L553-554

Unsafe Setters

In a few places, there are checks like if (!self.property) self.property = X;. These are not thread-safe because the setters are not atomic. A common very difficult-to-find crash occurs when two threads wander into such blocks and invoke the runtime-generated setter at the same time. The result will be a -release race condition – both threads can release the original value, and leak one of the new values.

https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L143-L145 https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L625-626

Unsafe Mutable Access

This one is an edge case – addHeader:forKey: adds a value to the dictionary on an arbitrary client thread, while -connect triggers enumeration of the same collection on the private connection thread. The easiest way to fix this is to use an immutable copy of the collection internally.

Self-reference Cycles

More of a side-effect of multi-threading with blocks than a bug. Using self within a block creates a strong reference which has a few funky side effects. Particularly around deallocation behavior of the instance. For example, in dequeueWrite:, a strong reference to self is created and enqueued onto operation queue. This means that the instance will have a +1 retain count for each write operation and continue to live on – processing operations – after the client/app code has released it: https://github.com/acmacalister/jetfire/blob/master/JFRWebSocket.m#L630

Most use of the self reference within async blocks should use a weak reference. You can check to see if the weak reference is nil at the start of execution, and (usually) safely retain it for the duration of the block. This should happen anywhere dispatch_async is used with self. Much written on this

adamkaplan avatar Apr 13 '15 16:04 adamkaplan

I completely agree we should make the API thread-safe. I have an assumption that will probably clear up the outstanding issues of #9 and #13 as they appear to be some sort of race condition. I will try and get a PR together that will address this, unless you already have something. :smiley:

acmacalister avatar Apr 13 '15 20:04 acmacalister

Yeah, that will probably get the project closer to safety. I was testing the water with this issue before committing to code. I have already started three branches:

  1. Splitting out the response object (pretty basic, done)
  2. Adding quick-and-dirty Autobahn tests runner (in progress)
  3. Moving the response-specific parsing into the response object (moderate change but very scary)

1 is done. 3 is blocked by 2 – I would need to have confidence in the change. Some of that is likely to conflict hard with other changes. Probably should just be mindful of it and continue on the two issues and these branches.

adamkaplan avatar Apr 13 '15 21:04 adamkaplan

Most Cocoa APIs aren't thread-safe, and Apple's convention is that the docs only point out the ones that are safe, i.e. if the docs don't say it's thread-safe, it isn't. So JetFire is in compliance with that :)

My strong belief is that in most cases thread-safety should be added only at a level that requires it. There is a lot of overhead (performance as well as code size and testing) to making an API thread-safe, and if you use one thread-safe API inside another one you're often wasting the safety work that the inner one does. This is actually the same rationale that Apple uses for not making all their APIs thread-safe. (FWIW, I worked at Apple for many years and was involved in some Cocoa API designs.)

snej avatar Apr 13 '15 21:04 snej

True enough.

The points still stand though. Much of the thread-safety issues highlighted stem from the API internals improperly notifying delegates on the wrong queues. And although I do agree with not designing for pristine thread-safe API, there are some defensive coding practices that make it a non-issue without any complexity – like calling copy on a headers dictionary that is not expected to be changed often.

adamkaplan avatar Apr 13 '15 21:04 adamkaplan

@snej, those are some solid points. I guess I'm sort of conflicted. I agree, most Cocoa APIs aren't thread safe and that does add quite a bit of overhead for the API to have "complete thread safety". Our original goal when we set out to write this library was to provide a simple, concise code base for doing WebSocket's that passed the AutoBahn suite. Thread safety was never a major concern as I didn't real thing that many people would have multiple WebSocket connection open at once. Obviously, that was an incorrect assumption :smile:. I do agree with @adamkaplan though, there are definitely some changes we could make to avoid some of the race conditions are being reported now. My hope is that between #25, #26, etc we can get most of the major race conditions addressed without adding a ton of extra code.

acmacalister avatar Apr 14 '15 19:04 acmacalister

Seems we're all on the same page: no explicit thread-safe API, but promote defensive coding practices.

It helps a lot that the class already has 3 running queues. That fact is the only reason why I suggested thread-safety (but again, backing off to just "defensive programming" where it makes sense).

adamkaplan avatar Apr 14 '15 19:04 adamkaplan

I ran into a threading issue as well. In JFRWebSocket.m, - processHTTP:length:, the library is calling websocketDidConnect: on the delegate before isConnected is set to true, depending on thread scheduling.

I've been using a dependent library that tries to send bytes into the socket on that callback, but it first checks if the socket is connected.

aaronscherbing avatar Oct 21 '15 19:10 aaronscherbing

@aaronscherbing I think I found what is wrong. It should be resolved in master with the latest commit. Thanks for the report.

daltoniam avatar Oct 26 '15 17:10 daltoniam

Great! Thank you for looking into it.

aaronscherbing avatar Oct 26 '15 17:10 aaronscherbing