Fix cold start psiphon/cache interaction in smart-dialer
User issue while using smart dialer + psiphon
Going to background and resuming, when the Android app is still running, and doing another check goes straight to the Psiphon strategy and tests it successfully. I did find one use case where it doesn't work exactly as expected. When ending up on a Psiphon fallback, storing that to the Outline strategy cache, and then doing a cold start of the app, it looks like Psiphon isn't quite ready at that point: it gets canceled and the whole DNS list starts to check. Then Psiphon wakes up, does its test, and then does it again once all the DNS strategies fail. Sending a log with this cold start.
log: https://docs.google.com/document/d/1D4ZKq0cKcHT0jQnM5NqT9EWGy3paMe5s9Nx-YSFloJw/edit?tab=t.0
This eventually still starts working, but it requires re-running the entire search instead of using the cache.
The initial Psiphon failure message is:
Failed to start dialer: Psiphon: {PropagationChannelId: [REDACTED]} newPsiphonDialer failed: failed to start psiphon dialer: context canceled
This is possibly an issue where the initial Psiphon start context has been preserved from a previous startup and is already canceled by the time it is reused.
This issue still occurs with the fix from https://github.com/Jigsaw-Code/outline-sdk/pull/523 applied.
Event sequence:
- This initial findFallback call is failing after getting the old strategy from cache
- We retry the proxyless strategies, which fail
- We make another findFallback call that succeeds
The interior failure inside psiphon.go:Start is just "context canceled"; we don't have more log info on which context was canceled. There is both a startCtx and a dialerCtx there, but the dialerCtx should be freshly created from the background context on each call.
The ctx used in the first and second findFallback call is also the same.