shadowsocks-android icon indicating copy to clipboard operation
shadowsocks-android copied to clipboard

sslocal keeps crashing

Open Mygod opened this issue 6 months ago • 3 comments

Describe the bug

Keeps crashing after a while on Android 16. @zonyitoo Any ideas?

Crash dump: https://pastesio.com/tokay16-crashlog

Configuration Put an x inside the [ ] that applies.

  • [x] IPv4 server address
  • [ ] IPv6 server address
  • [x] Client IPv4 availability
  • [ ] Client IPv6 availability
  • Encrypt method:
  • Route
    • [ ] All
    • [ ] Bypass LAN
    • [ ] Bypass China
    • [ ] Bypass LAN & China
    • [ ] GFW List
    • [ ] China List
    • [ ] Custom rules
  • [ ] IPv6 route
  • [ ] Apps VPN mode
    • [ ] Bypass mode
  • Remote DNS: 8.8.8.8
  • [ ] DNS over UDP
  • Plugin configuration (if applicable): no plugin
  • [ ] Auto Connect
  • [ ] TCP Fast Open
  • If you're not using VPN mode, please supply more details here:

Additional context Add any other context about the problem here.

Mygod avatar Jun 22 '25 05:06 Mygod

From Gemini: After reviewing the code in conjunction with the original crash log, I can identify the specific location and the nature of the bug.

The Bug is Here:

The bug is a subtle but critical issue in the Drop implementation for UdpAssociation:

impl<W> Drop for UdpAssociation<W>
where
    W: UdpInboundWrite + Send + Sync + Unpin + 'static,
{
    fn drop(&mut self) {
        self.assoc_handle.abort();
    }
}

Explanation: Why This is a Bug

  1. What drop means: When an UdpAssociation is removed from the LruCache (because it expired), its drop method is called. This is the cleanup crew coming in.

  2. What .abort() does: The line self.assoc_handle.abort() sends a cancellation request to the asynchronous task that is running the dispatch_packet loop inside UdpAssociationContext. This is a very abrupt and immediate termination. The task will be cancelled at the next .await point it hits, regardless of what it's doing.

  3. The Race Condition: The dispatch_packet loop is a complex state machine that juggles multiple asynchronous operations:

    • receiver.recv()
    • receive_from_bypassed_opt()
    • receive_from_proxied_opt()
    • keepalive_interval.tick()
    • Multiple .await points inside dispatch_received_packet()

    When .abort() is called from another thread (the one running cleanup_expired), the dispatch_packet task can be terminated at any one of these .await points.

  4. The Crash Scenario: The original backtrace showed a panic happening inside the drop of UdpAssociationContext. This is the key. The UdpAssociationContext is the state that the aborted task owns. When the task is aborted, Rust still tries to clean up its owned data by dropping UdpAssociationContext.

    However, because the task was cancelled at an arbitrary point, the UdpAssociationContext can be in an inconsistent state. For example:

    • It might be holding a lock that never gets released.
    • It might have partially sent data to a channel.
    • It could be in the middle of re-establishing a socket.

    The most likely culprit is a resource that is being used by another part of the system when the drop occurs. When the drop implementation for UdpAssociationContext runs, it might try to perform an operation that is no longer valid because the task was killed mid-stride, leading to the panic.

    The provided Drop implementation for UdpAssociationContext is simple:

    impl<W> Drop for UdpAssociationContext<W>
    where
        W: UdpInboundWrite + Send + Sync + Unpin + 'static,
    {
        fn drop(&mut self) {
            debug!("udp association for {} is closed", self.peer_addr);
        }
    }
    

    While this specific drop implementation itself doesn't do much that could panic, the resources it owns (like bypassed_ipv4_socket, proxied_socket, and the keepalive_tx channel sender) will also be dropped. If one of these resources is in a bad state due to the .abort(), its own drop implementation could be the source of the panic.

How to Fix It

The standard way to fix this in Tokio is to avoid .abort() for graceful shutdown. Instead, you should use a cooperative cancellation mechanism.

  1. Introduce a "Shutdown" Channel: Add a channel (like a tokio::sync::watch or tokio::sync::oneshot channel) to the UdpAssociationContext.
  2. Modify the Drop Implementation: The drop implementation of UdpAssociation should send a "shutdown" signal on this new channel. It should not call .abort().
  3. Modify the dispatch_packet Loop: The tokio::select! loop in dispatch_packet should be modified to listen for a message on the shutdown channel.
  4. Graceful Exit: When the shutdown signal is received, the loop should break cleanly. This allows the task to finish its current operation, clean up its state properly, and exit gracefully.

This cooperative approach ensures that the UdpAssociationContext is always in a consistent state when it is eventually dropped, preventing the "panic during drop" that is causing your application to crash.

Mygod avatar Jun 22 '25 05:06 Mygod

      #08 pc 000000000119ed70  /data/app/~~YNtuU6BX-4Di9oI8Qa2KgQ==/com.github.shadowsocks-y0WpsbERN4j9350M4QW-lg==/lib/arm64/libsslocal.so (core::panicking::panic_in_cleanup::h38ce2cd36b69578e+20)
      #09 pc 0000000000645304  /data/app/~~YNtuU6BX-4Di9oI8Qa2KgQ==/com.github.shadowsocks-y0WpsbERN4j9350M4QW-lg==/lib/arm64/libsslocal.so (core::ptr::drop_in_place$LT$shadowsocks_service..local..net..udp..association..UdpAssociationContext$LT$shadowsocks_service..local..socks..server..socks5..udprelay..Socks5UdpInboundWriter$GT$$GT$::hc22e9602ebb94b88+408)

The Socks5UdpInboundWriter somehow triggered a panic.

zonyitoo avatar Jun 23 '25 06:06 zonyitoo

My phone updated to Android 16 today. And sslocal doesn't crash. Is this still reproducible with latest android updates?

AaronChen0 avatar Oct 28 '25 04:10 AaronChen0