can-utils icon indicating copy to clipboard operation
can-utils copied to clipboard

slcand keeps running when underlying USB dongle is detached

Open muggenhor opened this issue 9 years ago • 9 comments

When using slcand on a USB can dongle (providing a serial-over-USB device as /dev/ttyUSBx) and detaching that dongle slcand doesn't detect that event and keeps running.

As a way of detecting this I've replaced the sleep(1) call with select(readfds=[fd], timeout=1) and terminate when a read() call returns 0 (EOF). Considering that the underlying device is gone, resetting the line discipline and terminal properties isn't necessary anymore either.

Example solution:

/* The Big Loop */
while (slcand_running)
{
   /* Wait for one second or until the CAN dongle gets detached */
   struct pollfd fds = {
       .fd = fd,
       .events = 0 /* POLLHUP doesn't need to be selected as to-be-waited-for */,
   };
   const int retval = poll(&fds, 1, 1000);

    /* EOF on serial line: assume it's a USB serial device that got pulled out and terminate. */
   if (retval > 0
     && (fds.revents & POLLHUP))
   {
       syslog(LOG_NOTICE, "terminated on %s due to EOF", ttypath);
       exit(EXIT_SUCCESS);
   }
}

This works because the file descriptor's function-table gets replaced with hung_up_tty_fops as seen in https://github.com/torvalds/linux/blob/d1d3a0f7448fe038ce7e94e2c281dcd2f91b23c6/drivers/tty/tty_io.c#L722 . That always returns POLLHUP | POLLERR and the selected event bits (we don't care about any other), that's why we don't select them.

PS Shouldn't the global slcand_running be volatile?

muggenhor avatar Jan 14 '16 16:01 muggenhor

Generally a nice approach. Do you have an reference what happens to the incoming data stream, when this is 'attached' to a line discipline? I wonder if reading from the file descriptor 'fd' has any impact to the slcan input stream or if we can get the status of the file descriptor without reading singe bytes?!?

hartkopp avatar Jan 16 '16 18:01 hartkopp

I suppose just calling select might be enough, the extra read was just to confirm that there's really an EOF there.

Looking at [https://github.com/torvalds/linux/blob/d1d3a0f7448fe038ce7e94e2c281dcd2f91b23c6/drivers/tty/tty_io.c#L1072] it appears that read should fail with EIO when the line discipline is still attached (because slcan doesn't implement it, i.e. it's NULL).

muggenhor avatar Jan 18 '16 13:01 muggenhor

@hartkopp an alternative approach to finding the status of the descriptor without reading at all is by using poll. I'll update the example.

muggenhor avatar Jan 21 '16 13:01 muggenhor

Any update on this? This still poses as a issue for me

l0g1x avatar Oct 18 '16 20:10 l0g1x

Moreover, when I unplug USB device while sending CAN frames (huge bunch of CAN frames, so sending them requires polling), slcand goes to "uninterruptible sleep" state and I even cannot reboot Ubuntu from OS, only by pressing restart button.

eugesh avatar Mar 12 '20 19:03 eugesh

Did you apply the code which is suggested at the first post of this issue? I tried to create a huge bunch of CAN frames while unplugging the USB-serial adapter - and nothing crashed or freezed. I have a FTDI USB serial device here. Maybe this also has an impact on the problem.

hartkopp avatar Mar 21 '20 16:03 hartkopp

Yes dear, it seems like it helped. After applying the code from the first comment. Thank you! But I did not even reproduce the problem with "uninterrupted sleep" before applying it.

I also had a problem with vcan0 on some machines: CAN frame order confusion when sending many frames in flood traffic mode. But on other machines, everything is in order. But it is far away from this issue.

eugesh avatar Jun 26 '20 18:06 eugesh

I would love for this to be merged. It seems to solve my issue to allow me to create a more reliable system that automatically restarts on a connection loss. It seems to me like a clear issue that when the underlying serial device disappears that slcand does not react in any way.

Pleune avatar Feb 02 '21 21:02 Pleune

@Pleune I totally missed that issue. @muggenhor can you please generate a proper pull request for your patch? Thanks!

hartkopp avatar Feb 09 '21 15:02 hartkopp