serialport-rs icon indicating copy to clipboard operation
serialport-rs copied to clipboard

Overlapped io

Open DoumanAsh opened this issue 2 years ago • 17 comments

Subj This is a similar request to original repo https://gitlab.com/susurrus/serialport-rs/-/merge_requests/91

I'll need to do some testing myself, but overall it is the same code as in that PR with just some minor improvements

DoumanAsh avatar Jun 21 '22 09:06 DoumanAsh

The message on the original MergeRequest.

I used this library on a Windows machine and found some strange behavior. My program runs two threads and each thread runs port.read() and port.write() simultaneously. (using port.try_clone()) Then port.read() sometimes has an extremely delayed response (about 3s~5s). Windows seems to not be able to read the same port simultaneously in other threads even if those are created by DuplicateHandle(). So I supported OVERLAPPED IO. https://docs.microsoft.com/en-us/windows/win32/sync/synchronization-and-overlapped-input-and-output With this PR, It turned to be working well on my project.

smihica avatar Jun 21 '22 09:06 smihica

Thanks for the PR. I don't use Windows but I will get a colleague to verify this as soon as possible so we can get it merged!

jessebraham avatar Jun 28 '22 15:06 jessebraham

I think switching to overlapped io on Windows is the right choice, but we need to be careful about (and probably expose) the COMMTIMEOUTs. It's confusing and painful to get the expected system behavior. For reference see my own struggles with tokio-serial:

https://github.com/berkowski/tokio-serial/issues/55

And also what may be the only correct description of COMMTIMEOUTs that exists:

https://gitlab.com/susurrus/serialport-rs/-/merge_requests/78#note_343695538

It's likely impossible to use a single timeout value that meets many use cases, so a think that we need:

  • A default timeout behavior that works in the 'least surprising' way
  • A way to change the timeouts to support better latency/filling the buffer/etc.

mlsvrts avatar Jun 28 '22 15:06 mlsvrts

Thanks for the PR. I don't use Windows but I will get a colleague to verify this as soon as possible so we can get it merged!

Ah thanks, I'm not sure I'm able to do comprehensive tests myself so it would be good to verify it by someone who is more well versed

DoumanAsh avatar Jun 28 '22 16:06 DoumanAsh

@mlsvrts I added set_timeouts native method that allows to set all timeouts, not only read timeout for your use case

DoumanAsh avatar Jul 01 '22 01:07 DoumanAsh

I'm not sure if this is expected or not, but when I use read_all() on a stream with this patch it panics:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: UnexpectedEof, message: "failed to fill whole buffer" }', src\main.rs:21:44
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library\core\src\panicking.rs:142
   2: core::result::unwrap_failed
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library\core\src\result.rs:1785
   3: enum$<core::result::Result<tuple$<>,std::io::error::Error>, 1, 18446744073709551615, Err>::unwrap<tuple$<>,std::io::error::Error>
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc\library\core\src\result.rs:1078
   4: mdi_tools::main::closure$0
             at .\src\main.rs:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

The error message of "failed to fill whole buffer" is confusing, since the purpose of .read_all() is to repetedly call .read() until the buffer is full. The code that's failing is:

    std::thread::spawn(move || {
        let mut buf = [0u8; 3];
        loop {
            println!("Reading data");
            read_half.read_exact(&mut buf).unwrap();
            println!("Got data: {:02x} {:02x} {:02x}", buf[0], buf[1], buf[2]);
        }
    });

xobs avatar Jul 13 '22 00:07 xobs

@xobs Read::read_exact default implementation fails if it cannot fill buffer completely If you set zero timeout, Read::read returns 0 when there is nothing to read. If you want to make sure to wait for data, you have to set timeout to wait for read to complete

Actually looking at code base, it doesn't seem to set FNDELAY on port there so maybe it should block indefinitely too (the problem though that if you pass too big of buffer, it will block forever until it fills buffer)

DoumanAsh avatar Jul 13 '22 01:07 DoumanAsh

My mistake, thanks. In that case, I suppose this patch is an improvement on the situation, since I get the error with this patchset. However, it doesn't seem to actually solve the problem.

If I set the timeout to something large, then the above code gets blocked at Reading data and never fnishes. Even if buf is a [0u8; 1].

However, if I open Tera Term and connect to that serial port after my Rust program closes, the buffer drains into the terminal window and I get all of the data that should have gone to my program.

This is in contrast to the current implementation which will just block forever.

xobs avatar Jul 13 '22 01:07 xobs

The patch goal is to enable proper split on windows. Without overlapping IO, cloning serial port would result in inability efficiently write/read from different threads

I don't exactly follow what is the issue you're encountering, API should block only as long as it cannot fill buffer, so if doesn't fill even [u8; 1] I assume there is no data 🤔

DoumanAsh avatar Jul 13 '22 01:07 DoumanAsh

My mistake -- the issue in my code was that on a Raspberry Pi Pico they don't send data until the DTR bit is asserted. Adding port.write_data_terminal_ready(true)?; solved my problem.

I suppose then the only question is whether it should be nonblocking by default. Structs like TcpStream are nonblocking, and will wait until data is available before returning. Aside from that, this patchset is now working for me.

xobs avatar Jul 13 '22 01:07 xobs

@xobs TcpStream by default is blocking actually AFAIK serial port behavior on windows is a bit different in a sense that it is always tries to fill buffer that you give it, instead of returning any data available. Unfortunately I do not know if it can be fixed somehow other than setting timeout

DoumanAsh avatar Jul 13 '22 01:07 DoumanAsh

@jessebraham Hey, just wanted to remind you about this PR I'm not in hurry though I just use my github repo for now

DoumanAsh avatar Aug 03 '22 05:08 DoumanAsh

Sorry, I've had some vacation time lately (and have also just been pretty busy in general) and sort of lost track of this conversation! I'll try to get to it this week.

jessebraham avatar Aug 09 '22 15:08 jessebraham

@xobs TcpStream by default is blocking actually AFAIK serial port behavior on windows is a bit different in a sense that it is always tries to fill buffer that you give it, instead of returning any data available. Unfortunately I do not know if it can be fixed somehow other than setting timeout

I think it would be possible to simulate blocking, but you'd probably need to use IOCP (I/O completion ports) and a thread which would fill a channel/queue of data from the serial port. You'd wait on this channel to be filled in order to simulate blocking reads.

This is because, as far as I understand things, with Windows overlapped I/O you're expected to give it a pool of threads and a handle to watch for I/O-- those threads are woken up and handed data once available, so it's pretty fundamentally different than typical non-blocking I/O on Linux (though io_uring on newer kernels provides similar features).

I guess the key point @DoumanAsh mentioned is that GetOverlappedResult isn't going signal completion until the whole buffer is filled or a timeout is reached?

If doing this "simulated blocking" then you could always set a timeout for the completion port and configure a small buffer size, re-queuing the read in the event of a timeout.

silverjam avatar Sep 19 '22 22:09 silverjam

@jessebraham kindly remind you about PR 😄

DoumanAsh avatar Oct 21 '22 02:10 DoumanAsh

@jessebraham

Okay, I've done some testing with this on a Windows 11 box with an FTDI USB->serial adapter and things seem to be working well. (See #69 for the loopback test I was running).

My only remaining note here is to make sure we version control this as a breaking change, as some code that worked before will stop working after this is merged (I've verified this can happen with the timeout behavior, but there may be other subtly broken scenarios).

Other than that, it looks good to me! Thanks @DoumanAsh!

mlsvrts avatar Nov 07 '22 00:11 mlsvrts

I rebased this PR. I would hope that it eventually gets merged, but if you do not need it please let me know I'll just close it

DoumanAsh avatar Feb 24 '23 07:02 DoumanAsh