thirtyfour icon indicating copy to clipboard operation
thirtyfour copied to clipboard

"WebDriver didn't wasn't quit properly" on main, but 0.35 works

Open BlinkyStitt opened this issue 10 months ago • 2 comments

I have some code working with thirtyfour v0.35.

I wanted to make some modifications to the repo and so I forked main, but then my app started behaving strangely. It doesn't ever exit. When I press [ctrl+c], I detect that and call web_driver.quit().await.ok(), but on main that throws this error now:

thread 'tokio-runtime-worker' panicked at /Users/bryan/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.43.0/src/runtime/time/entry.rs:568:9:
A Tokio 1.x context was found, but it is being shutdown.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I think the changes to change how drop worked have created other bugs.

I think this is the commit that breaks things: https://github.com/Vrtgs/thirtyfour/commit/6c6794185f9d99765ff67570a7b5f0f51a4dca6d

I reverted that in a local checkout and now the app at least exits instead of hangs. but it still does say "WebDriver didn't wasn't quit properly"

Calling quit does give an error. That is probably related:

Error: The WebDriver request returned an error: error sending request for url (http://localhost:9515/session/5062df7f7afbe1e4ee482b249635c133)

BlinkyStitt avatar Feb 02 '25 22:02 BlinkyStitt

will look into this, the new drop logic is a bit messy

vrtgs avatar Feb 03 '25 07:02 vrtgs

I think the problem might be related to having a chrome devtools object. That clones the session handle, so maybe the drop isn't running when expected?

I'll try manually dropping before calling quit.

BlinkyStitt avatar Feb 05 '25 06:02 BlinkyStitt

will look into this, the new drop logic is a bit messy

Notice the same thing.

The correct way to do "async drop" is to consider using the blocking mode of reqwest.

wtdcode avatar Jul 05 '25 16:07 wtdcode

will look into this, the new drop logic is a bit messy

Notice the same thing.

The correct way to do "async drop" is to consider using the blocking mode of reqwest.

I disagree. Read the reqwest source code. The blocking mode just starts an async runtime and calls block_on() internally.

stevepryde avatar Jul 05 '25 23:07 stevepryde

since this works in 0.35 I think this was due to previous versions basically deadlocking the process by using a request client that is still tied to the previous runtime, but I can't seem to reproduce this issue

vrtgs avatar Jul 06 '25 05:07 vrtgs

Ah, I remember that. It only blocked on the single threaded runtime. If you have more than one thread it would succeed. Is that the same issue?

stevepryde avatar Jul 06 '25 05:07 stevepryde

will look into this, the new drop logic is a bit messy

Notice the same thing. The correct way to do "async drop" is to consider using the blocking mode of reqwest.

I disagree. Read the reqwest source code. The blocking mode just starts an async runtime and calls block_on() internally.

That's still better than current solution. Generally, I don't think it is wise to send a request in Drop. My situation is even worse due to this design because the webdriver might exit before Drop and thus the Drop will stuck forever.

wtdcode avatar Jul 06 '25 06:07 wtdcode

@Vrtgs @stevepryde Reproduction here: https://github.com/wtdcode/thirtyfour_repro

wtdcode avatar Jul 06 '25 06:07 wtdcode

problem tracked, since you kill the webdriver the drop logic fails, and then falls into infinite tail recursion, fix is just to avoid dropping the session handle once again

please dont kill the webdriver BEFORE quitting

vrtgs avatar Jul 06 '25 07:07 vrtgs

problem tracked, since you kill the webdriver the drop logic fails, and then falls into infinite tail recursion, fix is just to avoid dropping the session handle once again

Cool, thanks for the quick fix.

please dont kill the webdriver BEFORE quitting

Again, I do think it is not right to make this assumptions (and thus making a request in Drop!). It does not look right to me that a request must be made to "drop" a struct since you even don't know that really works by propagating the errors.

wtdcode avatar Jul 06 '25 08:07 wtdcode

and thus making a request in Drop!

The only way to close the browser window is by making a request to the WebDriver server. Are you saying the library shouldn't attempt to close the browser window on Drop ? It's a highly requested feature.

Async Drop is not officially supported by Rust (yet? ever?) so any logic to support this behaviour is going to require some "creative solutions", shall we say.

Do you have another suggestion?

stevepryde avatar Jul 06 '25 08:07 stevepryde

and thus making a request in Drop!

The only way to close the browser window is by making a request to the WebDriver server. Are you saying the library shouldn't attempt to close the browser window on Drop ? It's a highly requested feature.

I'm exactly reading related sources, as I'm also annoyed by the absence of AsyncDrop for a while.

Async Drop is not officially supported by Rust (yet? ever?) so any logic to support this behaviour is going to require some "creative solutions", shall we say.

Looks like it is promising to be stabilized in near future: https://github.com/rust-lang/rust/issues/126482

Do you have another suggestion?

So from my top to low recommendation of the workarounds:

  1. Have a centralized background manager of all sessions.

When a session needs to be dropped, it can send a message via channels in blocking context to the manager and the manager could close the session. Since we are sending the message within the process, it can almost never fail and we can somehow propogate the errors.

We have implemented this pattern in: https://github.com/wtdcode/mdbx-remote/blob/master/crates/mdbx-remote/src/txn_manager.rs#L16 for your reference. I also saw some other crates using this to workaround.

But the drawback is that we have a background task running but users are not aware. This might be surprising to users.

  1. Use the experimental std::future::AsyncDrop

This is less preferred but I indeed saw some crates relying on this.

What do you think?


UPDATE1: Some further reading makes me think that AsyncDrop is not that promising for stabilization =(.

wtdcode avatar Jul 06 '25 08:07 wtdcode

Again, I do think it is not right to make this assumptions (and thus making a request in Drop!). It does not look right to me that a request must be made to "drop" a struct since you even don't know that really works by propagating the errors.

it is a very bad assumption, I agree, wan't an assumption I made, the bug comes from me doing a recursive drop that only terminates when the drop succeeds

vrtgs avatar Jul 06 '25 09:07 vrtgs

  1. Have a centralized background manager of all sessions.

When a session needs to be dropped, it can send a message via channels in blocking context to the manager and the manager could close the session. Since we are sending the message within the process, it can almost never fail and we can somehow propogate the errors.

We have implemented this pattern in: https://github.com/wtdcode/mdbx-remote/blob/master/crates/mdbx-remote/src/txn_manager.rs#L16 for your reference. I also saw some other crates using this to workaround.

But the drawback is that we have a background task running but users are not aware. This might be surprising to users.

I mean thats more or less kind equivlent to spawning a task in a global runtime (current workaround)

vrtgs avatar Jul 06 '25 09:07 vrtgs

  1. Have a centralized background manager of all sessions.

When a session needs to be dropped, it can send a message via channels in blocking context to the manager and the manager could close the session. Since we are sending the message within the process, it can almost never fail and we can somehow propogate the errors. We have implemented this pattern in: https://github.com/wtdcode/mdbx-remote/blob/master/crates/mdbx-remote/src/txn_manager.rs#L16 for your reference. I also saw some other crates using this to workaround. But the drawback is that we have a background task running but users are not aware. This might be surprising to users.

I mean thats more or less kind equivlent to spawning a task in a global runtime (current workaround)

I examine a bit more about the support.rs and it looks so but the implementation seems slightly flawed.

https://github.com/Vrtgs/thirtyfour/blob/99129f953719bf8a126be76ef523c117095a2359/thirtyfour/src/support.rs#L108-L116

Here spawn_blocking will return immediately. This effectively makes the Drop being concurrently with other Drop. This might break some other assumptions about the drop order. It should also block current thread somehow to maintain the correct semantics of Drop and other runtime types (like block_in_place).

For other parts, it looks fine for now.

wtdcode avatar Jul 06 '25 09:07 wtdcode

rx.recv().expect("spawned task should be able to be scheduled properly") Here spawn_blocking will return immediately. This effectively makes the Drop being concurrently with other Drop. This might break some other assumptions about the drop order. It should also block current thread somehow to maintain the correct semantics of Drop and other runtime types (like block_in_place).

For other parts, it looks fine for now.

blocking here seems pretty pointless though? I mean its a pretty major slowdown, and as long as the function is sCheduled it will run to completion that is what blocking code does

there is no guarentee when the webdrider instance is closed, its just that if the webdriver is dropped, it should attempt to close it

vrtgs avatar Jul 09 '25 07:07 vrtgs

rx.recv().expect("spawned task should be able to be scheduled properly") Here spawn_blocking will return immediately. This effectively makes the Drop being concurrently with other Drop. This might break some other assumptions about the drop order. It should also block current thread somehow to maintain the correct semantics of Drop and other runtime types (like block_in_place). For other parts, it looks fine for now.

blocking here seems pretty pointless though? I mean its a pretty major slowdown, and as long as the function is sCheduled it will run to completion that is what blocking code does

there is no guarentee when the webrider instance is closed, its just that if the webdriver is dropped, it should attempt to close it

I think it should blocking because it is blocking in other branches, no?

wtdcode avatar Jul 09 '25 07:07 wtdcode

I think it should blocking because it is blocking in other branches, no?

well, the reason it blocks in other branches is because it can, one case is its dropped outside a tokio contex, so yes it canblock, the other its in a call to drop_in_place and can block, but in this case its in a single threaded runtime, yes the code can block, it gets a new http client not bound to the previous runtime therefore it can 100% block, but my thinking was blocking a singl e threaded context is pretty expensive compared to the other branches, and even then, to block you still need to spawn a new thread, so the argument is really just, do we wait for the request to go through? or should you not, why so, why not?

vrtgs avatar Jul 09 '25 07:07 vrtgs

I think it should blocking because it is blocking in other branches, no?

well, the reason it blocks in other branches is because it can, one case is its dropped outside a tokio contex, so yes it canblock, the other its in a call to drop_in_place and can block, but in this case its in a single threaded runtime, yes the code can block, it gets a new http client not bound to the previous runtime therefore it can 100% block, but my thinking was blocking a singl e threaded context is pretty expensive compared to the other branches, and even then, to block you still need to spawn a new thread, so the argument is really just, do we wait for the request to go through? or should you not, why so, why not?

I fully agree with your last statement. My idea is that we should not wait for that requests since the error is impossible to propagate and we are doing our best efforts to release resources. Therefore, maybe it is more reasonable to non-block in all cases?

wtdcode avatar Jul 09 '25 07:07 wtdcode