core
core copied to clipboard
Support async `dispose`.
⚠️ THIS IS WIP ⚠️ Looking for feedback to this approach. 😄
It looks like there started out work to implement async dispose as seen in LinkedList
as if the expected result of dispose()
was a Promise
:
while (x !== null) {
promises.push(x.dispose())
x = x.next
}
return Promise.all(promises)
Unfortunately this behaviour exists nowhere else in the codebase and dispose()
is called immediately everywhere. This is problematic if one wishes to use most
to do resource management and the results carry several streams deep. For example:
const selenium = multicast(switchLatest(combineArray((seleniumInstall) => {
return newStream((sink, scheduler) => {
const port = 4444;
let child;
const options = {
...seleniumInstall.options,
seleniumArgs: ['-port', port],
};
return asap({
run: (t) => {
if (child) {
sink.event(t, child);
return;
}
start(options, (err, process) => {
if (err) {
sink.error(t, err);
} else {
process.port = port;
process.url = `http://localhost:${port}/wd/hub`;
process.capabilities = {};
child = process;
sink.event(t, process);
}
});
},
dispose: () => {
return kill(child);
},
error: (t, e) => {
child = null;
sink.error(t, e);
},
}, scheduler);
});
}, [
seleniumInstall,
])));
const leadfoot = switchLatest(combineArray((selenium) => {
const server = new Server(selenium.url);
server.capabilities = selenium.capabilities;
return now(server);
}, [
selenium,
]));
const session = switchLatest(combineArray((leadfoot) => {
return newStream((sink, scheduler) => {
let session = null;
return asap({
run: (t) => {
session = leadfoot.createSession({
...leadfoot.capabilities,
browserName: 'chrome',
}).then(
(s) => {
session = s;
sink.event(t, session);
},
(err) => {
sink.error(t, err);
}
);
},
dispose: () => {
return session.quit();
},
error: (t, e) => {
sink.error(t, e);
},
}, scheduler);
});
}, [
leadfoot,
]));
const test1 = awaitPromises(combineArray(async (session) => {
await session.get('http://www.apple.com');
const title = await session.getPageTitle();
expect(title).to.contain('Apple');
}, [
session,
]));
Previously the test1
stream would end and the selenium server would be killed immediately because its dispose()
method would be called without waiting for the dipose()
of session
.
The changes here are pretty redumentary and need some cleanup, but it allows one to return a Promise
from dispose()
and wait for the result to finish before continuing to dispose of resources.
TODO:
- [ ] Where can we use
disposeParallel
instead ofdisposeSerial
? - [ ] Is
Promise.all
inLinkedList
actually desirable? Errors are not collected the same way as they are indisposeAll
. - [ ] Using
most
for resource management is maybe a contrived but interesting use case – I'm sure I'm doing something non-optimally here to begin with. - [x] Fix tests.
These changes are not breaking – if nothing in your dispose()
chain is async then no promises are used and errors get thrown as before. I'm not sure if two pathways is the best (maybe always use promises?) but the current version means no changes for consumers.
Hey @izaakschroeder. Thanks for opening this.
I see commits rolling in as I type this, so I wanted to get a message out to you quickly: let's discuss this before you put more energy into it.
Originally, most
allowed async disposal, but we moved away from it for reasons we can discuss. I'd like to see if there are ways we can help with your selenium shutdown use case without reintroducing async disposal.
@briancavalier I thought about this a bit before too. I thought maybe I could track the open sessions and then just Promise.all()
on quitting them but that depends on leadfoot
and my preference would be not to have the selenium
task tracking leadfoot
information. Open to ideas though.
I'm assuming you moved away from async disposal for perf reasons?
@briancavalier Even if you don't want async disposal back, I think there are a few commits in here you can use anyway. The meat of the async disposal code is in 748c95083b57351d3aff24f7f38cedcf8e2b851a and is maybe 30 lines worth of changes. I think disposeAll
should be renamed to createMultiDisposer
or something and then have disposeAll
actually do what it says – be the n-ary version of dispose
. Anyway, just my 2¢.
Hi @izaakschroeder. Yeah, there were performance implications, but the biggest reason was that use cases for async dispose proved to be extremely rare in practice. Originally, I had thought it would be a common thing, but it turned out that in practice it wasn't. In fact, this may be the only one I've ever run into! So, we decided to remove the async stuff (we missed LinkedList, oops!), and it simplified a lot of code.
Thanks for giving some sample code. I'm not sure I fully understand all of it yet, but maybe we can try to boil it down to a few primary concerns. Are they something along the lines of:
- Ensuring that when a the selenium or session streams end normally, that you can properly shutdown selenium, and
- Ensuring that if an error happens, you can properly shutdown selenium and the error is propagated through the stream
Am I on the right track?
@briancavalier Yeah. I mean this is just me playing with streams for integration tests right now. I've seen some very heavy lifting by people doing stuff like re-running tests, restarting selenium, etc. and it's all been rather messy. Seeing if streams were the answer. My thought process was:
- A service/resource (e.g. selenium) is a stream that emits a single value when ready and then nothing until it fails or ends. If it fails one can use stream error recovery to emit a new value that represents a new instance of the service ad infinitum.
- Resources/services can depend on one another. Basically a
map
orcombine
which I would use without so much wrapping except there's no way todispose
via otherwise. - Tests are depend on resources/services. When resources/services are no longer needed they are disposed of – presumably this is the case when all tests have emitted their results or have failed in some way.
In my example here testX
depends on session
depends on selenium
. When testX
ends then session
should be disposed of, when session
is completely disposed then selenium
should be disposed too.
Thanks for feedback so far 😄
The changes I've made in this PR are enough to cause that "serial" effect to happen and the result is selenium
is disposed of correctly. If no one is using async disposers then the new code pathway is never hit, no promises are created and there shouldn't be a big perf issue? 🤞
Thanks for the extra info about the use case and the dependencies :+1:. I should have time tomorrow to take a closer look and give this a bit more thought.
Awesome thanks! 🙏
Hey @izaakschroeder, a few more questions after looking at the example:
- It looks like
start
probably spawns a new selenium server process. Is that right? If so, what's the relationship between that process and the sessions created byleadfoot.createSession
--are they 1-to-1, or 1-to-many? - Can you point me to documentation for the selenium libs you're using? Specifically, I'd like to look at docs for createSession and session.quit.
Have you thought about other ways to approach this? For example, have you considered explicitly using streams to represent the lifecycle of the seleniums server and session, instead of using the dispose chain? That is, triggering shutdown using events rather than disposal? It's something that came up while the core team was discussing your example.
Another interesting approach might be to embed one stream graph into another, i.e. via higher order streams. I'm imagining an outer stream that manages the creation, and later shutdown, of the server and session, and then that maps the session to an inner stream of tests. I'm probably not fully understanding how your vision for how tests might work, but maybe that will give you some ideas.
Hey @briancavalier thanks for the update 😄
- Yes the relation is 1-to-many. As in 1 selenium server, many sessions. It's a bit clunky, but currently it's 1 new session per 1 new sink. Ideally I'd like to have it such that if the test changes or needs to be re-run then a new session is spawned for that test. So for the tests it would be one-to-many: one test has many sessions, depending on how many times the test has to be run (e.g. if it needs to be retried from failure).
- Sure the docs are here: https://theintern.io/leadfoot/module-leadfoot_Server.html#createSession.
I'm not the most experienced with streams, I'm totally open to other approaches. One could use something other than the dispose chain but the big deal is that the dispose chain knows when the resource is no longer used, which is really what I'm after – not having to do ref counting myself.
There are other sub-processes that need to be involved in this too like spinning up a mock API server for each test in addition to opening a selenium session. I just kind of want one way to have my resources freed when they're no longer needed and dispose
seemed basically perfect for that. If you have an example of your alternative approach I'd love to see it though.
Hey @izaakschroeder, sorry I haven't replied. I've been focused on getting @most/core
1.0 ready for release. I'll have a bit more brain cycles once it's released.
No rush @briancavalier thanks for the feedback so far.
@briancavalier any updates? 😄
Hey @izaakschroeder, sorry for the late reply yet again. I appreciate your patience :) I've been very busy at work since the beginning of the year, and haven't had much time to spend on open source.
I just read through all the comments again, and am looking at the changes to refresh my memory. I'll bring the issue up with the core team for discussion this week.
One concern I remember having is about types. It looks like even though the JavaScript implementation can be made compatible with @most/core
1.x, I'm not sure yet if the Flow and TS type defs could be, since they currently say that Disposables must return void
. Switching to ?Promise<void>
might work out, but we'd want to make sure. Have you given any thought to that?