core icon indicating copy to clipboard operation
core copied to clipboard

Support async `dispose`.

Open izaakschroeder opened this issue 6 years ago • 15 comments

⚠️ THIS IS WIP ⚠️ Looking for feedback to this approach. 😄

It looks like there started out work to implement async dispose as seen in LinkedList as if the expected result of dispose() was a Promise:

while (x !== null) {
  promises.push(x.dispose())
  x = x.next
}

return Promise.all(promises)

Unfortunately this behaviour exists nowhere else in the codebase and dispose() is called immediately everywhere. This is problematic if one wishes to use most to do resource management and the results carry several streams deep. For example:

const selenium = multicast(switchLatest(combineArray((seleniumInstall) => {
  return newStream((sink, scheduler) => {
    const port = 4444;
    let child;
    const options = {
      ...seleniumInstall.options,
      seleniumArgs: ['-port', port],
    };
    return asap({
      run: (t) => {
        if (child) {
          sink.event(t, child);
          return;
        }
        start(options, (err, process) => {
          if (err) {
            sink.error(t, err);
          } else {
            process.port = port;
            process.url = `http://localhost:${port}/wd/hub`;
            process.capabilities = {};
            child = process;
            sink.event(t, process);
          }
        });
      },
      dispose: () => {
        return kill(child);
      },
      error: (t, e) => {
        child = null;
        sink.error(t, e);
      },
    }, scheduler);
  });
}, [
  seleniumInstall,
])));

const leadfoot = switchLatest(combineArray((selenium) => {
  const server = new Server(selenium.url);
  server.capabilities = selenium.capabilities;
  return now(server);
}, [
  selenium,
]));

const session = switchLatest(combineArray((leadfoot) => {
  return newStream((sink, scheduler) => {
    let session = null;
    return asap({
      run: (t) => {
        session = leadfoot.createSession({
          ...leadfoot.capabilities,
          browserName: 'chrome',
        }).then(
          (s) => {
            session = s;
            sink.event(t, session);
          },
          (err) => {
            sink.error(t, err);
          }
        );
      },
      dispose: () => {
        return session.quit();
      },
      error: (t, e) => {
        sink.error(t, e);
      },
    }, scheduler);
  });
}, [
  leadfoot,
]));

const test1 = awaitPromises(combineArray(async (session) => {
  await session.get('http://www.apple.com');
  const title = await session.getPageTitle();
  expect(title).to.contain('Apple');
}, [
  session,
]));

Previously the test1 stream would end and the selenium server would be killed immediately because its dispose() method would be called without waiting for the dipose() of session.

The changes here are pretty redumentary and need some cleanup, but it allows one to return a Promise from dispose() and wait for the result to finish before continuing to dispose of resources.

TODO:

  • [ ] Where can we use disposeParallel instead of disposeSerial?
  • [ ] Is Promise.all in LinkedList actually desirable? Errors are not collected the same way as they are in disposeAll.
  • [ ] Using most for resource management is maybe a contrived but interesting use case – I'm sure I'm doing something non-optimally here to begin with.
  • [x] Fix tests.

These changes are not breaking – if nothing in your dispose() chain is async then no promises are used and errors get thrown as before. I'm not sure if two pathways is the best (maybe always use promises?) but the current version means no changes for consumers.

izaakschroeder avatar Oct 30 '17 23:10 izaakschroeder

Hey @izaakschroeder. Thanks for opening this.

I see commits rolling in as I type this, so I wanted to get a message out to you quickly: let's discuss this before you put more energy into it.

Originally, most allowed async disposal, but we moved away from it for reasons we can discuss. I'd like to see if there are ways we can help with your selenium shutdown use case without reintroducing async disposal.

briancavalier avatar Oct 31 '17 00:10 briancavalier

@briancavalier I thought about this a bit before too. I thought maybe I could track the open sessions and then just Promise.all() on quitting them but that depends on leadfoot and my preference would be not to have the selenium task tracking leadfoot information. Open to ideas though.

izaakschroeder avatar Oct 31 '17 01:10 izaakschroeder

I'm assuming you moved away from async disposal for perf reasons?

izaakschroeder avatar Oct 31 '17 01:10 izaakschroeder

@briancavalier Even if you don't want async disposal back, I think there are a few commits in here you can use anyway. The meat of the async disposal code is in 748c95083b57351d3aff24f7f38cedcf8e2b851a and is maybe 30 lines worth of changes. I think disposeAll should be renamed to createMultiDisposer or something and then have disposeAll actually do what it says – be the n-ary version of dispose. Anyway, just my 2¢.

izaakschroeder avatar Oct 31 '17 01:10 izaakschroeder

Hi @izaakschroeder. Yeah, there were performance implications, but the biggest reason was that use cases for async dispose proved to be extremely rare in practice. Originally, I had thought it would be a common thing, but it turned out that in practice it wasn't. In fact, this may be the only one I've ever run into! So, we decided to remove the async stuff (we missed LinkedList, oops!), and it simplified a lot of code.

Thanks for giving some sample code. I'm not sure I fully understand all of it yet, but maybe we can try to boil it down to a few primary concerns. Are they something along the lines of:

  1. Ensuring that when a the selenium or session streams end normally, that you can properly shutdown selenium, and
  2. Ensuring that if an error happens, you can properly shutdown selenium and the error is propagated through the stream

Am I on the right track?

briancavalier avatar Oct 31 '17 01:10 briancavalier

@briancavalier Yeah. I mean this is just me playing with streams for integration tests right now. I've seen some very heavy lifting by people doing stuff like re-running tests, restarting selenium, etc. and it's all been rather messy. Seeing if streams were the answer. My thought process was:

  • A service/resource (e.g. selenium) is a stream that emits a single value when ready and then nothing until it fails or ends. If it fails one can use stream error recovery to emit a new value that represents a new instance of the service ad infinitum.
  • Resources/services can depend on one another. Basically a map or combine which I would use without so much wrapping except there's no way to dispose via otherwise.
  • Tests are depend on resources/services. When resources/services are no longer needed they are disposed of – presumably this is the case when all tests have emitted their results or have failed in some way.

In my example here testX depends on session depends on selenium. When testX ends then session should be disposed of, when session is completely disposed then selenium should be disposed too.

Thanks for feedback so far 😄

izaakschroeder avatar Oct 31 '17 01:10 izaakschroeder

The changes I've made in this PR are enough to cause that "serial" effect to happen and the result is selenium is disposed of correctly. If no one is using async disposers then the new code pathway is never hit, no promises are created and there shouldn't be a big perf issue? 🤞

izaakschroeder avatar Oct 31 '17 01:10 izaakschroeder

Thanks for the extra info about the use case and the dependencies :+1:. I should have time tomorrow to take a closer look and give this a bit more thought.

briancavalier avatar Oct 31 '17 01:10 briancavalier

Awesome thanks! 🙏

izaakschroeder avatar Oct 31 '17 01:10 izaakschroeder

Hey @izaakschroeder, a few more questions after looking at the example:

  1. It looks like start probably spawns a new selenium server process. Is that right? If so, what's the relationship between that process and the sessions created by leadfoot.createSession--are they 1-to-1, or 1-to-many?
  2. Can you point me to documentation for the selenium libs you're using? Specifically, I'd like to look at docs for createSession and session.quit.

Have you thought about other ways to approach this? For example, have you considered explicitly using streams to represent the lifecycle of the seleniums server and session, instead of using the dispose chain? That is, triggering shutdown using events rather than disposal? It's something that came up while the core team was discussing your example.

Another interesting approach might be to embed one stream graph into another, i.e. via higher order streams. I'm imagining an outer stream that manages the creation, and later shutdown, of the server and session, and then that maps the session to an inner stream of tests. I'm probably not fully understanding how your vision for how tests might work, but maybe that will give you some ideas.

briancavalier avatar Nov 01 '17 01:11 briancavalier

Hey @briancavalier thanks for the update 😄

  1. Yes the relation is 1-to-many. As in 1 selenium server, many sessions. It's a bit clunky, but currently it's 1 new session per 1 new sink. Ideally I'd like to have it such that if the test changes or needs to be re-run then a new session is spawned for that test. So for the tests it would be one-to-many: one test has many sessions, depending on how many times the test has to be run (e.g. if it needs to be retried from failure).
  2. Sure the docs are here: https://theintern.io/leadfoot/module-leadfoot_Server.html#createSession.

I'm not the most experienced with streams, I'm totally open to other approaches. One could use something other than the dispose chain but the big deal is that the dispose chain knows when the resource is no longer used, which is really what I'm after – not having to do ref counting myself.

There are other sub-processes that need to be involved in this too like spinning up a mock API server for each test in addition to opening a selenium session. I just kind of want one way to have my resources freed when they're no longer needed and dispose seemed basically perfect for that. If you have an example of your alternative approach I'd love to see it though.

izaakschroeder avatar Nov 01 '17 03:11 izaakschroeder

Hey @izaakschroeder, sorry I haven't replied. I've been focused on getting @most/core 1.0 ready for release. I'll have a bit more brain cycles once it's released.

briancavalier avatar Nov 06 '17 01:11 briancavalier

No rush @briancavalier thanks for the feedback so far.

izaakschroeder avatar Nov 10 '17 11:11 izaakschroeder

@briancavalier any updates? 😄

izaakschroeder avatar Jan 22 '18 23:01 izaakschroeder

Hey @izaakschroeder, sorry for the late reply yet again. I appreciate your patience :) I've been very busy at work since the beginning of the year, and haven't had much time to spend on open source.

I just read through all the comments again, and am looking at the changes to refresh my memory. I'll bring the issue up with the core team for discussion this week.

One concern I remember having is about types. It looks like even though the JavaScript implementation can be made compatible with @most/core 1.x, I'm not sure yet if the Flow and TS type defs could be, since they currently say that Disposables must return void. Switching to ?Promise<void> might work out, but we'd want to make sure. Have you given any thought to that?

briancavalier avatar Feb 01 '18 13:02 briancavalier