webknossos icon indicating copy to clipboard operation
webknossos copied to clipboard

Add end to end tests for zarr streaming

Open MichaelBuessemeyer opened this issue 1 year ago • 3 comments

Detailed Description

Zarr streaming is a crucial feature for our pipelines. Thus, we should ensure that this feature is tested to avoid easily catchable mistakes breaking the streaming mechanism. Therefore, I suggest to write end to end tests testing all streaming routes, especially the ones that directly request an array

MichaelBuessemeyer avatar Aug 15 '24 09:08 MichaelBuessemeyer

This seems to be pretty complicated. Some thoughts:

End to end tests

  • End to end tests only interact with the webknossos API at the moment. There are no tests for the datastore. As I understand there is just no datastore when running e2e tests.
  • There are no datasets in the test. The database contains datasets but no data. So there would need to be a test dataset included in the setup. Not yet sure how to accomplish this.

That is why I investigated using the backend tests.

  • The dataset problematic remains
  • If we want to test the actual routes, we need to put the controller under test. This requires mocking a lot of services which may be quite a lot of effort.

So I think if we want to test the logic, it would be best to use a backend test that tests the content of the controllers and not the controller methods themselves. The necessity of including a dataset remains.

frcroth avatar Oct 09 '24 12:10 frcroth

My thoughts:

There are no datasets in the test. The database contains datasets but no data. So there would need to be a test dataset included in the setup. Not yet sure how to accomplish this.

What about using a remote dataset for this? This could be configured to be served by the datastore and then be streamed as zarr for some kind of snapshot tests :thinking:

MichaelBuessemeyer avatar Oct 10 '24 09:10 MichaelBuessemeyer

I think e2e tests would be nice here, in order to avoid all the mocking and to test what is actually ultimately used.

I’m not super familiar with how the e2e tests work, but it should be possible to launch a datastore module alongside the wk core backend module, and then adding it to the datastores postgres table. Of course, this may slow down the CI again, but if we want tests for this stuff, I think that’s what we need. Same for #8099

Including a small sample dataset in the repository seems acceptable to me. I’d rather avoid relying on remote data unless really necessary, because problems with the remote availability would make our tests flaky, and possibly slow. This is somewhat expected in unit tests like in the DataVaultTestSuite, but for outbound zarr streaming, I’d prefer to isolate that. What do you think?

Maybe @normanrz also has an opinion on these matters?

fm3 avatar Oct 11 '24 07:10 fm3