End-to-End Tests on Generated Pipelines using Puppeteer
This PR implements an end-to-end testing suite that runs through generated and manual pipelines using Puppeteer.
@CodyCBakerPhD Before we can push forward on this reliably, we have to get the GIN testing data downloaded in the GitHub Actions workflow. I noticed how you're doing this in NeuroConv, but it wasn't clear how to specify where the datasets are downloaded to. For now, I've assumed that the E2E tests will find the GIN data at `~/NWB_GUIDE/test-data`, as it isn't clear how to pass this path dynamically using Vitest.
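For illustration, a minimal sketch of how that hardcoded assumption could be expressed in the suite (the constant names here are illustrative, not actual NWB GUIDE code):

```ts
// Hypothetical sketch: where the E2E suite assumes the GIN data lives.
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hardcoded location the tests expect (per the assumption above).
export const guideTestDataPath = join(homedir(), "NWB_GUIDE", "test-data");

// Lets data-dependent tests bail out early when the datasets are missing.
export const hasGinData = existsSync(guideTestDataPath);
```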
Can you clarify how we'd install the GIN data at this path for GitHub Actions to use?
> Can you clarify how we'd install the GIN data at this path for GitHub Actions to use?
The most reliable way would probably be to `cd` to the path where you want Vitest to find them, then run `aws s3 cp --recursive ${{ secrets.S3_GIN_BUCKET }}/behavior_testing_data ./behavior_testing_data` (since `.` is the current location). You can otherwise try to alter the second path passed to the CLI, but the AWS CLI in general can be finicky, so this might be more reliable.
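For illustration, the same cd-then-copy approach could also be wrapped in a Vitest `globalSetup` so the data is fetched before the suite runs. This is a sketch only; it assumes the AWS CLI is configured and the bucket is exposed to the process as an `S3_GIN_BUCKET` environment variable:

```ts
// Hypothetical Vitest globalSetup: fetch the GIN data into the expected location.
import { execSync } from "node:child_process";
import { mkdirSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

export default function setup() {
  const bucket = process.env.S3_GIN_BUCKET;
  if (!bucket) throw new Error("S3_GIN_BUCKET is not set");

  const dataDir = join(homedir(), "NWB_GUIDE", "test-data");
  mkdirSync(dataDir, { recursive: true });

  // cwd plays the role of `cd`, so `.` resolves inside test-data.
  execSync(`aws s3 cp --recursive ${bucket}/behavior_testing_data ./behavior_testing_data`, {
    cwd: dataDir,
    stdio: "inherit",
  });
}
```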
Though, is this just for the tutorial data? Did you want to add cases from the general data-based testing suite?
If it's tutorial-only, it might be worth finishing up #530 to simplify the issue altogether.
You mean this PR? https://github.com/NeurodataWithoutBorders/nwb-guide/pull/530
Either one is fine. Depends on how comprehensive we want to be (e.g. we could quite easily run through all the YAML pipelines automatically).
It would be great if all the tests worked for any user without any configuration (i.e. we have the data included somehow), but there are ways to seamlessly transition between the two states.
Ah, yes, #530
> It would be great if all the tests worked for any user without any configuration (i.e. we have the data included somehow)
That's pretty tough to do in a way that scales well (even in Actions we utilize caching) and is really interoperable.
The simplest approach is for us to require the data to exist, and the simplest way to make it exist is to use tools specifically built for data transfer.
#530 still requires that the testing data is on the user's computer, so we'll still have to install it in a default location on the CI.
You're right, though: we will want users to interact with a multi-subject, multi-session dataset for the tutorial, so it would be nice to have that merged. I'll use the generated dataset for our manual pipeline test.
> That's pretty tough to do in a way that scales well (even in Actions we utilize caching) and is really interoperable
Got it, thank you for the clarification. I'll use the approach you've defined above.
> still requires that the testing data is on the user's computer
What do you mean by 'testing data'? That PR does not require the GIN example data; it generates synthetic files, if those are the ones you're talking about. And since we're the ones making them, we have full control over where they get saved.
Oh, I see; I misread your changes. `base_path` is where the generated data will be output.
Do you want to run E2E tests using the testing data at all, or do you think that generating these files is sufficient, at least for now?
> Do you want to run E2E tests using the testing data at all, or do you think that generating these files is sufficient, at least for now?
I'll continue to think about it
Since this mainly came up as a way to automatically keep the tutorial screenshots up to date, I think focusing on that first, and leaving open the possibility of a full testing suite based on real data in the future, might be the best way to go.
Got it. Changed the name of this PR and will keep it in draft mode.
I'll work on #530 until merged, then continue to build out #582.
@garrettmflynn Is it possible to track coverage via the E2E tests? That could reduce the number of other tests we need to add to increase it.
Hmm, it looks possible but could be somewhat complicated. I'm having some issues figuring it out because of blocked tests after adding the `DANDI_CACHE` environment variable, but I'll try a few things and get back to you once we figure that out.
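For reference, Puppeteer itself exposes a Chromium coverage API. A minimal sketch of collecting JS coverage from an ordinary page (the URL and interaction steps are placeholders, not the actual app wiring):

```ts
import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

await page.coverage.startJSCoverage();
await page.goto("http://localhost:3000"); // placeholder URL for the app under test
// ...drive the pipeline UI here...
const entries = await page.coverage.stopJSCoverage();

// Report the fraction of each script's bytes that actually executed.
for (const { url, ranges, text } of entries) {
  const used = ranges.reduce((sum, { start, end }) => sum + end - start, 0);
  console.log(`${url}: ${((100 * used) / text.length).toFixed(1)}% used`);
}

await browser.close();
```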
@garrettmflynn Has this been replaced?
Technically no, though we have pretty much decided not to pursue this.
This particular PR prepares the E2E tests to run through all of the generated test pipelines, which is outside the scope of the tutorials but still possibly useful.
Do we want to approach this or just close this out?
I see
Hm...
I'm not against slower CI for the sake of more thorough testing, but to do so for every generated pipeline may be considered excessive...
Perhaps we just add an ophys one, and then resolve to add new tests, on a rolling basis, for particular generated pipelines that can reproduce any bugs that get reported?
Although, another thing we could presumably do is define a full battery of E2E tests but only run them on, say, a daily cron schedule, so they don't slow down or block PRs while we still get regular feedback on every generated testing pipeline.
Yeah, I was thinking this too, even just to have the full suite available for a local test.
Seems like both approaches are useful and on the roadmap. In that case, we should keep this open, and I'll adapt this PR for the discussed use cases.
The general workflow is complete, though there are now a few additional issues that have been uncovered. In summary:
- Suite2P + Caiman: Fluorescence data is not handled correctly
  - Likely just have to assert that additional properties are okay for this. Also must fix some rendering changes made recently
- ScanImage: `ValueError: Frame index (0) exceeds number of frames (0)` (full trace below)
- TDT: Executes fine, but I noticed that the Electrodes are not updated after the first run if you change the gain, likely because of how arrays are merged and prior values are cached
The interfaces affected by the first two issues have been excluded from the tests to demonstrate that the suite can pass.
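As an illustration of how that exclusion could look in the suite (all names here are hypothetical, not the actual `pipelines.tests.ts` code):

```ts
import { test } from "vitest";

// Hypothetical registry of generated pipeline definitions keyed by interface name.
declare const generatedPipelines: Record<string, unknown>;
// Hypothetical Puppeteer-driven runner that steps a pipeline through the app.
declare function runPipelineThroughApp(pipeline: unknown): Promise<void>;

// Interfaces skipped until the issues above are resolved.
const SKIPPED_INTERFACES = ["Suite2p", "Caiman", "ScanImage"];

for (const [name, pipeline] of Object.entries(generatedPipelines)) {
  const isSkipped = SKIPPED_INTERFACES.some((skipped) => name.includes(skipped));

  // test.skipIf keeps the excluded pipelines visible in the test report.
  test.skipIf(isSkipped)(`generated pipeline: ${name}`, async () => {
    await runPipelineThroughApp(pipeline);
  });
}
```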
Full ScanImage trace:

```
Traceback (most recent call last):
  File "/Users/garrettflynn/Documents/GitHub/nwb-guide/pyflask/apis/neuroconv.py", line 97, in post
    return get_metadata_schema(
  File "/Users/garrettflynn/Documents/GitHub/nwb-guide/pyflask/manageNeuroconv/manage_neuroconv.py", line 387, in get_metadata_schema
    converter = instantiate_custom_converter(resolved_source_data, interfaces)
  File "/Users/garrettflynn/Documents/GitHub/nwb-guide/pyflask/manageNeuroconv/manage_neuroconv.py", line 358, in instantiate_custom_converter
    return CustomNWBConverter(source_data)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/nwbconverter.py", line 79, in __init__
    self.data_interface_objects = {
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/nwbconverter.py", line 80, in <dictcomp>
    name: data_interface(**source_data[name])
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/datainterfaces/ophys/scanimage/scanimageimaginginterfaces.py", line 60, in __new__
    return ScanImageMultiPlaneImagingInterface(
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/datainterfaces/ophys/scanimage/scanimageimaginginterfaces.py", line 277, in __init__
    super().__init__(
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/datainterfaces/ophys/baseimagingextractorinterface.py", line 38, in __init__
    self.imaging_extractor = self.get_extractor()(**source_data)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 182, in __init__
    super().__init__(imaging_extractors=imaging_extractors)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/volumetricimagingextractor.py", line 28, in __init__
    self._check_consistency_between_imaging_extractors(imaging_extractors)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/volumetricimagingextractor.py", line 64, in _check_consistency_between_imaging_extractors
    values = [getattr(extractor, method)() for extractor in imaging_extractors]
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/volumetricimagingextractor.py", line 64, in <listcomp>
    values = [getattr(extractor, method)() for extractor in imaging_extractors]
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 441, in get_dtype
    return self.get_frames(0).dtype
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 336, in get_frames
    self.check_frame_inputs(frame_idxs[-1])
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 457, in check_frame_inputs
    raise ValueError(f"Frame index ({frame}) exceeds number of frames ({self._num_frames}).")
ValueError: Frame index (0) exceeds number of frames (0).
```
Temporarily fixing ScanImage by switching to `ophys_testing_data/imaging_datasets/ScanImage/sample_scanimage_version_3_8.tiff` instead of `scanimage_20220801_single.tif`.
@CodyCBakerPhD It also looks like tests are failing because both the original E2E tests (now `tutorial.tests.ts`) and the new `pipelines.tests.ts` file rely on an instance of the Electron app opening, and this isn't coordinated enough for both to work.
I can spend some time teasing out the problem and making sure we can run these using the general test command, but I thought I'd ask about the general testing approach we'd like to move forward with first.
Locally, I'm running these tests independently and have created new scripts in the `package.json` to handle this (`npm run test:pipelines` and `npm run test:tutorial`). Would we want to break these out into different CI workflows or different steps in the same workflow? Or should we continue optimizing for a single call to `npm run test`?
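For reference, one hypothetical shape for the split is a Vitest config per suite, mirrored for the tutorial tests (the include glob and options here are assumptions, not the actual repo config):

```ts
// vitest.pipelines.config.ts (hypothetical), invoked by `npm run test:pipelines`.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: ["**/pipelines.tests.ts"],
    // Run test files serially so only one Electron app instance is open at a time.
    fileParallelism: false,
  },
});
```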
Sounds like different CI workflows would be best then? That's the most modular and parallel approach.
@CodyCBakerPhD Should be working now! Will approach the TDT issue separately
@garrettmflynn CI does not like something in the dependencies at the moment, on any platform it seems.
Thanks for flagging! Should be fixed.
I was also able to knock out the TDT issue here. Turns out it was more related to these changes than I thought.
Let me know when all CI are passing
Looks like the docs have a build issue too.
Are all the tutorial changes intended in this PR? I had thought they were from a different branch
Looked over everything; just one small question about the form of an error message, and the rest looks good to go!
@garrettmflynn Thanks for the hard work on this, this is gonna be great for maintainability!