End-to-End Tests on Generated Pipelines using Puppeteer
This PR implements an end-to-end testing suite that runs through generated and manual pipelines using Puppeteer.
@CodyCBakerPhD Before we can push forward on this reliably, we have to get the GIN testing data downloaded in the GitHub Actions workflow. I noticed how you're doing this in NeuroConv, but it wasn't clear how to specify where the datasets are downloaded to. For now, I've assumed that the E2E tests will find the GIN data at `~/NWB_GUIDE/test-data`, as it isn't clear how to pass this path dynamically using Vitest.
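For illustration, a minimal sketch of how that hardcoded assumption could be expressed in the suite (the constant names here are illustrative, not actual NWB GUIDE code):

```ts
// Hypothetical sketch: where the E2E suite assumes the GIN data lives.
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hardcoded location the tests expect (per the assumption above).
export const guideTestDataPath = join(homedir(), "NWB_GUIDE", "test-data");

// Lets data-dependent tests bail out early when the datasets are missing.
export const hasGinData = existsSync(guideTestDataPath);
```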
Can you clarify how we'd install the GIN data at this path for GitHub Actions to use?
> Can you clarify how we'd install the GIN data at this path for GitHub Actions to use?
The most reliable way would probably be to `cd` to the path where you want Vitest to find them, then run `aws s3 cp --recursive ${{ secrets.S3_GIN_BUCKET }}/behavior_testing_data ./behavior_testing_data` (since `.` is the current location). You can otherwise try to alter the second path passed to the CLI, but the AWS CLI in general can be finicky, so this might be more reliable.
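For illustration, the same cd-then-copy approach could also be wrapped in a Vitest `globalSetup` so the data is fetched before the suite runs. This is a sketch only; it assumes the AWS CLI is configured and the bucket is exposed to the process as an `S3_GIN_BUCKET` environment variable:

```ts
// Hypothetical Vitest globalSetup: fetch the GIN data into the expected location.
import { execSync } from "node:child_process";
import { mkdirSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

export default function setup() {
  const bucket = process.env.S3_GIN_BUCKET;
  if (!bucket) throw new Error("S3_GIN_BUCKET is not set");

  const dataDir = join(homedir(), "NWB_GUIDE", "test-data");
  mkdirSync(dataDir, { recursive: true });

  // cwd plays the role of `cd`, so `.` resolves inside test-data.
  execSync(`aws s3 cp --recursive ${bucket}/behavior_testing_data ./behavior_testing_data`, {
    cwd: dataDir,
    stdio: "inherit",
  });
}
```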
Though, is this just for the tutorial data? Did you want to add cases from the general data-based testing suite?
If it's tutorial-only, it might be worth finishing up #530 to simplify the issue altogether.
You mean this PR? https://github.com/NeurodataWithoutBorders/nwb-guide/pull/530
Either one is fine. Depends on how comprehensive we want to be (e.g. we could quite easily run through all the YAML pipelines automatically).
It would be great if all the tests worked for any user without any configuration (i.e. we have the data included somehow), but there are ways to seamlessly transition between the two states.
Ah, yes, #530
> It would be great if all the tests worked for any user without any configuration (i.e. we have the data included somehow)
That's pretty tough to do in a way that scales well (even in Actions we utilize caching) and is really interoperable.
The simplest approach is for us to require the data to exist, and the simplest way to make it exist is to use tools specifically built for data transfer.
#530 still requires that the testing data is on the user's computer, so we'll still have to install it in a default location on the CI.
You're right, though: we will want users to interact with a multi-subject, multi-session dataset for the tutorial, so it would be nice to have that merged. I'll use the generated dataset for our manual pipeline test.
> That's pretty tough to do in a way that scales well (even in Actions we utilize caching) and is really interoperable
Got it, thank you for the clarification. I'll use the approach you've defined above.
> still requires that the testing data is on the user's computer
What do you mean by 'testing data'? That PR does not require the GIN example data; it generates synthetic files, if those are the ones you're talking about. And since we're the ones making them, we have full control over where they get saved.
Oh, I see; I misread your changes. `base_path` is where the generated data will be output.
Do you want to run E2E tests using the testing data at all, or do you think that generating these files is sufficient, at least for now?
> Do you want to run E2E tests using the testing data at all, or do you think that generating these files is sufficient, at least for now?
I'll continue to think about it
Since this mainly came up as a way to automatically keep the tutorial screenshots up to date, I think focusing on that first, and leaving open the possibility of a full testing suite based on real data in the future, might be the best way to go.
Got it. Changed the name of this PR and will keep it in draft mode.
I'll work on #530 until merged, then continue to build out #582.
@garrettmflynn Is it possible to track coverage via the E2E tests? That could reduce the number of other tests we need to add to increase it.
Hmm, it looks possible but could be somewhat complicated. I'm having some issues figuring it out because of blocked tests after adding the `DANDI_CACHE` environment variable, but I'll try a few things and get back to you once we figure that out.
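For reference, Puppeteer itself exposes a Chromium coverage API. A minimal sketch of collecting JS coverage from an ordinary page (the URL and interaction steps are placeholders, not the actual app wiring):

```ts
import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

await page.coverage.startJSCoverage();
await page.goto("http://localhost:3000"); // placeholder URL for the app under test
// ...drive the pipeline UI here...
const entries = await page.coverage.stopJSCoverage();

// Report the fraction of each script's bytes that actually executed.
for (const { url, ranges, text } of entries) {
  const used = ranges.reduce((sum, { start, end }) => sum + end - start, 0);
  console.log(`${url}: ${((100 * used) / text.length).toFixed(1)}% used`);
}

await browser.close();
```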
@garrettmflynn Has this been replaced?
Technically no, though we have pretty much decided not to pursue this.
This particular PR prepares the E2E tests to run through all of the generated test pipelines, which is outside the scope of the tutorials but still possibly useful.
Do we want to approach this or just close this out?
I see
Hm...
I'm not against slower CI for the sake of more thorough testing, but to do so for every generated pipeline may be considered excessive...
Perhaps we just add an ophys one, and then resolve to add new tests, on a rolling basis, for particular generated pipelines that can reproduce any bugs that get reported?
Although, another thing we could presumably do is define a full battery of E2E tests but only run them on, say, a daily cron schedule, so they don't slow down or block PRs while we still get regular feedback on every generated testing pipeline.
Yeah, I was thinking this too, even just to have the full suite available for a local test.
Seems like both approaches are useful and on the roadmap. In that case, we should keep this open, and I'll adapt this PR for the discussed use cases.
The general workflow is complete, though there are now a few additional issues that have been uncovered. In summary:
- Suite2P + Caiman: Fluorescence data is not handled correctly
  - Likely just have to assert that additional properties are okay for this. Also must fix some rendering changes made recently
- ScanImage: `ValueError: Frame index (0) exceeds number of frames (0)` (full trace below)
- TDT: Executes fine, but I noticed that the Electrodes are not updated after the first run if you change the gain, likely because of how arrays are merged and prior values are cached
The interfaces affected by the first two issues have been excluded from the tests to demonstrate that the suite can pass.
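As an illustration of how that exclusion could look in the suite (all names here are hypothetical, not the actual `pipelines.tests.ts` code):

```ts
import { test } from "vitest";

// Hypothetical registry of generated pipeline definitions keyed by interface name.
declare const generatedPipelines: Record<string, unknown>;
// Hypothetical Puppeteer-driven runner that steps a pipeline through the app.
declare function runPipelineThroughApp(pipeline: unknown): Promise<void>;

// Interfaces skipped until the issues above are resolved.
const SKIPPED_INTERFACES = ["Suite2p", "Caiman", "ScanImage"];

for (const [name, pipeline] of Object.entries(generatedPipelines)) {
  const isSkipped = SKIPPED_INTERFACES.some((skipped) => name.includes(skipped));

  // test.skipIf keeps the excluded pipelines visible in the test report.
  test.skipIf(isSkipped)(`generated pipeline: ${name}`, async () => {
    await runPipelineThroughApp(pipeline);
  });
}
```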
Full ScanImage trace:

```
Traceback (most recent call last):
  File "/Users/garrettflynn/Documents/GitHub/nwb-guide/pyflask/apis/neuroconv.py", line 97, in post
    return get_metadata_schema(
  File "/Users/garrettflynn/Documents/GitHub/nwb-guide/pyflask/manageNeuroconv/manage_neuroconv.py", line 387, in get_metadata_schema
    converter = instantiate_custom_converter(resolved_source_data, interfaces)
  File "/Users/garrettflynn/Documents/GitHub/nwb-guide/pyflask/manageNeuroconv/manage_neuroconv.py", line 358, in instantiate_custom_converter
    return CustomNWBConverter(source_data)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/nwbconverter.py", line 79, in __init__
    self.data_interface_objects = {
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/nwbconverter.py", line 80, in <dictcomp>
    name: data_interface(**source_data[name])
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/datainterfaces/ophys/scanimage/scanimageimaginginterfaces.py", line 60, in __new__
    return ScanImageMultiPlaneImagingInterface(
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/datainterfaces/ophys/scanimage/scanimageimaginginterfaces.py", line 277, in __init__
    super().__init__(
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/neuroconv/datainterfaces/ophys/baseimagingextractorinterface.py", line 38, in __init__
    self.imaging_extractor = self.get_extractor()(**source_data)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 182, in __init__
    super().__init__(imaging_extractors=imaging_extractors)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/volumetricimagingextractor.py", line 28, in __init__
    self._check_consistency_between_imaging_extractors(imaging_extractors)
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/volumetricimagingextractor.py", line 64, in _check_consistency_between_imaging_extractors
    values = [getattr(extractor, method)() for extractor in imaging_extractors]
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/volumetricimagingextractor.py", line 64, in <listcomp>
    values = [getattr(extractor, method)() for extractor in imaging_extractors]
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 441, in get_dtype
    return self.get_frames(0).dtype
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 336, in get_frames
    self.check_frame_inputs(frame_idxs[-1])
  File "/Users/garrettflynn/miniconda3/envs/nwb-guide/lib/python3.9/site-packages/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py", line 457, in check_frame_inputs
    raise ValueError(f"Frame index ({frame}) exceeds number of frames ({self._num_frames}).")
ValueError: Frame index (0) exceeds number of frames (0).
```
Temporarily fixing ScanImage by switching to `ophys_testing_data/imaging_datasets/ScanImage/sample_scanimage_version_3_8.tiff` instead of `scanimage_20220801_single.tif`.
@CodyCBakerPhD It also looks like tests are failing because both the original E2E tests (now `tutorial.tests.ts`) and the new `pipelines.tests.ts` file rely on an instance of the Electron app opening, and this isn't coordinated enough for both to work.
I can spend some time teasing out the problem and making sure we can run these using the general test command, but I thought I'd ask about the general testing approach we'd like to move forward with first.
Locally, I'm running these tests independently and have created new scripts in the `package.json` to handle this (`npm run test:pipelines` and `npm run test:tutorial`). Would we want to break these out into different CI workflows or different steps in the same workflow? Or should we continue optimizing for a single call to `npm run test`?
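For reference, one hypothetical shape for the split is a Vitest config per suite, mirrored for the tutorial tests (the include glob and options here are assumptions, not the actual repo config):

```ts
// vitest.pipelines.config.ts (hypothetical), invoked by `npm run test:pipelines`.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: ["**/pipelines.tests.ts"],
    // Run test files serially so only one Electron app instance is open at a time.
    fileParallelism: false,
  },
});
```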
Sounds like different CI workflows would be best then? That's the most modular and parallel approach.
@CodyCBakerPhD Should be working now! Will approach the TDT issue separately
@garrettmflynn CI does not like something in the dependencies at the moment, on any platform it seems.
Thanks for flagging! Should be fixed.
I was also able to knock out the TDT issue here. Turns out it was more related to these changes than I thought.
Let me know when all CI are passing
Looks like the docs have a build issue too.
Are all the tutorial changes intended in this PR? I had thought they were from a different branch
Looked over everything; just one small question about the form of an error message, and the rest looks good to go!
@garrettmflynn Thanks for the hard work on this, this is gonna be great for maintainability!