nwb-guide icon indicating copy to clipboard operation
nwb-guide copied to clipboard

TQDM Progress Demo for NWB Inspector

Open garrettmflynn opened this issue 1 year ago • 24 comments

This PR demonstrates working TQDM progress updates from the NWB Inspector when using editable installations of https://github.com/NeurodataWithoutBorders/nwbinspector/pull/440 and https://github.com/catalystneuro/tqdm_publisher.

garrettmflynn avatar Mar 14 '24 22:03 garrettmflynn

Can you add a similar selector to the dandi upload for specifying the number of jobs? (no multi-threading, just # CPU)

CodyCBakerPhD avatar Apr 17 '24 14:04 CodyCBakerPhD

This would significantly change the user flow for this page, so we should probably align on this beforehand.

At present, this request is sent automatically when arriving on the Guided Inspect page if you haven't generated the report already or if the data has changed since you last generated.

Adding options would require that we interrupt this automatic flow, requiring users to manually submit their request for validation (e.g. button press). My gut tells me to render the configuration options if the validation requires (re)generation (using the same criteria above); the user would then be blocked from proceeding until they successfully generated a report.

Is this an agreeable solution?

garrettmflynn avatar Apr 17 '24 14:04 garrettmflynn

We can't/shouldn't assume they want to use every available CPU

I just mean have an advanced/configuration option accessible somehow on the page, the default can maybe be 2 jobs to start with? It's not a required field since we can make some assumptions on defaults so it shouldn't block anything

CodyCBakerPhD avatar Apr 17 '24 15:04 CodyCBakerPhD

Gotcha. Mostly want confirmation as to whether we want to interrupt the current auto-run workflow to allow them to specify these options on the first run, or whether we'd just want them to be able to re-run with a different job count.

garrettmflynn avatar Apr 17 '24 15:04 garrettmflynn

I'm not sure I understand

To run an inspection I need to press the 'Start Inspection' button

Screenshot 2024-04-17 at 11 18 56 AM

(maybe I should have clarified this is for the standalone page - I guess for the pipeline it can be a passive configuration, maybe a global settings for job options that applies to both Inspection and DANDI upload with manual override still possible on the DANDI page to avoid that joblib issue?)

CodyCBakerPhD avatar Apr 17 '24 15:04 CodyCBakerPhD

Ah gotcha. Yeah this is easy

garrettmflynn avatar Apr 17 '24 15:04 garrettmflynn

Updated for the standalone page. Since the default for DANDI is 1 but the default for inspection should likely be max, I've held off on that global value for now.

garrettmflynn avatar Apr 17 '24 16:04 garrettmflynn

Updated the styling of the small bars. Should be ready for review!

garrettmflynn avatar Apr 22 '24 17:04 garrettmflynn

Looks great! There might be a small problem in the future with how the new bars auto-populate on the bottom and push the window towards the top (when we have a large number of items to iterate) but we can work on that in the future

What remains prior to allowing merge here is either specifying the dev branch installations in all the conda environments here on GUIDE, or otherwise waiting to merge this until both TQDM Publisher (which needs more coverage) and NWB Inspector (which is waiting on PR merges) have new releases cut

CodyCBakerPhD avatar Apr 22 '24 19:04 CodyCBakerPhD

Would you prefer one of these approaches over the other?

The only reason I'd want to get this in sooner would be to move onto other progress bars so we can cross this off the Roadmap—though we could also just achieve that through additional PRs using this as a base.

garrettmflynn avatar Apr 22 '24 19:04 garrettmflynn

The first would be a rapid hack, so feel free to include it if you just want to get this merged

The second would be the more official/proper way but would take more work

CodyCBakerPhD avatar Apr 22 '24 19:04 CodyCBakerPhD

@garrettmflynn I believe this is ready to go now right? All packages released and I updated environments here

CodyCBakerPhD avatar May 06 '24 18:05 CodyCBakerPhD

Perfect. Yes, then this should be ready!

garrettmflynn avatar May 06 '24 18:05 garrettmflynn

@garrettmflynn Huh, strange non-linux errors in e2e tests - any idea?

CodyCBakerPhD avatar May 06 '24 19:05 CodyCBakerPhD

Works fine locally. Looks like a stall on the full conversion.

Will just have to trigger the tests again

garrettmflynn avatar May 06 '24 20:05 garrettmflynn

Rerunning isn't solving it after several tries - not sure what would have changed, but maybe the timeout needs to be increased?

CodyCBakerPhD avatar May 06 '24 21:05 CodyCBakerPhD

Hmm, that is odd. Thanks for letting me know!

I'll mess around some more and see what's going on.

garrettmflynn avatar May 06 '24 21:05 garrettmflynn

Still not getting anywhere with this. Seems like it could have to do with this error: https://github.com/puppeteer/puppeteer/issues/10144.

I'll go through this thoroughly tomorrow and see if I can replicate locally. @CodyCBakerPhD Any chance this branch fails for you?

garrettmflynn avatar May 06 '24 23:05 garrettmflynn

I'll go through this thoroughly tomorrow and see if I can replicate locally. @CodyCBakerPhD Any chance this branch fails for you?

I haven't tried locally - my main concern is all about the CI - is there any way to pin back the version to what definitely worked as of commit f4ced13?

CodyCBakerPhD avatar May 07 '24 00:05 CodyCBakerPhD

Note that the only thing that changed from that commit on (aside from updates from main) was my update of the environment: https://github.com/NeurodataWithoutBorders/nwb-guide/pull/676/commits/53bc3a23efc5c2cfee62cb7f6444217103ba17b5

Would that have changed anything? I can also try clearing the caches

CodyCBakerPhD avatar May 07 '24 00:05 CodyCBakerPhD

Good point. Would be worth clearing just to check.

Not sure about pinning

garrettmflynn avatar May 07 '24 00:05 garrettmflynn

Clearing caches did not resolve

Looks like there was a new release of puppeteer 18 hours ago - but the lock file should keep it at whatever the previously working state was right?

Have you tried locally in a fresh environment?

CodyCBakerPhD avatar May 07 '24 01:05 CodyCBakerPhD

That was gonna be my next step tomorrow. Will let you know how it goes!

garrettmflynn avatar May 07 '24 01:05 garrettmflynn

New Python environment maintains the expected behavior. How odd...and other branches work just fine

garrettmflynn avatar May 07 '24 16:05 garrettmflynn

D'you know any way that we can inspect output files from a workflow? Would be useful to inspect the application's behavior through screenshots—since, while the tests were "passing", it didn't have tqdm_publisher specified as a dependency before. So there was likely an error that didn't interrupt the general flow of the application, whereas now we have something more disruptive at hand.

garrettmflynn avatar May 08 '24 15:05 garrettmflynn

D'you know any way that we can inspect output files from a workflow?

Like you mentioned in meeting, upload content saved to a file in CI as an artifact so it 'leaves' the CI, then we download it locally and look at it manually

CodyCBakerPhD avatar May 08 '24 17:05 CodyCBakerPhD

Have not done this in forever however; we used to use artifacts to the same effect as caches (3-4 years ago) but then GitHub added caches and then replaced it completely

CodyCBakerPhD avatar May 08 '24 17:05 CodyCBakerPhD

Gotcha, artifacts were really helpful to validate the problem here. inspect-page preview-page Screenshot 2024-05-08 at 11 02 59 AM

Essentially, the request to generate an inspector report and the attendant progress updates are not completed before the E2E tests move to the Neurosift page. This results in a timeout error on that page, making it difficult to determine the source if this issue.

To fix this, we just need to wait longer on the Inspect page.

Previously, we didn't check whether the report was properly generated—so the previous version of the PR where the environment had no tqdm_publisher installed would "pass". I've added a new test so that this will fail appropriately in the CI but still allow users of the GUIDE to advance if they encounter issues with the inspector report generation.

garrettmflynn avatar May 08 '24 18:05 garrettmflynn

Awesome, thanks for pinning that down

Looks like Windows and Mac-Intel might need to wait even longer? https://github.com/NeurodataWithoutBorders/nwb-guide/actions/runs/9006768849/job/24745047411?pr=676#step:10:3448

CodyCBakerPhD avatar May 08 '24 21:05 CodyCBakerPhD

Had to update how waiting for the NWB Inspector report is handled in https://github.com/NeurodataWithoutBorders/nwb-guide/pull/580. That should fix this when merged.

garrettmflynn avatar May 13 '24 20:05 garrettmflynn