gradio
fix-tests
Description
Please include a concise summary, in clear English, of the changes in this pull request. If it closes an issue, please mention it here.
Closes: #(issue)
🎯 PRs Should Target Issues
Before you create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.
Not adhering to this guideline will result in the PR being closed.
Tests
- PRs will only be merged if tests pass on CI. To run the tests locally, please set up your Gradio environment locally and run the tests: `bash scripts/run_all_tests.sh`
- You may need to run the linters: `bash scripts/format_backend.sh` and `bash scripts/format_frontend.sh`
🪼 branch checks and previews
| • | Name | Status | URL |
|---|---|---|---|
| | Spaces | Spaces preview | |
| | Website | Website preview | |
| | Storybook | Details | |
| 🦄 | Changes | | |
Install Gradio from this PR
pip install https://gradio-builds.s3.amazonaws.com/cc76deb08dc62e6a90e978340839f0be4a9d83b2/gradio-4.19.2-py3-none-any.whl
Install Gradio Python Client from this PR
pip install "gradio-client @ git+https://github.com/gradio-app/gradio@cc76deb08dc62e6a90e978340839f0be4a9d83b2#subdirectory=client/python"
🦄 change detected
This Pull Request includes changes to the following packages.
| Package | Version |
|---|---|
| `@gradio/app` | `patch` |
| `@gradio/lite` | `patch` |
| `@gradio/tootils` | `patch` |
| `@gradio/wasm` | `patch` |
| `gradio` | `patch` |
- [ ] Maintainers can select this checkbox to manually select packages to update.
With the following changelog entry.
fix-tests
Maintainers or the PR author can modify the PR title to modify this entry.
Something isn't right?
- Maintainers can change the version label to modify the version bump.
- If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.
I think Lite's E2E tests should be run on a separate CI job because the functional tests take a long time and randomly fail.
> I think Lite's E2E tests should be run on a separate CI job because the functional tests take a long time and randomly fail.
This is something we should fix rather than adding complexity to the CI to work around it. I am aware of that failing test and am looking into it.
That's correct about the random failures, but parallel execution is the right solution for long-running tests in general, not a workaround. It's not complex.
Anyway, the test passed after several retries 👍.
The job failed because the `do check` step fails, as it refers to the `changes` job, which is temporarily removed in this PR.
> That's correct about the random failures, but parallel execution is the right solution for long-running tests in general, not a workaround. It's not complex.
Resources are finite; executing them in parallel is only a surefire win if you have spare resources, and we don't. We have to wait for runners to become available before additional CI runs across the repo can start. So we might gain some time if all of the CI starts at the same time, because each run will be slightly shorter (the gain will be pretty slight). However, our overall CI time will be longer because we will be duplicating expensive work: cloning and installing is a substantial chunk of our CI time.
These tests also belong together functionally: they are all functional tests, and it is easier to look in a single place for failures than to jump across workflows. If the setup time were faster it might be worth considering, but until that is the case it doesn't really have any benefits and it makes debugging CI failures more difficult.
> Resources are finite

I see; that makes sense.
> each run will be slightly shorter (the gain will be pretty slight)
This part is not fair. The most time-consuming steps are installing deps, the normal browser tests, and the Lite browser tests; each takes about 1-2 minutes, and together they dominate the job (see https://github.com/gradio-app/gradio/actions/runs/7988974242/job/21819294081). Running the two browser-test suites in parallel would take roughly one of those 1-2 minute blocks off the critical path, so it could reduce the overall time by roughly 20-30%, which is not slight. I admit the other parts of your argument, about per-job vs. global resource efficiency, make sense.
> These tests also belong together functionally
I know; I was thinking of the parallel execution as multiple jobs in a single workflow. Or is there a policy of keeping the number of jobs in each workflow to one? I don't think it introduces much complexity - it's at the usual level, and at least less complex than what already exists in the current workflows. A rough sketch of what I mean is below.
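For illustration only, here is a minimal sketch of "multiple jobs in a single workflow". The workflow name, job names, setup step, and the Lite test command are hypothetical placeholders rather than Gradio's actual CI configuration; only `scripts/run_all_tests.sh` is taken from the instructions above.

```yaml
# Hypothetical sketch: one workflow, two jobs that run in parallel.
# Job names, the setup step, and the Lite command are placeholders.
name: functional-tests

on:
  pull_request:

jobs:
  functional:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up environment            # placeholder for the real setup steps
        run: python -m pip install -e .
      - name: Run functional tests
        run: bash scripts/run_all_tests.sh

  lite-e2e:
    runs-on: ubuntu-latest                  # runs in parallel with `functional`
    steps:
      - uses: actions/checkout@v4
      - name: Set up environment            # setup work is duplicated per job
        run: python -m pip install -e .
      - name: Run Lite E2E tests
        run: pnpm --filter @gradio/lite test   # placeholder for the Lite E2E command
```

Note that each job repeats checkout and dependency installation, which is exactly the duplicated setup cost mentioned above, so this only pays off when spare runners are available to execute both jobs at once.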
Pushed commit 32154f3 to trigger the CI using the changes in this PR. I will revert it after confirming it works.
Update: Reverted. It worked well (https://github.com/gradio-app/gradio/actions/runs/7991559044/job/21823007428).
@pngwn Ok, done.
If it's OK, please revert 95b2ee6 before merging.
Thanks!