chore: test AMP autocast and gradient scaling [DET-7885]

Open drh-determined-ai opened this issue 2 years ago • 4 comments

Description

Test Plan

Commentary (optional)

Checklist

  • [ ] User-facing API changes need the "User-facing API Change" label.
  • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
  • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
  • [ ] If modifying /webui/react/src/shared/ verify make -C webui/react test-shared passes.

drh-determined-ai avatar Aug 02 '22 21:08 drh-determined-ai

Deploy Preview for determined-ui ready!

Name Link
Latest commit 3cd036f960637ff8eb3e7a960be684aecb883e1d
Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/630940b472e18700084bf457
Deploy Preview https://deploy-preview-4702--determined-ui.netlify.app/

To edit notification comments on pull requests, go to your Netlify site settings.

netlify[bot] avatar Aug 02 '22 21:08 netlify[bot]

I can't comment on the file directly, but remember to remove the stray swap file harness/tests/experiment/fixtures/.pytorch_onevar_model.py.swp.

azhou-determined avatar Aug 08 '22 20:08 azhou-determined

Anyone know anything about why the package-and-push-system-local job in test-e2e is failing? I haven't encountered any problems with this check before, and I don't know what I could have changed to cause it...

drh-determined-ai avatar Aug 10 '22 15:08 drh-determined-ai

Curiously, TestPyTorchTrial::test_custom_eval is failing on my GCP machine (but not on my laptop or with CircleCI). The assertion original["loss"] == custom_eval["loss"] fails because the two values differ on the order of 1e-7. We could replace exact comparisons like this with isclose or similar calls, but without understanding the issue better I think that'd be very premature. Since things are OK on CircleCI, I'm OK with putting a pin in this. I wanted to make a note of it now, though.
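For reference, a tolerance-based comparison of the kind mentioned above could look roughly like the sketch below. It is only an illustration, not a proposed change to the test: the helper name `losses_match`, the `abs_tol` of 1e-6, and the loss values are all assumptions made up for the example.

```python
import math


def losses_match(expected: float, actual: float, abs_tol: float = 1e-6) -> bool:
    """Return True when two loss values agree within an absolute tolerance.

    The default tolerance is an assumption chosen to sit just above the
    ~1e-7 discrepancy described in the comment; it is not a vetted value.
    """
    return math.isclose(expected, actual, rel_tol=0.0, abs_tol=abs_tol)


# Hypothetical metric dicts; the values are invented to differ by about 1e-7.
original = {"loss": 0.2500000}
custom_eval = {"loss": 0.2500001}

# Exact equality fails on a difference this small ...
exact_equal = original["loss"] == custom_eval["loss"]

# ... while the tolerance-based check accepts it.
close_enough = losses_match(original["loss"], custom_eval["loss"])
```

The trade-off is the one already raised: a tolerance makes the symptom go away without explaining why one machine produces a different value.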

drh-determined-ai avatar Aug 10 '22 20:08 drh-determined-ai

Deploy Preview for storybook-det ready!

Name Link
Latest commit 3cd036f960637ff8eb3e7a960be684aecb883e1d
Latest deploy log https://app.netlify.com/sites/storybook-det/deploys/630940b4477a3900088ce2eb
Deploy Preview https://deploy-preview-4702--storybook-det.netlify.app/

netlify[bot] avatar Aug 26 '22 14:08 netlify[bot]

I want to highlight the use of Apex's optimization level O2. Although O1 is recommended for typical use, it is not appropriate in these kinds of tests because it patches torch functions, and those patches stay in place and break tests that run afterwards. Thank goodness for version control!
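To make the failure mode concrete, here is a generic sketch of how a process-wide patch leaks into whatever runs next. It does not use Apex or torch and does not reproduce what O1 actually does; `fake_lib`, `patch_globally`, `patched_temporarily`, and `later_test` are hypothetical names invented for the illustration.

```python
import types
from contextlib import contextmanager

# Stand-in for a library namespace; an assumption for illustration, NOT torch.
fake_lib = types.SimpleNamespace(add=lambda a, b: a + b)


def _lossy(fn):
    """Wrap fn so its result is rounded to one decimal (a stand-in for a precision-changing patch)."""
    return lambda a, b: round(fn(a, b), 1)


def patch_globally():
    """Replace fake_lib.add for the rest of the process, with nothing to undo it."""
    fake_lib.add = _lossy(fake_lib.add)


@contextmanager
def patched_temporarily():
    """Apply the same patch, but restore the original function on exit."""
    original_add = fake_lib.add
    fake_lib.add = _lossy(original_add)
    try:
        yield
    finally:
        fake_lib.add = original_add


def later_test():
    """An unrelated 'test' that expects full-precision addition."""
    return fake_lib.add(0.123, 0.0)


# Scoped patch: affects only the code inside the with-block.
with patched_temporarily():
    scoped_inside = later_test()
scoped_after = later_test()

# Global patch: every later caller sees the patched behaviour.
patch_globally()
leaked = later_test()
```

Here `scoped_after` is still the full-precision result, while `leaked` shows the later "test" silently picking up the patched behaviour.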

I also want to remind us that these (and other) tests that depend on GPU(s) are not being run by CircleCI. I let the bug just mentioned slip by for so long because I had forgotten this!

drh-determined-ai avatar Aug 26 '22 19:08 drh-determined-ai