nebari icon indicating copy to clipboard operation
nebari copied to clipboard

Address issue with AWS instance type schema

Open viniciusdc opened this issue 1 year ago • 3 comments

Reference Issues or PRs

closes #2782

What does this implement/fix?

Put a x in the boxes that apply

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds a feature)
  • [ ] Breaking change (fix or feature that would cause existing features not to work as expected)
  • [ ] Documentation Update
  • [ ] Code style update (formatting, renaming)
  • [ ] Refactoring (no functional changes, no API changes)
  • [ ] Build related changes
  • [ ] Other (please describe):

Testing

  • [ ] Did you test the pull request locally?
  • [ ] Did you add new tests?

How to test this PR?

As the previous error happened only during terraform plan you will need to mock the deployment of nebari, I suggest mofiying the source code to only plan the code instead of deploying (this should work for the second stage, since there are no pre-variable depency AFAIK) -- stage 1 can be skipped by using the state local instead of remote.

  • Init an AWS deployment, e.g, nebari init aws --project testing-gpu
  • add a GPU node block in your config, I have an example in the linked issue;
  • with the changes mentioned above, trigger nebari deploy so that it only plans the stage,
  • You need to validate that the actual instance type is respected when the GPU block is passed, i.e, if the node group does not have gpu: enabled you should see AL2_X86_64 if gpu is passed, then its variant should show up ``AL2_X86_64_GPU`
  • As a final validation, make sure to check the node_template block and test if by passing an AMI the intance_type also changes to CUSTOM.

Any other comments?

I made a deployment on AWS from this branch and GPUs were working as expected

viniciusdc avatar Oct 22 '24 15:10 viniciusdc

@viniciusdc can you fill out the how to test section here?

dcmcand avatar Oct 24 '24 13:10 dcmcand

done. Thanks for catching it!

viniciusdc avatar Oct 24 '24 14:10 viniciusdc

@dcmcand This will address the issue you saw. Could you have another look? (Let me know if you see any other pydantic errors; they might show up again.)

viniciusdc avatar Oct 25 '24 15:10 viniciusdc