[Misc][Docs]: GCP and Kubernetes Terraform Deployment Modules
Pull Request Description
- Added Terraform modules to spin up the AIBrix stack on GCP.
- Separated into a core GCP module and a Kubernetes module to promote reuse across clouds. This should make the next module (AWS) much easier.
- Included a full end-to-end testing script, which allocates the entire stack from nothing, performs a request against the model endpoint using the OpenAI client (similar to the core module e2e test), and then destroys all resources. This test is slow (~30m), so I do not think it should necessarily be run against PRs to main, but it may be useful in guarding against regressions by running against the nightly build. Open to suggestions.
- Included docs for the quickstart and for running the e2e test within the GCP module.
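The allocate, test, destroy flow of the e2e script can be sketched roughly as follows. This is a minimal illustration, not the script included in this PR: the name terraform_stack, the workdir argument, and the injectable run parameter are hypothetical, and the request against the model endpoint is left as a placeholder comment.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def terraform_stack(workdir, run=subprocess.run):
    """Provision the stack on entry and always attempt to destroy it on exit.

    `run` is injectable (an assumption made for this sketch) so the flow can
    be exercised without calling the real terraform binary.
    """
    # If init fails, nothing has been created, so no destroy is needed.
    run(["terraform", "init", "-input=false"], cwd=workdir, check=True)
    try:
        run(["terraform", "apply", "-auto-approve", "-input=false"],
            cwd=workdir, check=True)
        yield
    finally:
        # Runs even if apply or the test body raised, so a failed run
        # does not leave billable resources behind.
        run(["terraform", "destroy", "-auto-approve", "-input=false"],
            cwd=workdir, check=True)


# Usage (hypothetical module path):
# with terraform_stack("path/to/gcp/module"):
#     ...  # send a request to the model endpoint and assert on the response
```

Putting destroy in a finally block matters for a test this slow and this expensive: a failed assertion should still tear the stack down.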
Instructions for validating the PR HERE. I don't enjoy pushing such large PRs, but in this case it was necessary to build the functionality. Hopefully the instructions help speed up the validation process.
Let me know if you all think anything is missing or have suggestions/questions. Happy to help!
Related Issues
Resolves: #742
Important: Before submitting, please complete the description above and review the checklist below.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
- [Bug]: Corrections to existing functionality
- [CI]: Changes to build process or CI pipeline
- [Docs]: Updates or additions to documentation
- [API]: Modifications to aibrix's API or interface
- [CLI]: Changes or additions to the Command Line Interface
- [Misc]: For changes not covered above (use sparingly)
Note: For changes spanning multiple categories, use multiple prefixes in order of importance.
Submission Checklist
- [ ] PR title includes appropriate prefix(es)
- [ ] Changes are clearly explained in the PR description
- [ ] New and existing tests pass successfully
- [ ] Code adheres to project style and best practices
- [ ] Documentation updated to reflect changes (if applicable)
- [ ] Thorough testing completed, no regressions introduced
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.
Finally able to finish this up. @Jeffwan, give it a try and let me know what you think.
@jolfr Thanks a lot for this work. I need some time to verify it and will get back to you soon.
Sorry for the delay. I was busy with some internal work and only just got time to test it.
- terraform init works fine
- terraform plan shows some errors
Seems to be a node group issue; let me change it.
It was my configuration issue: I incorrectly set default_region to us-central1-c, which is a zone rather than a region. Now it works with:
project_id = "..."
default_region = "us-central1"
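A region-versus-zone mix-up like this could be caught before running terraform plan. The following is a rough sketch, assuming GCP's usual naming pattern in which a zone is a region name plus a letter suffix (us-central1 versus us-central1-c). The helper name looks_like_region and the regular expression are illustrative assumptions, not part of the modules in this PR.

```python
import re

# Assumed pattern: "<area>-<direction><number>", e.g. "us-central1".
# A zone adds a "-<letter>" suffix, e.g. "us-central1-c", and is rejected.
REGION_RE = re.compile(r"[a-z]+-[a-z]+[0-9]+")


def looks_like_region(value: str) -> bool:
    """Return True if value has the shape of a GCP region name.

    This checks the shape only; it does not verify that the region exists.
    """
    return REGION_RE.fullmatch(value) is not None
```

With this heuristic, looks_like_region("us-central1") passes, while the zone string "us-central1-c" is rejected.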
I can launch the cluster successfully but am not able to move on to the next steps.
It seems this was still an issue with my settings. I cleaned it up and reran the apply, and it works!
@jolfr Everything works perfectly. Thanks for the contribution. This is really awesome!