[Misc][Docs]: GCP and Kubernetes Terraform Deployment Modules

Open jolfr opened this issue 9 months ago • 2 comments

Pull Request Description

  • Added Terraform modules to spin up the AIBrix stack on GCP.
  • Separated these into a core GCP module and a Kubernetes module to promote reuse across clouds. The next module (AWS) will be much easier.
  • Included a full end-to-end testing script, which allocates the entire stack from nothing, performs a request against the model endpoint using the OpenAI client (similar to the core module e2e test), and then destroys all resources. This test is slow (~30m), so I do not think it should necessarily be run against PRs to main, but it may be useful for guarding against regressions if run against the nightly build. Open to suggestions.
  • Included docs for the quickstart and for running the e2e test within the GCP module.
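
For readers skimming the thread, the allocate / request / destroy lifecycle described above could look roughly like the following. This is only a sketch under assumptions, not the script shipped in this PR: `run_e2e`, `run_command`, and the `check_endpoint` callback are hypothetical names, and the actual request against the model endpoint (made with the OpenAI client in the PR) is left to the caller.

```python
import subprocess
from typing import Callable, Sequence

# A runner executes one CLI command; injectable so the flow can be dry-run.
Runner = Callable[[Sequence[str]], None]


def run_command(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(list(cmd), check=True)


def run_e2e(check_endpoint: Callable[[], None], run: Runner = run_command) -> None:
    """Allocate the stack, check the endpoint, and always tear the stack down.

    `check_endpoint` is a hypothetical callback that should raise if the
    model endpoint does not answer as expected.
    """
    run(["terraform", "init", "-input=false"])
    try:
        run(["terraform", "apply", "-auto-approve", "-input=false"])
        check_endpoint()
    finally:
        # Destroy even when apply or the check fails, so that a failed run
        # does not leave billable resources behind.
        run(["terraform", "destroy", "-auto-approve", "-input=false"])
```

Putting the destroy step in `finally` is the main design choice here: a slow test that leaks resources on failure is more costly than one that simply fails.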

Instructions for validating the PR HERE. I don't enjoy pushing such large PRs, but in this case it was necessary to build the functionality. Hopefully the instructions help speed up the validation process.

Let me know if you all think anything is missing or have suggestions/questions. Happy to help!

Related Issues

Resolves: #742

Important: Before submitting, please complete the description above and review the checklist below.


Contribution Guidelines

We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [API]: Modifications to aibrix's API or interface
  • [CLI]: Changes or additions to the Command Line Interface
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.

Submission Checklist

  • [ ] PR title includes appropriate prefix(es)
  • [ ] Changes are clearly explained in the PR description
  • [ ] New and existing tests pass successfully
  • [ ] Code adheres to project style and best practices
  • [ ] Documentation updated to reflect changes (if applicable)
  • [ ] Thorough testing completed, no regressions introduced

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

jolfr avatar Mar 07 '25 23:03 jolfr

Finally able to finish this up. @Jeffwan, give it a try and let me know what you think.

jolfr avatar Mar 07 '25 23:03 jolfr

@jolfr Thanks a lot for this work. I need some time to verify it, and I will get back to you soon.

Jeffwan avatar Mar 11 '25 13:03 Jeffwan

Sorry for the delay. I was busy with some internal work and only just got time to test it.

  1. terraform init works fine.

  2. terraform plan shows some errors. It seems to be a node group issue; let me change it.


This was my configuration issue: I incorrectly set default_region to us-central1-c, which is a zone rather than a region. Now it works with:

project_id     = "..."
default_region = "us-central1"
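
A small pre-flight check could catch this kind of region/zone mix-up before running terraform plan. The sketch below is a heuristic only: it assumes the usual GCP naming convention in which a zone is a region name plus a single-letter suffix (for example us-central1-c within us-central1). The function name `region_from_location` is hypothetical and not part of the modules in this PR.

```python
import re

# Heuristic: a zone looks like "<region>-<letter>", and the region itself
# ends in digits (e.g. "us-central1"). This is an assumed naming pattern,
# not an authoritative list of GCP locations.
_ZONE_PATTERN = re.compile(r"^(?P<region>[a-z]+(?:-[a-z]+)*\d+)-[a-z]$")


def region_from_location(location: str) -> str:
    """Return the region part if `location` looks like a zone, else return it unchanged."""
    match = _ZONE_PATTERN.match(location)
    return match.group("region") if match else location


if __name__ == "__main__":
    value = "us-central1-c"
    region = region_from_location(value)
    if region != value:
        print(f"default_region looks like a zone; did you mean {region!r}?")
```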

Jeffwan avatar Mar 24 '25 23:03 Jeffwan

I can launch the cluster successfully, but I am not able to move on to the next steps.

This still seems to have been an issue with my settings. I cleaned it up and reran the apply, and it works!

Jeffwan avatar Mar 25 '25 00:03 Jeffwan

@jolfr Everything works perfectly. Thanks for the contribution. This is really awesome!

Jeffwan avatar Mar 25 '25 01:03 Jeffwan