Add Support for Google Cloud A4 and A4X Machine Types
Summary
This PR adds support for the newly released Google Cloud Compute Engine machine types A4 and A4X, and adds GPU name mappings for the NVIDIA B200, GB200, and H200 models. The B200 and GB200 belong to A4 and A4X; the H200 is used by A3 Ultra.
Motivation
Google Cloud has recently announced the A4 and A4X machine series featuring the latest NVIDIA Blackwell GPU architecture. These new accelerator-optimized machine types are designed for foundation model training and serving, representing a significant advancement in AI/ML compute capabilities.
Reference: https://cloud.google.com/compute/docs/gpus/
Changes Made
New Machine Type Configurations
1. A4 Machine Series (instances/series/a4.sql)
- Family: Accelerator-optimized
- GPU: NVIDIA B200 Blackwell GPUs
- CPU Platform: Sapphire Rapids
- Local SSD: 12,000 GiB
- Network Bandwidth: 3,600 Gbps
- Spot VM Support: Enabled
- Machine Type: a4-highgpu-8g
  - 224 vCPUs
  - 3,968 GB memory
  - 8x NVIDIA B200 GPUs (1,440 GB total GPU memory)
2. A4X Machine Series (instances/series/a4x.sql)
- Family: Accelerator-optimized
- GPU: NVIDIA GB200 Grace Blackwell Superchips
- CPU Platform: ARM Neoverse V2
- Local SSD: 12,000 GiB
- Network Bandwidth: 2,000 Gbps
- ARM Architecture: Supported
- Spot VM Support: Enabled
- Machine Type: a4x-highgpu-4g
  - 140 vCPUs
  - 884 GB memory
  - 4x NVIDIA GB200 GPUs (720 GB total GPU memory)
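As a quick sanity check on the figures above, the following sketch records both machine types and derives the per-GPU memory from the totals. The `MachineType` class and its field names are hypothetical and exist only for illustration; they are not the schema used by the SQL files in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineType:
    """Hypothetical container for the values listed in this PR description."""
    name: str
    vcpus: int
    memory_gb: int
    gpu_count: int
    gpu_memory_total_gb: int
    network_gbps: int

    def gpu_memory_per_gpu_gb(self) -> float:
        # Total GPU memory divided evenly across the attached GPUs.
        return self.gpu_memory_total_gb / self.gpu_count


# Values copied from the A4 and A4X sections of this description.
A4 = MachineType("a4-highgpu-8g", 224, 3968, 8, 1440, 3600)
A4X = MachineType("a4x-highgpu-4g", 140, 884, 4, 720, 2000)

for machine in (A4, A4X):
    print(machine.name, machine.gpu_memory_per_gpu_gb())
```

Both machine types work out to 180 GB of GPU memory per GPU, so the two totals are consistent with each other.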
GPU Model Support
Added support for the following NVIDIA GPU models in instances/series/gpu/gpu_names.sql:
- NVIDIA H200 141GB (nvidia-h200-141gb) - Used in A3 Ultra
- NVIDIA B200 (nvidia-b200) - Used in A4
- NVIDIA GB200 (nvidia-gb200) - Used in A4X
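The relationship between GPU identifiers, display names, and machine series can be sketched as two lookup tables. This is only an illustration: the dictionary names and the `gpu_display_name` helper are hypothetical, and the actual mapping lives in gpu_names.sql, whose schema is not shown here.

```python
from typing import Optional

# GPU identifier -> display name, as listed in this PR description.
GPU_NAMES = {
    "nvidia-h200-141gb": "NVIDIA H200 141GB",
    "nvidia-b200": "NVIDIA B200",
    "nvidia-gb200": "NVIDIA GB200",
}

# Machine series -> GPU identifier, as listed in this PR description.
SERIES_TO_GPU = {
    "A3 Ultra": "nvidia-h200-141gb",
    "A4": "nvidia-b200",
    "A4X": "nvidia-gb200",
}


def gpu_display_name(series: str) -> Optional[str]:
    """Return the GPU display name for a series, or None if the series is unknown."""
    gpu_id = SERIES_TO_GPU.get(series)
    if gpu_id is None:
        return None
    return GPU_NAMES[gpu_id]


print(gpu_display_name("A4X"))
```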
Documentation Updates
Updated instances/README.md to:
- Add A4 and A4X to the machine types list
- Fix A3 link (was incorrectly pointing to a2.sql)
- Update resources section to reference A3, A4, and A4X accelerator-optimized machines
Testing
All SQL files follow the existing project patterns and schema:
- Consistent formatting with existing machine type configurations
- Proper series and family classification
- Accurate specifications from official Google Cloud documentation
References
- Google Cloud GPU Machine Types
- A4 Machine Series
- A4X Machine Series
- NVIDIA B200 GPUs
- NVIDIA GB200 NVL72
Checklist
- [x] Created new SQL configuration files for A4 and A4X machine types
- [x] Updated GPU names mapping for new NVIDIA models
- [x] Updated documentation to reflect new machine types
- [x] Followed existing code style and patterns
- [x] All changes are based on official Google Cloud documentation
- [x] Clear and descriptive commit messages
Additional Notes
These machine types represent Google Cloud's latest offerings for AI/ML workloads:
- A4 is optimized for foundation model training and serving with NVIDIA B200 GPUs
- A4X features GB200 Grace Blackwell Superchips combining ARM CPUs with B200 GPUs for exascale AI computing
Both machine types require capacity reservation or specific provisioning methods as outlined in the Google Cloud documentation.
Thanks for the pull request. As I understand it, you can only get A4 and A3 if your account has been specially activated and you have a separate contract. Am I right? How can we calculate the list price?
Please see: https://github.com/Cyclenerd/google-cloud-pricing-cost-calculator/issues/279 and https://github.com/Cyclenerd/google-cloud-pricing-cost-calculator/issues/309
Note: The machine type a4x-highgpu-4g is not published via the Google Compute API at the moment.