flytekit

Grantham/add-attach_shm-template

Open granthamtaylor opened this issue 1 year ago • 5 comments

Why are the changes needed?

Attaching shared memory (SHM) is necessary for multi-GPU ML training workloads, but it is currently not obvious how to do this.

What changes were proposed in this pull request?

This PR simply adds a convenience function to generate a PodTemplate configured to attach SHM to a task.
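For readers unfamiliar with what such a template has to contain, here is a minimal sketch of the underlying Kubernetes pod spec, written as plain dictionaries so that it runs without flytekit or the Kubernetes client installed. The helper name `shm_pod_spec`, its parameters, and the `"primary"` container name are assumptions made for illustration; this is not the function added by this PR. The idea is an `emptyDir` volume backed by memory, mounted at `/dev/shm` in the task container.

```python
from typing import Any, Dict


def shm_pod_spec(
    name: str = "shm",
    size: str = "5Gi",
    container_name: str = "primary",  # assumed name of the task container
) -> Dict[str, Any]:
    """Build a pod spec fragment that mounts in-memory storage at /dev/shm.

    Hypothetical helper for illustration only, not the PR's attach_shm.
    """
    volume = {
        "name": name,
        # medium "Memory" backs the emptyDir with tmpfs; sizeLimit caps its size
        "emptyDir": {"medium": "Memory", "sizeLimit": size},
    }
    # The mount refers to the volume by name and exposes it at /dev/shm
    mount = {"name": name, "mountPath": "/dev/shm"}
    return {
        "containers": [{"name": container_name, "volumeMounts": [mount]}],
        "volumes": [volume],
    }


spec = shm_pod_spec(size="5Gi")
print(spec["volumes"][0]["emptyDir"])
```

In flytekit, a spec like this would typically be expressed with the Kubernetes client models and wrapped in a `PodTemplate` passed to the task, which is the boilerplate the convenience function is meant to hide.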

Additionally, this PR adds a directory for future contributions of similar PodTemplate wrappers.

How was this patch tested?

I have used this function for my workflows to attach SHM.

More tests to be added soon.

Check all the applicable boxes

  • [ ] I updated the documentation accordingly.
  • [ ] All new and existing tests passed.
  • [ ] All commits are signed-off.

Summary by Bito

This PR implements and validates shared memory (SHM) attachment functionality for multi-GPU ML training workloads. It introduces a pod_templates package with an attach_shm utility function for configuring shared memory in tasks. The implementation includes the core functionality and a test suite that verifies SHM pod template properties, including name, size (5Gi), and proper template attachment.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 1

granthamtaylor, Dec 26 '24 16:12