[Extension Request] Extension for Edge Machine Learning Accelerators
Hello!
I'll start off by thanking you guys for the amazing software you all have created. It has completely changed the game for me!
Stated directly, I am requesting an added extension to support Coral TPU modules. I am currently working on a distributed computing research project and would like to integrate specialized edge ML accelerators into my cluster. My project involves running ML training and inference workloads across multiple nodes, and leveraging these accelerators would significantly improve performance while keeping the cost relatively low.
What would be needed to make something like this possible?
Thank you!
We can't say that we have experience with Coral TPUs ourselves, but this usually involves two pieces:
- kernel driver support
- some form of container runtime support if you want to share a TPU on a single machine across multiple workloads
Kernel support might come in two flavors:
- Linux upstream already supports it, so we enable the drivers in the configuration to be built as modules and re-package them in this repo as extensions (e.g. AMD GPU)
- an out-of-tree kernel module has to be built (e.g. NVIDIA drivers), and then it is repackaged in this repo in the same way.
Either way, the actual driver build should go to the pkgs repo, and the container runtime support probably goes here.
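To tell which of the two flavors applies, one can check whether a given kernel config already enables the driver as a module. Below is a minimal sketch of that check; the helper name `kconfig_state` and the sample config text are made up for illustration, and `CONFIG_DRM_AMDGPU` is used only because AMD GPU is the example mentioned above.

```python
import re


def kconfig_state(config_text: str, option: str) -> str:
    """Return the raw value of a kernel config option ('y', 'm', ...).

    Returns 'n' when the option is absent or listed as
    '# CONFIG_FOO is not set'.
    """
    # Anchor at line start and require '=' so CONFIG_DRM does not
    # accidentally match CONFIG_DRM_AMDGPU.
    pattern = re.compile(rf"^{re.escape(option)}=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else "n"


# Hypothetical excerpt of a kernel .config, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_DRM=y
CONFIG_DRM_AMDGPU=m
# CONFIG_DRM_NOUVEAU is not set
"""

if __name__ == "__main__":
    for opt in ("CONFIG_DRM", "CONFIG_DRM_AMDGPU", "CONFIG_DRM_NOUVEAU"):
        print(opt, kconfig_state(SAMPLE_CONFIG, opt))
```

A value of `m` means the driver is built as a loadable module and could be packaged as an extension; if the option does not exist in the kernel tree at all, the out-of-tree route is the likely one.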
Ok! That seems like something I'm interested in looking deeper into. How doable does that seem?
> How doable does that seem?
I don't have an answer to this question.
Ok, no worries! Who should I talk to to get something like this built?
> Who should I talk to to get something like this built?
You have three options:
- build it yourself
- wait for someone from the community to build it
- reach out to Sidero Labs and contract us to have this implemented
I was speaking with a representative about the third option earlier this week. I asked about an estimated price for sponsoring a project, but they felt like they didn't have enough technical knowledge to provide one. Do you have an idea of what range it could possibly be in?
I think you need to reach out to that representative.
Ok, thank you for your help!
Wouldn't the gasket drivers work for that?
That's a good question actually! I'm not sure. What would be your recommendations for how to implement them?
https://github.com/siderolabs/extensions/tree/main?tab=readme-ov-file#drivers
gasket | ghcr.io/siderolabs/gasket-driver | Driver for Google Coral PCIe devices gasket driver | upstream short commit-talos version
Those might be what you need, or could be a good starting point if the USB version needs a similar driver.
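Once the extension is installed, a quick way to confirm the driver is active on a node is to look for the kernel modules and device nodes. Here is a rough sketch, assuming the PCIe driver loads modules named `gasket` and `apex` and exposes devices as `/dev/apex_*` (as described in Coral's PCIe driver docs, not verified on Talos here). The function names are hypothetical, and on Talos this would have to run from a privileged pod with the host's `/dev` visible.

```python
from __future__ import annotations

import glob


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Extract module names from the contents of /proc/modules."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def coral_pcie_status(proc_modules_text: str, device_paths: list[str]) -> dict:
    """Summarize whether the assumed Coral PCIe driver pieces are present."""
    mods = loaded_modules(proc_modules_text)
    return {
        "gasket_loaded": "gasket" in mods,  # assumed module name
        "apex_loaded": "apex" in mods,      # assumed module name
        "devices": sorted(device_paths),    # e.g. /dev/apex_0
    }


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            modules_text = f.read()
    except OSError:
        # Not on Linux, or /proc is not readable from this context.
        modules_text = ""
    print(coral_pcie_status(modules_text, glob.glob("/dev/apex_*")))
```

This only covers PCIe/M.2 devices; it says nothing about whether the USB version needs a kernel driver.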
Perfect, thank you!
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been stalled for 7 days with no activity.