aphrodite-engine
[Misc]: should we be using flashinfer for CUDA 12.1 or 12.4?
The Dockerfile uses the 12.1 wheel, but is based on a CUDA 12.4 image. I figured this is probably deliberate, but I just wanted to know if it matters whether I use the wheel for 12.1 or 12.4 if I'm installing manually. Thanks.
I believe code compiled against CUDA 12 works across all 12.x minor versions, so the 12.1 wheel should be fine on a 12.4 image. But we can switch to flashinfer's 12.4 wheels, if they have any.
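If you're installing manually and want a quick sanity check, the rule of thumb above can be sketched as a major-version comparison. This is only an illustration: `same_cuda_major` is a hypothetical helper, not part of Aphrodite or flashinfer, and it assumes version strings of the form `"12.1"`. It does not check that the installed driver is new enough, which also matters.

```python
def same_cuda_major(wheel_cuda: str, runtime_cuda: str) -> bool:
    """Return True when both version strings share a CUDA major version.

    Hypothetical helper: a heuristic only, it does not verify the driver.
    """
    wheel_major = int(wheel_cuda.split(".")[0])
    runtime_major = int(runtime_cuda.split(".")[0])
    return wheel_major == runtime_major


# A 12.1 wheel on a 12.4 image shares the major version.
print(same_cuda_major("12.1", "12.4"))
# An 11.8 wheel on a 12.4 image does not.
print(same_cuda_major("11.8", "12.4"))
```

If PyTorch is installed, `torch.version.cuda` reports the CUDA version it was built against, which you could pass in as `runtime_cuda`.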