flux-fp8-api
"ufunc_add_CUDA" not implemented for 'Float8_e4m3fn'
My device is an RTX 4090D, torch version: 2.7.1+cu126
When I tested different ops on fp8 tensors:
math operations like a*b, a+b, a-b, a/b, a**b, a//b, a%b fail with: not implemented for 'Float8_e4m3fn'
other operations like torch.empty, zeros, zeros_like, ones_like, cat, stack run OK
Is there any way to fix it, or do you know the reason? Thanks~