Nicolas Patry

978 comments by Nicolas Patry

What cards do you guys have? We need `compute_cap>7.0` for it to work. I know compute_cap 5.2 triggers similar failures. The core kernels we have use f16 and...
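To make the compute capability requirement concrete, here is a minimal sketch of such a check. It assumes the capability is available as a string like `5.2` (for example as reported by `nvidia-smi --query-gpu=compute_cap --format=csv` on recent drivers). The names `parse_compute_cap` and `meets_minimum` are hypothetical and not part of `candle`, and treating 7.0 itself as accepted is an assumption.

```rust
/// Parse a compute capability string such as "5.2" into (major, minor).
fn parse_compute_cap(s: &str) -> Option<(u32, u32)> {
    let (major, minor) = s.trim().split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Hypothetical check against the minimum mentioned in the comment.
/// Assumption: a card at exactly 7.0 is accepted.
fn meets_minimum(cap: (u32, u32)) -> bool {
    const MIN_CAP: (u32, u32) = (7, 0);
    // Tuples compare lexicographically: major first, then minor.
    cap >= MIN_CAP
}

fn main() {
    for s in ["5.2", "6.1", "7.0", "8.6"] {
        match parse_compute_cap(s) {
            Some(cap) => println!("compute_cap {s}: supported = {}", meets_minimum(cap)),
            None => println!("could not parse {s:?}"),
        }
    }
}
```

Under these assumptions, 5.2 and 6.1 are reported as unsupported, which matches the failures described above.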

@krolinventions Perfectly understood. My own GTX 970 is too old to run `candle` atm. However, in order to deliver fast we had to cut corners in that department. Currently I...

Do try, it's not as daunting as it looks (it's daunting when you want the best possible performance). Feel free to join the HF Discord, on the candle channel, to pursue...

Yes, Windows seems to be having issues. I've been told on Discord that WSL is OK.

@dbrowne Go to `candle/candle-kernels/src/` and try to make the `.cu` files compile:
```
nvcc --ptx --gpu-architecture=sm_61 affine.cu -I.
```
Most of the logic should be in `compatibility.cuh`. 61 should be easier...
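If you want to try the same `nvcc` invocation for another architecture, a small sketch like the following builds the argument list. The helper `nvcc_args` is hypothetical and only mirrors the flags shown in the command above. It assumes it is run from the kernels directory with `nvcc` on the `PATH`.

```rust
use std::process::Command;

/// Build the nvcc arguments for one kernel file and one target architecture,
/// mirroring the flags in the command above (hypothetical helper).
fn nvcc_args(cu_file: &str, compute_cap: u32) -> Vec<String> {
    vec![
        "--ptx".to_string(),
        format!("--gpu-architecture=sm_{compute_cap}"),
        cu_file.to_string(),
        "-I.".to_string(),
    ]
}

fn main() {
    // Only the file named in the comment; add other kernels as needed.
    let files = ["affine.cu"];
    for file in files {
        let args = nvcc_args(file, 61);
        println!("nvcc {}", args.join(" "));
        // Try to run the compiler; report rather than panic if it is missing.
        match Command::new("nvcc").args(&args).status() {
            Ok(status) => println!("{file}: nvcc exited with {status}"),
            Err(err) => println!("{file}: could not start nvcc: {err}"),
        }
    }
}
```

Changing `61` to another value (for instance `52`) lets you see which kernels fail to compile for that target.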

Can you take my PR out for a spin? https://github.com/huggingface/candle/pull/386 It fixes compilation, but it still doesn't work on my compute cap 5.2 card because the ops are still not there. However...

@n8henrie This is far from optimized yet ;). We ran a few passes, but there's still a lot more that can be done.

Does it work on main now? I made fixes for older cards (still far from universal support, but it should be much better).

@bayedieng @theHausdorffMetric
```
compatibility.cuh(11): error: identifier "__hmax" is undefined
```
Yes, this means CUDA 11.5 doesn't have this function, so the compat layer doesn't work. Upgrading CUDA should help, at...