nimlgen
Current "plan": - [x] map gpu - [x] boot gpu (load fw + power management) - [ ] (not planned anymore) interrupts (was exploring this, but seems a minimal kernel...
- [ ] CUDA requests more `target_sm_config_shared_mem_size` (and the same for the minimum). NV also never uses option 0x5, since for most kernels it was a lot slower (see the sketch after this list).
- [x] Match...
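For context, a minimal sketch of the kind of selection this item is talking about, assuming a QMD-style scheme where the shared-mem config field is an index into a table of carveout sizes. The `SIZE_OPTIONS` table and the `pick_shared_mem_option` helper below are hypothetical placeholders for illustration, not the real hardware mapping or tinygrad's code.

```python
# Hypothetical sketch: pick a shared-mem config option for a kernel.
# SIZE_OPTIONS is a placeholder (option -> KiB), not the real hardware table.
SIZE_OPTIONS = {0x0: 0, 0x1: 8, 0x2: 16, 0x3: 32, 0x4: 64, 0x5: 100, 0x6: 132}

def pick_shared_mem_option(required_kib: int, skip=(0x5,)) -> int:
  # smallest option that fits the kernel's shared-mem requirement, skipping
  # options observed to be slower (per the note above, NV never picks 0x5)
  for opt, kib in sorted(SIZE_OPTIONS.items(), key=lambda kv: kv[1]):
    if opt in skip: continue
    if kib >= required_kib: return opt
  raise ValueError(f"no shared-mem config fits {required_kib} KiB")

# e.g. a kernel needing 48 KiB of shared memory gets option 0x4 (64 KiB)
print(hex(pick_shared_mem_option(48)))
```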
sync: 13.08 ms @ 5.13 GB/s vs sync: 8.98 ms @ 7.47 GB/s
This should be passed in a nicer way somehow, but let's see if that works.
from #7727
`Tensor(numpy).realize()` takes ~0.85 ms to schedule on comma. There are 5 of them, 0.85 ms each, which is why the benchmark is slow after #7593. QCOM copies take:
```
copyin 0.02 ms...
```
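To make that scheduling overhead concrete, here is a minimal timing sketch, not the actual benchmark: it times the `Tensor(numpy).realize()` path end-to-end. The array shapes, iteration count, and use of `time.perf_counter` are assumptions of mine, and the numbers will differ per device.

```python
import time
import numpy as np
from tinygrad import Tensor

# five numpy arrays to mirror the "there are 5 of them" case above (shapes assumed)
data = [np.random.randn(256, 256).astype(np.float32) for _ in range(5)]

st = time.perf_counter()
for arr in data:
  Tensor(arr).realize()  # each call pays its own scheduling cost (~0.85 ms each on comma, per the note above)
et = time.perf_counter()
print(f"5x Tensor(numpy).realize(): {(et - st) * 1e3:.2f} ms")
```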