cloud11665
Superb work! I've been thinking about integrating SDF-based fonts into my application myself. My idea was to use https://github.com/Chlumsky/msdfgen and just prompt the user with a "generating fonts" modal...
Yeah, I agree about the golfing.
Check out [docs/env_vars.md](https://github.com/geohot/tinygrad/blob/master/docs/env_vars.md). The `CPU` env var means that it runs on the CPU instead of the GPU (the default).
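To illustrate how an integer env var like `CPU` can act as a backend switch, here is a minimal sketch. The `getenv_int` and `pick_device` helpers are hypothetical names made up for this example, not tinygrad's actual API, and it assumes the GPU is the default as described above; the linked doc is the authority on the real variables.

```python
import os

def getenv_int(name: str, default: int = 0) -> int:
    """Read an integer flag from the environment (hypothetical helper)."""
    return int(os.environ.get(name, default))

def pick_device() -> str:
    """Return "CPU" when CPU=1 is set, otherwise fall back to "GPU".

    Assumption: GPU is the default backend, as stated in the comment above.
    """
    return "CPU" if getenv_int("CPU") == 1 else "GPU"

if __name__ == "__main__":
    # e.g. `CPU=1 python this_script.py` selects the CPU path
    print(pick_device())
```

Running the script with `CPU=1` set in the environment takes the CPU branch; leaving it unset keeps the assumed default.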
I have compiled a fully static library for executing PTX on the CPU. No need to mess with building gpuocelot.
There is https://github.com/actions/cache, which could be useful: building gpuocelot requires building LLVM, and we wouldn't want to do that on every commit.
kk, a simple Python wrapper is done:
```py
kernel = r"""
.version 7.5
.target sm_35
.address_size 64

// .globl _Z4E_16Pf
.visible .entry _Z4E_16Pf(
    .param .u64 _Z4E_16Pf_param_0
)
{
    .reg .b32...
```
I'm experimenting in Docker, and the longest step is downloading the CUDA toolkit. Also, I won't be forking gpuocelot; I'll just maintain a patch in the tinygrad repo.
> Patch is fine. How big is CUDA? If it's huge (>200MB), maybe we figure out how to cache it.

All apt packages (CUDA included) are ~8000MiB. It only takes...
But we do all of that only for the `libcudacpu.so` file, so caching it would save 3-4 minutes.
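The caching idea boils down to keying the built library on its inputs, so it gets rebuilt only when they change. A minimal Python sketch of that keying step, assuming the patch file is the only input that varies; the `cache_key` helper and the `libcudacpu` prefix are made up for illustration and are not part of actions/cache or tinygrad:

```python
import hashlib
from pathlib import Path

def cache_key(patch_path: str, prefix: str = "libcudacpu") -> str:
    """Derive a cache key from the contents of the patch file.

    Hypothetical helper: identical patch contents give the same key,
    so a cached build can be reused until the patch changes.
    """
    digest = hashlib.sha256(Path(patch_path).read_bytes()).hexdigest()
    # A shortened digest keeps the key readable in CI logs.
    return f"{prefix}-{digest[:16]}"
```

If the key matches a stored entry, CI can restore the cached `libcudacpu.so` and skip the toolkit download and build entirely.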
Tested locally and it worked, but there are 2 more things I want to add:
- a non_stable mode for newer nvcc versions (compiling for sm_50 and hoping it works...