cloud11665
Check out this thread on tests in general in the discord: https://discord.com/channels/1068976834382925865/1117201473596567682/1119787496809701521
`./tinygrad/lazy.py:PUSH_PERMUTES, PUSH_CONTIGUOUS = OPT>=3, OPT>=3`
This was mostly fixed in https://github.com/geohot/tinygrad/commit/2407690d821cfbb1747d5bf8088a4af3e5ac0769. As for the cast itself, we'd have to check if the instruction has the `.sat` modifier (`cvt.sat.s8.u8`)
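A minimal sketch of what that check could look like (hypothetical helper, not tinygrad code) — in PTX, `.sat` shows up as a dotted modifier inside the opcode token, as in `cvt.sat.s8.u8`:

```python
def cvt_has_sat(ptx_line: str) -> bool:
    """Return True if a PTX `cvt` instruction carries the `.sat` modifier.

    Example lines: 'cvt.sat.s8.u8 %rs1, %rs2;' vs 'cvt.s8.u8 %rs1, %rs2;'.
    """
    # the opcode is the first whitespace-separated token, e.g. "cvt.sat.s8.u8"
    opcode = ptx_line.strip().split()[0]
    # modifiers and types are dot-separated components of the opcode
    return opcode.startswith("cvt") and "sat" in opcode.split(".")
```

This only looks at the opcode token, so operands that happen to contain "sat" can't cause false positives.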
@b7r6 I think this is a great learning experience, and I'd be happy to answer any questions about CUDA / tinygrad on the discord!
```
===================================================================== short test summary info =====================================================================
FAILED test/test_dtype.py::TestHalfDtype::test_half_matmul - pycuda._driver.LogicError: cuModuleLoadDataEx failed: a PTX JIT compilation failed - ptxas application ptx input, line 27; error : Unexpected instruction ty......
```
oh, so we don't want to format it (fix indentation on args and instructions, add newlines for params)?
> I'd merge the colorizer (if it's clear from code it's just a colorizer), I think the formatting is fine from what I've seen. The risk is that it misprints...
 llama
I've added parallelization of all global+local loops, as there were cases where we were missing out on performance due to n_cores > loop_idx.
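A toy sketch of the idea (hypothetical, not tinygrad code, and the names `GLOBAL`/`LOCAL` are assumptions): if only the outer loop is parallelized and its trip count is smaller than the core count, cores sit idle; flattening the global+local index spaces into one range gives every core work.

```python
from concurrent.futures import ThreadPoolExecutor

GLOBAL, LOCAL, N_CORES = 4, 8, 16  # outer trip count (4) < core count (16)

def work(flat_idx: int) -> int:
    # recover the (global, local) pair from the flattened index
    g, l = divmod(flat_idx, LOCAL)
    return g * LOCAL + l  # stand-in for the real kernel body

# parallelize over the flattened GLOBAL*LOCAL range, not just GLOBAL
with ThreadPoolExecutor(max_workers=N_CORES) as ex:
    results = list(ex.map(work, range(GLOBAL * LOCAL)))
```

With 32 flattened work items, all 16 workers can be busy, whereas parallelizing only the outer loop of 4 would leave 12 idle.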
I can take a look at this after getting access to a 40-series GPU
I currently have a 2080, and I will have access to a 4070 in ~2-3 weeks