cloud11665 issues

Results: 16 issues of cloud11665

add ptx formatter + syntax highlighter

Also doing formatting as it's easier to compare ptx to cuda backend code.
orange - instruction or immediate value
purple - state space
blue - operand or identifier
green -...

add openmp for clang backend

I will post llama timing benchmarks soon.

HTTP error 521

Looks like https://bangplayer.live is down

NV=1 doesn't respect CUDA_VISIBLE_DEVICES

repro steps (on a 2x4090 machine):
`CUDA_VISIBLE_DEVICES=1 NV=1 DEBUG=1 python3 -m examples.hlb_cifar10` -> gpu0 gets loaded
`CUDA_VISIBLE_DEVICES=1 CUDA=1 DEBUG=1 python3 -m examples.hlb_cifar10` -> gpu1 gets loaded
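For context on the expected behaviour in the repro above, here is a minimal sketch of how a backend could translate `CUDA_VISIBLE_DEVICES` into the list of physical GPUs it is allowed to open. The helper name `visible_devices` and its parameters are hypothetical and not taken from the project. The sketch handles integer indices only (UUID entries are not covered) and assumes the CUDA convention that enumeration stops at the first invalid entry.

```python
import os


def visible_devices(device_count, env=None):
    """Map logical device indices to physical ones (hypothetical helper).

    Assumptions: entries are plain integer indices, and parsing stops at
    the first invalid entry, mirroring the documented CUDA behaviour.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable unset: every physical device is visible.
        return list(range(device_count))

    visible = []
    for token in raw.split(","):
        token = token.strip()
        # An empty, non-numeric, or out-of-range entry ends the list.
        if not token.isdigit() or int(token) >= device_count:
            break
        visible.append(int(token))
    return visible


if __name__ == "__main__":
    # On a 2-GPU machine, CUDA_VISIBLE_DEVICES=1 should leave only gpu1,
    # so logical device 0 maps to physical device 1.
    print(visible_devices(2, {"CUDA_VISIBLE_DEVICES": "1"}))
```

Under these assumptions, a backend that opens logical device 0 would resolve it through the returned list and load gpu1 in the repro, which is what the `CUDA=1` run does.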

Bug: non-chat completions not respecting the max_tokens parameter using the OpenAI api

### What happened?
The limit is respected when requesting a chat completion, but for non-chat ones, the model keeps generating tokens forever (until ctx-len is reached). With non-streaming there is...

bug-unconfirmed

high severity

AMD 7950x3D Memory controller showing dual-rank DDR5 DIMMs as single-rank with half the capacity

output of `corefreq-cli -k -n -B -n -M` ``` Linux: |- Release [6.8.0-57-generic] |- Version [#59-Ubuntu SMP PREEMPT_DYNAMIC Sat Mar 15 17:40:59 UTC 2025] |- Machine [x86_64] Memory: |- Total...

bugfix