simveit
## Motivation To evaluate reasoning models it makes sense to use difficult questions. This benchmark evaluates models on the [LIMO](https://huggingface.co/datasets/GAIR/LIMO) dataset. The Qwen 1.5B distill achieves 47% accuracy...
## Motivation We want to use Jupyter notebooks for the frontend docs, as discussed [here](https://github.com/sgl-project/sglang/issues/3330#issuecomment-2641925971). ## Modifications Implemented the frontend docs as a Jupyter notebook. ## Checklist - [x] Format...
## Motivation This PR updates the docs on sampling parameters, as suggested [here](https://github.com/sgl-project/sglang/issues/3165). ## Modifications New docs for sampling params. ## Checklist - [x] Format your code...
## Motivation Added a section on installing the project using `uv`.
## Motivation Move some docs into the backend part, as suggested [here](https://github.com/sgl-project/sglang/issues/3262). ## Modifications Transferred docs into the backend section. ## Checklist - [x] Format your code according to the [Code...
## Motivation In [this PR](https://github.com/sgl-project/sglang/pull/3532) we introduced a reasoning benchmark. We estimate $PASS@1 = \frac{1}{N_{question}}\sum_{i=1}^{N_{question}}\frac{1}{N_{tries}}\sum_{j=1}^{N_{tries}}correct_{i,j}$, where $correct_{i,j}$ is 1 if question $i$ is answered correctly in try $j$ and 0 otherwise. In this PR...
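The PASS@1 estimate above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code used in the benchmark: the function name `pass_at_1` and the nested-list input layout are assumptions made for this example, and every question is assumed to have the same number of tries.

```python
from typing import Sequence


def pass_at_1(correct: Sequence[Sequence[int]]) -> float:
    """Estimate PASS@1 from a correctness matrix.

    correct[i][j] is 1 if question i was answered correctly in try j,
    and 0 otherwise. (Hypothetical helper, not part of sglang.)
    """
    if not correct or any(len(tries) == 0 for tries in correct):
        raise ValueError("need at least one question and one try per question")

    # Inner sum: fraction of correct tries for each question.
    per_question = [sum(tries) / len(tries) for tries in correct]

    # Outer sum: average of the per-question fractions.
    return sum(per_question) / len(per_question)


# Three questions, four tries each (made-up data for illustration).
results = [
    [1, 1, 0, 1],  # 3 of 4 tries correct
    [0, 0, 0, 0],  # never correct
    [1, 1, 1, 1],  # always correct
]
score = pass_at_1(results)
print(f"PASS@1 = {score:.4f}")
```

With a single try per question the estimate reduces to plain accuracy; more tries only reduce the variance of the estimate.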
In their [R1 repo](https://github.com/deepseek-ai/DeepSeek-R1), the DeepSeek team recommends estimating PASS@1 by asking the same question multiple times. We implemented that in our [Reasoning benchmark](https://github.com/sgl-project/sglang/tree/main/benchmark/reasoning_benchmark). In addition to the averaged accuracy we...
## Motivation Rewrote the frontend docs as a Jupyter notebook.
This PR adds an additional example to the `examples/gpu_functions` folder. I adjusted the project file and README accordingly and stripped down the [original implementation](https://github.com/simveit/effective-reduction-mojo/blob/master/kernel/one_pass_4.mojo) to be more in line with the...
Implemented half-precision floating-point instructions for `tanh` according to the [PTX ISA doc](https://docs.nvidia.com/cuda/parallel-thread-execution/#half-precision-floating-point-instructions-tanh), following the existing `exp2` implementation.