cloudhan

Results 215 comments of cloudhan

Here are some build artifacts for Windows: [dance-0.5.15-pre1.vsix.zip](https://github.com/user-attachments/files/19141823/dance-0.5.15-pre1.vsix.zip) [dance-helix-keybindings-0.5.15-pre1.vsix.zip](https://github.com/user-attachments/files/19141821/dance-helix-keybindings-0.5.15-pre1.vsix.zip). Extract and install them if anyone wants to test them out.

[ort-benchmarks.zip](https://github.com/user-attachments/files/16798405/ort-benchmarks.zip) ## batch decode called into paged_attention_kernel ### H100 SMX llama3 70B (GQA8) page_size=16 ``` | num_seqs | seq_len | num_heads | num_kv_heads | head_size | page_size | Read |...

## FA2 - paged batch decode [benchmark_flash_attention.zip](https://github.com/user-attachments/files/16798208/benchmark_flash_attention.zip) ### H100 SMX llama3 70B (GQA8) page_size=256 ``` | num_seqs | seq_len | num_heads | num_kv_heads | head_size | page_size | Read |...

Mathematically, a Layout is a function mapping integers to integers (a 1D logical index to a 1D logical index). A Layout is a tuple of Shape and Stride. The Shape is...
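
To make the "function from integer to integer" view concrete, here is a minimal sketch. It assumes a CuTe-style convention in which the leftmost mode of the Shape varies fastest, and it handles only flat (non-nested) shapes. The name `layout_offset` is hypothetical and not from any library.

```python
from typing import Sequence


def layout_offset(index: int, shape: Sequence[int], stride: Sequence[int]) -> int:
    """Map a 1D index to a 1D offset for a flat (shape, stride) layout.

    Assumption: the leftmost mode varies fastest when the 1D index is
    unpacked into a coordinate. Nested shapes are not handled here.
    """
    if len(shape) != len(stride):
        raise ValueError("shape and stride must have the same rank")
    offset = 0
    for extent, step in zip(shape, stride):
        # Peel off the coordinate along this mode, then weight it by its stride.
        offset += (index % extent) * step
        index //= extent
    return offset


# A 2x3 layout with strides (3, 1), i.e. row-major storage.
row_major = [layout_offset(i, (2, 3), (3, 1)) for i in range(6)]

# The same shape with strides (1, 2), i.e. column-major storage.
col_major = [layout_offset(i, (2, 3), (1, 2)) for i in range(6)]

print(row_major)  # [0, 3, 1, 4, 2, 5]
print(col_major)  # [0, 1, 2, 3, 4, 5]
```

With column-major strides the layout is the identity function on the index, while the row-major strides permute it. Both are just integer-to-integer maps determined by the (Shape, Stride) pair.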

This is probably caused by a mismatch between the environment (hardware and software) in which the engine file was generated and the one in which it is executed. Users who encounter this should try deleting the engine...

For now, it is better to use WSL to save yourself time.

The CI tests revealed something like the following: ``` kw = {} @wraps(func) def standalone_func(*a, **kw): > return func(*(a + p.args), **p.kwargs, **kw) .local/lib/python3.9/site-packages/parameterized/parameterized.py:620: _ _ _ _ _ _ _...

@snnn this needs an es approval. Some packages in CI were updated because the reference impl produced some NaN and inf values; see my previous comment.

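You can use the IIFE (immediately invoked function expression) pattern as a workaround.

```cpp
// The lambda is defined and called in the same expression,
// so `s` is initialized exactly once from its return value.
Tensor s = [&]() {
  auto s = ......;
  return s;
}();
```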

Modify Events cause Linear mouse malfunction...