Tomorrowdawn

Results 7 issues of Tomorrowdawn

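[The broken page](https://yuudawnlight.com/mathematics-docs-1) renders correctly in the local publish markdown. [Here is a local screenshot](https://i.853tv.cn/imgs/2021/02/4640bf65f9a6c8b9.png). Thank you very much for your software, but could you please fix this error or point out its cause?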

It seems that `run_evalution` only resets before the `for batch in loader` loop. If `batch['user']` changes between batches, the memory returned by the env would be invalid.
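
A minimal sketch of the concern, assuming a toy environment whose memory is tied to one user. `ToyEnv`, `run_evaluation_sketch`, and the `item` key are hypothetical names made up for illustration, not the project's real API. The loop resets the env whenever the user of the incoming batch differs from the current one, so memory never leaks across users.

```python
class ToyEnv:
    """Hypothetical stand-in for the environment; not the project's real class."""

    def __init__(self):
        self.user = None
        self.memory = []

    def reset(self, user):
        # Start a fresh memory for the given user.
        self.user = user
        self.memory = []

    def step(self, item):
        # Record the item and return a copy of the memory seen so far.
        self.memory.append((self.user, item))
        return list(self.memory)


def run_evaluation_sketch(loader, env):
    """Reset the env inside the loop whenever the batch's user changes."""
    memories = []
    for batch in loader:
        if batch["user"] != env.user:
            env.reset(batch["user"])
        memories.append(env.step(batch["item"]))
    return memories


# Made-up batches: two for user "a", then one for user "b".
loader = [
    {"user": "a", "item": 1},
    {"user": "a", "item": 2},
    {"user": "b", "item": 3},
]
memories = run_evaluation_sketch(loader, ToyEnv())
```

With a single reset placed before the loop instead, the last batch in this toy setup would still carry the entries recorded for user `"a"`.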

Thanks for your great work! Here are my concerns: say we get a batch of inputs with lengths L1, L2, ... How can the attention scores of these inputs be computed simultaneously by...

### System Info

- `transformers` version: 4.41.0
- Platform: Linux-5.15.0-67-generic-x86_64-with-glibc2.31
- Python version: 3.10.13
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.2
- Accelerate version: 0.27.2
- Accelerate config: -...

In [modify_llama.py](https://github.com/FMInference/H2O/blob/main/h2o_hf/utils_real_drop/modify_llama.py), the `hh_score` of `H2OCache` is computed by `attn_scores.sum(0).sum(1)`, resulting in a shape of `[num_heads, hidden_dim]`. However, in Llama's GQA implementation (in the same file), the k/v cache has...

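The README claims to use the Mulan Permissive Software License, but the LICENSE file is CCA4. The license website and a screenshot are below: https://license.coscl.org.cn/MulanPSL2/ ![image](https://github.com/user-attachments/assets/5c24c749-d14f-4b50-b246-b4bd6ddcc776)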

https://github.com/sgl-project/SpecForge/blob/d3472dde5d6828e60e7ee766ded74754e5dc6778/specforge/data/preprocessing.py#L588 I find it extremely strange that `d2t` doesn't store the direct mapping `[target_token_ids]`. Instead, it stores `[target_token_ids] - torch.arange(len)`. What's the purpose of this offset?
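
To make the two forms concrete, here is a small sketch in plain Python (lists instead of tensors, so it runs without `torch`). The token ids are made up, and `draft_to_target` is a hypothetical helper rather than a SpecForge function. It only shows that the offset form holds the same information as the direct mapping, since adding the draft id back recovers the target id. It does not claim to explain why the project chose this layout.

```python
# Hypothetical toy vocabulary: 4 draft ids that map into a larger target
# vocabulary. These ids are invented for illustration only.
target_token_ids = [3, 7, 8, 15]

# Offset form, mirroring `target_token_ids - torch.arange(len)` from the issue:
# entry i holds (target id for draft id i) minus i.
d2t = [t - i for i, t in enumerate(target_token_ids)]


def draft_to_target(draft_id):
    """Recover the target id by adding the stored offset to the draft id."""
    return draft_id + d2t[draft_id]


recovered = [draft_to_target(i) for i in range(len(d2t))]
```

Under this reading, a lookup becomes `draft_id + d2t[draft_id]` instead of `target_token_ids[draft_id]`, and both return the same id.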