Tomorrowdawn issues

Results: 7 issues of Tomorrowdawn

Publish Markdown renders correctly locally, but the page breaks after uploading to the web

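The [broken page](https://yuudawnlight.com/mathematics-docs-1) renders correctly in Publish Markdown locally. [Here is a local screenshot](https://i.853tv.cn/imgs/2021/02/4640bf65f9a6c8b9.png). Thank you very much for your software; could you please fix this error or point out what is causing it?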

Run_evaluation doesn't change the user when using the full dataset

It seems `run_evaluation` only resets once, before `for batch in loader`. If `batch['user']` changes between batches, the memory returned by env would be invalid.
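A minimal sketch of one possible fix for the issue above: reset the environment whenever the batch's user differs from the current one. `ToyEnv`, its `reset`/`step` methods, and the `item` key are hypothetical stand-ins, not the project's actual API; only `run_evaluation`, `loader`, and `batch['user']` come from the issue text.

```python
class ToyEnv:
    """Hypothetical stand-in for the environment; not the project's real class."""

    def __init__(self):
        self.user = None
        self.memory = []
        self.resets = 0

    def reset(self, user):
        # Drop the previous user's memory and start tracking the new user.
        self.user = user
        self.memory = []
        self.resets += 1

    def step(self, item):
        # Record the interaction and return a copy of the current memory.
        self.memory.append((self.user, item))
        return list(self.memory)


def run_evaluation(env, loader):
    """Reset inside the loop whenever the user changes, not only once before it."""
    memories = []
    for batch in loader:
        if batch["user"] != env.user:
            env.reset(batch["user"])
        memories.append(env.step(batch["item"]))
    return memories


# Toy loader (assumed shape): one user per batch, user changes at the third batch.
loader = [
    {"user": "a", "item": 1},
    {"user": "a", "item": 2},
    {"user": "b", "item": 3},
]
env = ToyEnv()
memories = run_evaluation(env, loader)
```

With the reset inside the loop, the memory returned for user `b` contains no entries left over from user `a`.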

[Question] How does lightllm implement nopad batching?

Thanks for your great work! Here are my concerns: Say we get a batch of inputs with lengths L1,L2,... How to simultaneously compute the attention scores of these inputs by...
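For context, one common way to batch variable-length inputs without padding is to pack all tokens into a single flat buffer and track each sequence's start offset and length. The sketch below is a generic illustration of that idea in plain Python, not a description of lightllm's confirmed implementation; `pack_sequences` and `causal_pairs` are hypothetical names.

```python
from itertools import accumulate


def pack_sequences(seqs):
    """Concatenate sequences into one flat list and record where each one starts."""
    flat = [tok for seq in seqs for tok in seq]
    lengths = [len(seq) for seq in seqs]
    # Start of sequence i is the total length of all sequences before it.
    starts = [0] + list(accumulate(lengths))[:-1]
    return flat, starts, lengths


def causal_pairs(starts, lengths):
    """List (query_pos, key_pos) pairs in the flat buffer that may attend.

    A query only sees keys from its own sequence at the same or earlier positions,
    so sequences packed side by side never attend to each other.
    """
    pairs = []
    for start, length in zip(starts, lengths):
        for q in range(start, start + length):
            for k in range(start, q + 1):
                pairs.append((q, k))
    return pairs


flat, starts, lengths = pack_sequences([[11, 12, 13], [21], [31, 32]])
pairs = causal_pairs(starts, lengths)
```

A real kernel would use the offsets and lengths to index into the packed buffer directly rather than enumerating pairs; the enumeration here only makes the masking rule explicit.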

Sink Cache attention scores are strange. CausalMask does not seem to be working.

### System Info
- `transformers` version: 4.41.0
- Platform: Linux-5.15.0-67-generic-x86_64-with-glibc2.31
- Python version: 3.10.13
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.2
- Accelerate version: 0.27.2
- Accelerate config: -...

No support for Llama's GQA in real_drop

In [modify_llama.py](https://github.com/FMInference/H2O/blob/main/h2o_hf/utils_real_drop/modify_llama.py), the `hh_score` of `H2OCache` is computed by `attn_scores.sum(0).sum(1)`, resulting in a shape of [num_heads, hidden_dim]. However, in Llama's GQA implementation (in the same file), the k/v cache has...
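To illustrate the mismatch described above: under GQA, several query heads share one key/value head, so a score tensor with one row per query head has more rows than the k/v cache has heads. A minimal sketch of one way to reconcile them is to sum the scores of the query heads in each group. This is an assumed workaround, not the repository's code; `group_scores_by_kv_head` is a hypothetical helper, and it assumes consecutive query heads map to the same KV head.

```python
def group_scores_by_kv_head(hh_score, num_kv_heads):
    """Reduce per-query-head scores to per-KV-head scores.

    hh_score: nested list of shape [num_heads][kv_len].
    Assumption: query heads [g*group, (g+1)*group) share KV head g.
    """
    num_heads = len(hh_score)
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be a multiple of num_kv_heads")
    group = num_heads // num_kv_heads
    kv_len = len(hh_score[0])
    grouped = []
    for kv in range(num_kv_heads):
        heads = hh_score[kv * group:(kv + 1) * group]
        # Sum the scores of all query heads that read from this KV head.
        grouped.append([sum(h[pos] for h in heads) for pos in range(kv_len)])
    return grouped


# 4 query heads, 2 KV heads, 2 cached positions (toy numbers).
scores = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
grouped = group_scores_by_kv_head(scores, num_kv_heads=2)
```

After grouping, the score rows line up one-to-one with the KV heads, so a per-head eviction decision can be applied to the cache without a shape mismatch.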

The license claimed in the README is inconsistent with LICENSE, and the link to the original license is dead

The README claims to use the Mulan Permissive License (木兰宽松), but LICENSE is CCA4. The URL and a screenshot are below: https://license.coscl.org.cn/MulanPSL2/ ![image](https://github.com/user-attachments/assets/5c24c749-d14f-4b50-b246-b4bd6ddcc776)

[Question] Why d2t = [target_token_ids] - torch.arange(len)?

https://github.com/sgl-project/SpecForge/blob/d3472dde5d6828e60e7ee766ded74754e5dc6778/specforge/data/preprocessing.py#L588 I find it extremely strange that d2t doesn't store the direct mapping [target_token_ids]; instead, it stores [target_token_ids] - torch.arange(len). What's the purpose of this offset?
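As a small worked example of the arithmetic in question: storing `target_token_ids - arange(len)` keeps the same information as the direct mapping, because adding the draft index back recovers the target id. The sketch below uses plain Python lists instead of torch tensors, and `build_d2t` and `draft_to_target` are hypothetical helper names, not SpecForge functions. It does not explain why the offset form was chosen, only that the two encodings are equivalent.

```python
def build_d2t(target_token_ids):
    """Offset form: entry i holds target_token_ids[i] - i."""
    return [t - i for i, t in enumerate(target_token_ids)]


def draft_to_target(draft_id, d2t):
    """Recover the target id by adding the stored offset to the draft index."""
    return draft_id + d2t[draft_id]


# Toy draft vocabulary of 4 tokens mapped into a larger target vocabulary.
target_token_ids = [3, 7, 8, 42]
d2t = build_d2t(target_token_ids)
recovered = [draft_to_target(i, d2t) for i in range(len(d2t))]
```

With this convention a consumer would compute `draft_id + d2t[draft_id]` rather than `d2t[draft_id]` alone.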