Silver

Results 28 comments of Silver

> Yes, that makes sense, as mentioned above, this will be fixed.

Sounds great. When can I expect this PR to be merged?

I am facing the save issue with v0.8.0. Help needed.

> While downloading, I ran into errors in two more datasets. Could you please update them again?
>
> 1. In the wikipedia dataset, `https://huggingface.co/datasets/liwu/MNBVC/resolve/main/wiki/20230198/58.jsonl.gz` still raises a JSONDecodeError.
> 2. The code_metadata dataset fails with: `FileNotFoundError: Couldn't find file at https://huggingface.co/datasets/liwu/MNBVC/resolve/main/code/metadata/20230302/20000000-21000000.jsonl.gz`

Both of these issues have been fixed.
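
For anyone who hits a similar JSONDecodeError, a minimal sketch of how one might locate the offending lines in a locally downloaded `.jsonl.gz` shard is shown below. It uses only the Python standard library; the function name `find_bad_jsonl_lines` and the `limit` parameter are hypothetical and are not part of the MNBVC loader or the `datasets` library.

```python
import gzip
import json


def find_bad_jsonl_lines(path, limit=10):
    """Return (line_number, error_message) pairs for lines that fail to parse.

    Hypothetical helper for illustration; assumes a UTF-8 encoded,
    gzip-compressed file with one JSON document per line.
    """
    bad = []
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # skip blank lines rather than reporting them
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                bad.append((lineno, str(exc)))
                if len(bad) >= limit:
                    break  # stop early once enough failures are collected
    return bad
```

Calling `find_bad_jsonl_lines("58.jsonl.gz")` on a downloaded copy would list the first few unparsable lines, which can help when reporting the problem upstream.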

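The data already uploaded to Hugging Face can be viewed on the dataset's home page; for the exact data volume, check the uploaded data files. The files uploaded to Hugging Face have gone through an additional round of cleaning.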

> Faster version implemented in sglang https://lmsys.org/blog/2024-02-05-compressed-fsm/

Yep, the RadixAttention proposed in this paper is also a nice feature to have if we want to constrain the decoded sequence...

I have encountered a similar problem. My backend server crashes when the request concurrency is high. I posted the scripts I used in this issue: https://github.com/triton-inference-server/tensorrtllm_backend/issues/392