萧停云 issues

Results: 5 issues of 萧停云

paddlehub fails to load u2_conformer_wenetspeech

### Please ask your question ![image](https://user-images.githubusercontent.com/51204375/172295697-5e5b86a0-25b5-4b30-8fc8-0a7a2bbd456a.png)

status/new-issue

type/question

Data processing hangs on the last batch

`if not data_args.streaming: lm_datasets = tokenized_datasets.map(group_texts, batched=True, batch_size=group_batch_size, num_proc=data_args.preprocessing_num_workers, load_from_cache_file=not data_args.overwrite_cache, desc=f"Grouping texts in chunks of {block_size}")`

The `group_texts` method in funetuner.py hangs while processing the last batch; the progress bar stays stuck at a little over 90%. ![image](https://user-images.githubusercontent.com/51204375/230752796-3d2993a0-fed9-47fc-a1d4-3ee3ff2622bd.png)
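For readers unfamiliar with the `group_texts` step in the call above, here is a minimal sketch of what such a function commonly does. It assumes the concatenate-and-chunk pattern seen in typical causal language-model preprocessing. The actual `group_texts` in funetuner.py is not shown in the issue and may differ, and the `block_size` argument and the toy batch are illustrative only.

```python
from itertools import chain


def group_texts(examples, block_size=4):
    """Concatenate every tokenized field in a batch, then cut it into fixed-size blocks.

    `examples` maps a column name to a list of token lists, which is the shape
    a batched map call passes to its function. (Assumed layout, not taken from
    funetuner.py.)
    """
    # Flatten each column into one long token list.
    concatenated = {k: list(chain.from_iterable(examples[k])) for k in examples}
    total_length = len(concatenated[next(iter(examples))])
    # Drop the trailing remainder so every block has exactly block_size tokens.
    total_length = (total_length // block_size) * block_size
    result = {
        k: [tokens[i:i + block_size] for i in range(0, total_length, block_size)]
        for k, tokens in concatenated.items()
    }
    # For causal language modeling the labels are a copy of the inputs.
    result["labels"] = [block.copy() for block in result["input_ids"]]
    return result


# Toy batch: 10 tokens in total, so two blocks of 4 and a dropped remainder of 2.
batch = {"input_ids": [[1, 2, 3], [4, 5, 6, 7, 8], [9, 10]]}
grouped = group_texts(batch, block_size=4)
```

In this sketch `block_size` is an explicit parameter. With the `datasets` library it could be supplied through the `fn_kwargs` argument of `map`, whereas the snippet in the issue appears to read it from the enclosing scope.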

萧停云

paddlehub fails to load u2_conformer_wenetspeech

Data processing hangs on the last batch

RuntimeError: Error building extension 'lightseq_layers_new'

Question about the dataset format for IDEA-CCNL/Wenzhong2.0-GPT2-3.5B-chinese

Which method of constructing the index