Billy Cao comments

Results: 302 comments of Billy Cao

Streaming training hangs at the first step

Your batch size is too large; this has nothing to do with whether streaming is used. My PR fixes the hang at the first step.

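Streaming training hangs at the first step

A single one of your GPUs simply does not have enough memory for a context length that large. Reducing it, and accepting that some samples will exceed the maximum token count, is a common compromise. Otherwise, try QLoRA, but it will be slower.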

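Streaming training hangs at the first step

> When fetching one batch, far more data than one batch's worth gets processed.

What are your buffer_size and preprocessing_batch_size set to? Try setting buffer_size to the global batch size and preprocessing_batch_size to 1. Also add TOKENIZERS_PARALLELISM=0.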
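A minimal sketch of the suggestion above, assuming the usual definition of global batch size (per-device batch size × gradient accumulation steps × number of devices). The helper names and the example numbers are hypothetical. Only `buffer_size`, `preprocessing_batch_size`, and `TOKENIZERS_PARALLELISM` come from the comment, and how the resulting values are passed to the trainer depends on the training framework in use.

```python
import os

# Disable tokenizer-level parallelism, as suggested in the comment.
# Set this before any tokenizer is created.
os.environ["TOKENIZERS_PARALLELISM"] = "0"


def global_batch_size(per_device_batch_size: int,
                      gradient_accumulation_steps: int,
                      num_devices: int) -> int:
    """Number of samples consumed per optimizer step (assumed definition)."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def streaming_overrides(per_device_batch_size: int,
                        gradient_accumulation_steps: int,
                        num_devices: int) -> dict:
    """Hypothetical helper: build the two settings named in the comment."""
    return {
        # Buffer only one global batch worth of streamed samples.
        "buffer_size": global_batch_size(
            per_device_batch_size, gradient_accumulation_steps, num_devices
        ),
        # Preprocess one sample at a time instead of a large chunk.
        "preprocessing_batch_size": 1,
    }


if __name__ == "__main__":
    # Example values are illustrative only.
    print(streaming_overrides(2, 8, 4))
```

With these settings, the data pipeline should need to tokenize roughly one global batch before the first step can start, rather than a much larger buffer.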

Related to https://github.com/huggingface/transformers/pull/31342, but I don't quite get your changes. What exactly do they fix? When I tested all the pipelines in fp16 none of them had issues outputting...

Dataset availability

This isn't the dataset, but the weights.

wish to support lx-music

There is no plan for such integration now.

easyocr is useless in demo, why use it?

I agree that it is not the best OCR solution, and also often doesn't get the text right for me. I recommend PaddleOCR https://github.com/PaddlePaddle/PaddleOCR/blob/main/README_en.md EDIT: add some comparison using my...