GithubX-F

5 comments of GithubX-F

Same question. Are there any examples provided?

> A quick fix is to load the tokenizer like this:
>
> ```python
> from transformers import RobertaTokenizerFast
>
> tokenizer = RobertaTokenizerFast.from_pretrained("answerdotai/ModernBERT-large", add_prefix_space=True)
> ```
>
> With my Flair benchmark, I'm now able to get 96.24% on the development set and 92.01% on the test set. I'm currently doing more runs with different hyperparameters :)

Compared to the original...
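For context, `add_prefix_space=True` is what lets the fast tokenizer accept pre-split words, which token-level tasks like the Flair benchmark above rely on. A minimal sketch (the example words are made up):

```python
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained(
    "answerdotai/ModernBERT-large", add_prefix_space=True
)

# Pre-tokenized input, as used in token classification (e.g. NER with Flair).
# Without add_prefix_space=True, fast Roberta-style tokenizers reject
# is_split_into_words=True.
encoding = tokenizer(["ModernBERT", "is", "fast"], is_split_into_words=True)
print(encoding.tokens())
```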

I ran into a similar issue: my loss was always 0, but when I swapped in the BERT model, the loss decreased normally.
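A hypothetical minimal check for this symptom (the model names are assumptions, not from the original report): compute the MLM loss for both models on the same batch; a constant 0.0 for one of them points at a labels or dtype problem rather than the data.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

for name in ["answerdotai/ModernBERT-large", "bert-base-uncased"]:
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForMaskedLM.from_pretrained(name)
    inputs = tok(f"Paris is the {tok.mask_token} of France.", return_tensors="pt")
    labels = inputs["input_ids"].clone()  # smoke test: score every position
    with torch.no_grad():
        loss = model(**inputs, labels=labels).loss
    print(name, loss.item())  # both should be well above 0
```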

You can try FlashAttention2, for example:

```python
import torch
from transformers import AutoModelForMaskedLM
from transformers.utils import is_torch_bf16_gpu_available

model = AutoModelForMaskedLM.from_pretrained(
    MODEL_PATH,  # path or hub id of your model
    trust_remote_code=True,
    ignore_mismatched_sizes=True,
    torch_dtype=torch.bfloat16 if is_torch_bf16_gpu_available() else torch.float16,
    attn_implementation=ATTN_IMPLEMENTATION,  # e.g. "flash_attention_2"
).to("cuda" if torch.cuda.is_available() else "cpu")
```
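Note that `attn_implementation="flash_attention_2"` requires the `flash-attn` package to be installed; a hedged fallback sketch for setting the variable used above:

```python
try:
    import flash_attn  # noqa: F401  # FlashAttention2 kernel package
    ATTN_IMPLEMENTATION = "flash_attention_2"
except ImportError:
    ATTN_IMPLEMENTATION = "sdpa"  # PyTorch's built-in scaled dot-product attention
```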

> ```
> pip install modelscope==1.29.0
> ```
>
> I rolled modelscope back to version 1.29.0 and now it works. Hope this helps.

This works. Good luck!
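To confirm which version is actually installed after the downgrade, a quick check using only the standard library:

```python
from importlib.metadata import version

print(version("modelscope"))  # should print 1.29.0 after the downgrade
```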