FlyCarrot comments

Results: 13 comments of FlyCarrot

Does it support LoRA and pipeline parallelism now?

> We tested up to OPT-175B and the current framework works well. We will discuss internally about the support of PP and TP :)

Hi, can you give more info...

Character encodings that cannot be displayed properly

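Piggybacking on this thread to ask an encoding-related question. I have been going through some of the content recently and found that some documents, although they are in .txt format, fail to decode when read directly with open, whether the encoding is utf-8 or gb2312. Has unifying the format been considered? Maybe I am opening them the wrong way; any pointers would be appreciated.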
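A minimal sketch of one way to probe such files, assuming the failures come from characters outside the gb2312 repertoire. It reads the raw bytes and tries a list of candidate encodings in order; `gb18030` is included because it covers a larger character set than gb2312. The helper name `decode_with_fallback` and the candidate list are illustrative assumptions, not something taken from the project.

```python
from typing import Iterable, Tuple


def decode_with_fallback(
    data: bytes,
    encodings: Iterable[str] = ("utf-8", "gb18030"),
) -> Tuple[str, str]:
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            # This candidate does not match the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Example: bytes that are not valid UTF-8 but are valid GB18030.
raw = "编码测试".encode("gb18030")
text, used = decode_with_fallback(raw)
print(used, text)
```

For a real file, read it with `open(path, "rb")` and pass the bytes to the helper. If every candidate fails, the file may be mixed-encoding or not text at all.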

> Memory is 1.3T, and the trainer looks roughly like this. Does the memory peak usually occur during merge?
>
> ```python
> trainer = Trainer(order=6, max_vocab_size=80000, min_count=32, isolate_digits=True)
> trainer.train(corpus_instance, workers=128, batch_size=2000)
> ```
>
> Edit: I observed that after it hangs, under this configuration, used mem is roughly...

Does XTuner support the Qwen3 series?

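If you installed from source and your original environment already runs, you can upgrade it yourself: first upgrade transformers, then upgrade torch, and finally reinstall flash-attn. After switching the model path, it runs. The remaining differences are small. If you hit an error you cannot resolve, add a traceback to the try/except block in `xtuner/entry_point.py` so that the full error stack is printed; that shows where things do not match and makes the problem easy to locate. The original print(e) is too terse to reveal the real error message.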
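A minimal sketch of the traceback suggestion above. The function `load_model` and the path are hypothetical stand-ins; this is not the actual content of `xtuner/entry_point.py`, only an illustration of how `print(e)` differs from printing the full stack with the standard `traceback` module.

```python
import traceback


def load_model(path: str) -> None:
    # Hypothetical stand-in for whatever the real entry point calls.
    raise KeyError(f"unknown model type in {path}")


def main() -> str:
    try:
        load_model("/path/to/model")
    except Exception as e:
        # Terse: prints only the exception message, with no file or line info.
        print(e)
        # Verbose: the full stack, including the frame that raised.
        details = traceback.format_exc()
        print(details)
        return details
    return ""


if __name__ == "__main__":
    main()
```

Inside an `except` block, `traceback.print_exc()` is a shorter alternative when the text does not need to be captured in a variable.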

[Bug]: When Qwen2.5-72B-instruct is deployed with vllm, all Chinese characters in the function-calling output are escaped

Hi, I am also following the function-calling feature and would be glad to compare notes. In my scenario, the output content is converted to Unicode escape sequences, and content contains a lot of useless information. The model is Qwen2.5-72B-Instruct, and the template is the one provided here: https://github.com/QwenLM/Qwen2.5/issues/918#issuecomment-2360141148

For example (no English system prompt was added here; once one is added, the information in the function output becomes English, e.g. Beijing):

```json
{ "role": "user", "content": "北京今天的天气" },
{
  "content": "为了给您提供北京今天的天气信息，我需要先查询当前的温度。请问您需要摄氏度还是华氏度作为温度单位呢？\n如果没特别需求，默认我将使用摄氏度。\n如果您还需要其他天气信息（如湿度、风速等），也请告诉我。不过目前我只能提供温度信息，其他信息可能需要后续支持。\n现在，我将先查询北京今天的温度。\n",
  "refusal": null,
  "role": "assistant",
  "audio": null,
  "function_call": null,
  "tool_calls": [
    {
      "id": "chatcmpl-tool-aefa77992b934f478d2fed7ca6b192d0",...
```
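As background, here is a minimal sketch of one common way Chinese text ends up as `\uXXXX` escapes: Python's `json.dumps` escapes non-ASCII characters by default. This does not diagnose what vllm does internally, which is an open question in this thread; it only shows the mechanism and that the escaped form decodes back to the same string. The `message` dict is made up for illustration.

```python
import json

message = {"location": "北京"}

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(message)

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(message, ensure_ascii=False)

print(escaped)
print(readable)

# The escaped form is still valid JSON and parses back to the original text.
restored = json.loads(escaped)
print(restored["location"])
```

If the escaped text arrives inside a tool-call arguments string, parsing that string with `json.loads` on the client side should recover the original characters, assuming it is valid JSON.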

[Bug] MBPP evaluator cannot extract the correct answer

> Got it, I think the prompt is designed for base model and we may need to upgrade the prompt compatible with instruct model.

Hello, has this bug been fixed?

[Bug]: Can't use yarn rope config for long context in Qwen2 model

I think this link may help: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/discussions/15#670a76a4e3d216b4243b2d4d In short, my guess is that vllm needs original_max_position_embeddings, but transformers does not allow this key. So, does anybody know how to reconcile the two?...
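One possible way to live with the mismatch, sketched under assumptions: keep a single base config and derive a separate copy for each consumer, with or without the `original_max_position_embeddings` key. The helper name `rope_config_for` and the values `"yarn"`, `4.0`, and `32768` are illustrative only. Whether either library accepts the resulting config depends on the installed versions and is not confirmed by this thread.

```python
import copy
from typing import Any, Dict

KEY = "original_max_position_embeddings"


def rope_config_for(
    base: Dict[str, Any],
    keep_original_max: bool,
    original_max: int,
) -> Dict[str, Any]:
    """Return a copy of base with KEY added to or removed from rope_scaling."""
    cfg = copy.deepcopy(base)  # never mutate the shared base config
    rope = cfg.setdefault("rope_scaling", {})
    if keep_original_max:
        rope[KEY] = original_max
    else:
        rope.pop(KEY, None)
    return cfg


# Illustrative values, not taken from any official config.
base = {"rope_scaling": {"type": "yarn", "factor": 4.0}}
with_key = rope_config_for(base, keep_original_max=True, original_max=32768)
without_key = rope_config_for(base, keep_original_max=False, original_max=32768)
print(with_key)
print(without_key)
```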

[Bug]: Can't use yarn rope config for long context in Qwen2 model

> > I think this link may help: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct/discussions/15#670a76a4e3d216b4243b2d4d In short, my guess is that vllm needs original_max_position_embeddings, but transformers does not allow this key. So, does anybody know how to...

Error when fine-tuning InternLM2.5 with xtuner

mmengine does not support PyTorch 2.5.x; it is probably a naming conflict. Patching the mmengine source or downgrading PyTorch should fix it.

When seq_parallel_world_size is set to a value greater than 1, should use_varlen_attn not be set to true?

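I have seen this bug too. SP requires the input token length to be evenly divisible (by the sequence-parallel world size), but the alignment does not actually succeed, which ends up corrupting the training labels. In xtuner/xtuner/dataset/utils.py you can set a parameter, drop_last, so that the trailing content is dropped.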
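A minimal sketch of the idea behind dropping the trailing content, assuming the requirement is that the sequence length be a multiple of the sequence-parallel world size. This is not xtuner's `drop_last` implementation; the function name `drop_remainder` and its signature are hypothetical.

```python
from typing import List, Tuple


def drop_remainder(
    input_ids: List[int],
    labels: List[int],
    sp_world_size: int,
) -> Tuple[List[int], List[int]]:
    """Trim both sequences so their length is a multiple of sp_world_size."""
    if sp_world_size < 1:
        raise ValueError("sp_world_size must be at least 1")
    if len(input_ids) != len(labels):
        raise ValueError("input_ids and labels must have the same length")
    # Largest multiple of sp_world_size that fits in the sequence.
    usable = len(input_ids) - len(input_ids) % sp_world_size
    # Slice both with the same bound so tokens and labels stay aligned.
    return input_ids[:usable], labels[:usable]


ids = list(range(10))
lbls = list(range(100, 110))
trimmed_ids, trimmed_lbls = drop_remainder(ids, lbls, sp_world_size=4)
print(len(trimmed_ids), len(trimmed_lbls))
```

Padding up to the next multiple is the alternative when the trailing tokens matter, at the cost of masking the padded label positions.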