HappyNews comments


ValueError: Some specified arguments are not used by the HfArgumentParser: ['--local-rank=0']

Have you solved your problem? I ran into the same error when using DeepSpeed. The solutions provided above didn't work at all. :(

Format conversion issue after downstream SFT

In addition, in the `setup_env.py` file I just modified the `gen_code()` method to make it do the same thing as it does when `get_model_name() == "BitNet-b1.58-2B-4T"`.

Format conversion issue after downstream SFT

> [@LiuZhihhxx](https://github.com/LiuZhihhxx) Have you solved this problem?

Not yet. It seems to be an essential step for downstream applications.

training code

I am trying SFT for my downstream task. I think `SFTTrainer` from `trl` may work.

a demo sample problem

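The vocabulary has 128k tokens, the same size as Llama 3's, so Chinese tokenization should not be a problem. However, this model has barely been trained on Chinese corpora, so you will need to fine-tune and align it yourself.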

So smart

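It was not pre-trained on Chinese, and the screenshot clearly shows Chinese and English mixed together in the output. If you do not add "reply in Chinese" to the prompt, the problem is even more pronounced.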

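You can first fetch the full tool list from the server, filter the tools with your own logic, and then pass them in manually. It is not very elegant, but I have not seen another way so far. To fetch the tools, you can refer to the following code:

```python
import asyncio
from mcp.client.sse import sse_client
from mcp.client.session import ClientSession

url = 'http://localhost:7764/sse'

async def get_tools():
    async with sse_client(url) as streams:  # replace with your MCP server address
        async with ClientSession(*streams) as...
```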
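A minimal sketch of the fetch-then-filter idea described above, assuming the MCP Python SDK's `ClientSession.initialize()` and `ClientSession.list_tools()` calls. The helper names `select_tools` and `fetch_all_tools`, and the tool names `search` and `fetch_page`, are hypothetical and only serve as illustration; how the filtered list is then passed to your agent framework depends on that framework and is not shown.

```python
import asyncio

# Address taken from the comment above; replace with your MCP server address.
MCP_SSE_URL = "http://localhost:7764/sse"


def select_tools(tools, allowed_names):
    """Keep only tools whose .name is in allowed_names, preserving server order."""
    allowed = set(allowed_names)
    return [tool for tool in tools if tool.name in allowed]


async def fetch_all_tools(url=MCP_SSE_URL):
    """Connect over SSE and return every tool the server advertises."""
    # Imported lazily so select_tools can be used without the mcp package installed.
    from mcp.client.session import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(url) as streams:
        async with ClientSession(*streams) as session:
            await session.initialize()
            result = await session.list_tools()
            return list(result.tools)


async def main():
    tools = await fetch_all_tools()
    # Hypothetical tool names; use whatever your server actually exposes.
    selected = select_tools(tools, {"search", "fetch_page"})
    print([tool.name for tool in selected])


# To run against a live server:
# asyncio.run(main())
```

Keeping the filter as a plain function separate from the network call makes the selection logic easy to swap (for example, filtering by description instead of name) without touching the connection code.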

How to register only some of the tools when connecting over SSE