Hao Zhang

Results: 174 comments by Hao Zhang

Why don't you use FastChat to serve? You can apply the delta to get the Vicuna weights, fine-tune on top of the Vicuna weights, and then serve the result. Meanwhile, we do...
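As a rough illustration of what "applying the delta" means: the released delta checkpoint is added elementwise to the base model's parameters to recover the target weights. The sketch below is hypothetical pure Python with scalar stand-ins for tensors; the function name `apply_delta` and the toy parameter names are illustrative, not FastChat's actual implementation.

```python
# Hypothetical sketch of delta-weight application: target = base + delta,
# applied parameter-by-parameter. Scalars stand in for real weight tensors.

def apply_delta(base_weights, delta_weights):
    """Recover target weights by elementwise addition of the delta."""
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta checkpoints must share parameter names")
    return {name: base_weights[name] + delta_weights[name] for name in base_weights}

# Toy example with scalar "parameters" (illustrative values only):
base = {"layer.weight": 0.5, "layer.bias": -0.5}
delta = {"layer.weight": 0.25, "layer.bias": 0.25}
vicuna = apply_delta(base, delta)
```

In the real workflow the same idea runs over full model state dicts rather than scalars.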

There are too few pieces of information about your data and your detailed setup. We're unable to offer any help given this information. Closing. If you still face the problem,...

Closing as this issue is no longer relevant with FastChat development.

@friendmine - In terms of evaluation -- you can run Vicuna on the HumanEval benchmark. Since we do not specialize it for coding, I believe there exist many better alternative...

@zhangsanfeng86 why would you need to use the KV cache for fine-tuning? It is only useful for decoding.
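To make the point concrete: a minimal, hypothetical sketch of why a KV cache pays off only in autoregressive decoding. Decoding runs one forward step per generated token, so the keys/values of earlier tokens can be reused; fine-tuning processes the whole sequence in a single forward pass, leaving nothing to reuse. The `project` function below is a toy stand-in for a real key/value projection.

```python
# Hypothetical sketch: count key/value projections with and without a cache.

def project(token):
    # Stand-in for a real attention layer's key/value projections.
    return token * 2, token * 3  # (key, value)

def decode_with_cache(tokens):
    cache = []       # past (key, value) pairs, grown one step at a time
    projections = 0  # number of k/v projections computed
    for t in tokens:
        cache.append(project(t))
        projections += 1  # only the newest token is projected
    return projections

def decode_without_cache(tokens):
    projections = 0
    # Without a cache, step i must recompute keys/values for all i+1 tokens.
    for i in range(len(tokens)):
        for t in tokens[: i + 1]:
            project(t)
            projections += 1
    return projections
```

For a 5-token generation the cached path does 5 projections versus 15 without the cache, and the gap grows quadratically with length; in training, the single full-sequence pass already computes each projection exactly once.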

Seems like the user's local setup issue. Closing.

@ganler @weiddeng It is a tokenizer version issue. https://github.com/lm-sys/FastChat/issues/199#issuecomment-1537618299 Please refer to the above issue for the solution. Let us know if it is solved. Feel free to re-open.

This is because we upgraded the model weights. Please follow what @merrymercy suggested to upgrade both the model weights and your FastChat version.

That's a good suggestion, but it is hard to increase the limit because of the significant increase in memory and compute it would require. We'll try to investigate, though.