Hao Zhang

Results: 174 comments by Hao Zhang

Why don't you use FastChat to serve? You can apply the delta to get the Vicuna weights, fine-tune on top of the Vicuna weights, and then serve the result. Meanwhile, we do...
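As a rough illustration of what "applying the delta" means: the released delta checkpoint is added elementwise to the base model's parameters to recover the target weights. The sketch below is hypothetical pure Python with scalar stand-ins for tensors; the function name `apply_delta` and the toy parameter names are illustrative, not FastChat's actual implementation.

```python
# Hypothetical sketch of delta-weight application: target = base + delta,
# applied parameter-by-parameter. Scalars stand in for real weight tensors.

def apply_delta(base_weights, delta_weights):
    """Recover target weights by elementwise addition of the delta."""
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta checkpoints must share parameter names")
    return {name: base_weights[name] + delta_weights[name] for name in base_weights}

# Toy example with scalar "parameters" (illustrative values only):
base = {"layer.weight": 0.5, "layer.bias": -0.5}
delta = {"layer.weight": 0.25, "layer.bias": 0.25}
vicuna = apply_delta(base, delta)
```

In the real workflow the same idea runs over full model state dicts rather than scalars.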

There are too few pieces of information about your data and your detailed setup. We're unable to offer any help given this information. Closing. If you still face the problem,...

Closing as this issue is no longer relevant with FastChat development.

@friendmine - In terms of evaluation -- you can run Vicuna on the HumanEval benchmark. Since we do not specialize it for coding, I believe there exist many better alternative...

@zhangsanfeng86 why would you need to use the KV cache for fine-tuning? It is only useful for decoding.
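To make the point concrete: a minimal, hypothetical sketch of why a KV cache pays off only in autoregressive decoding. Decoding runs one forward step per generated token, so the keys/values of earlier tokens can be reused; fine-tuning processes the whole sequence in a single forward pass, leaving nothing to reuse. The `project` function below is a toy stand-in for a real key/value projection.

```python
# Hypothetical sketch: count key/value projections with and without a cache.

def project(token):
    # Stand-in for a real attention layer's key/value projections.
    return token * 2, token * 3  # (key, value)

def decode_with_cache(tokens):
    cache = []       # past (key, value) pairs, grown one step at a time
    projections = 0  # number of k/v projections computed
    for t in tokens:
        cache.append(project(t))
        projections += 1  # only the newest token is projected
    return projections

def decode_without_cache(tokens):
    projections = 0
    # Without a cache, step i must recompute keys/values for all i+1 tokens.
    for i in range(len(tokens)):
        for t in tokens[: i + 1]:
            project(t)
            projections += 1
    return projections
```

For a 5-token generation the cached path does 5 projections versus 15 without the cache, and the gap grows quadratically with length; in training, the single full-sequence pass already computes each projection exactly once.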

Seems like the user's local setup issue. Closing.

@ganler @weiddeng It is a tokenizer version issue. https://github.com/lm-sys/FastChat/issues/199#issuecomment-1537618299 Please refer to the above issue for the solution. Let us know if it is solved. Feel free to re-open.

This is because we upgraded the model weights. Please follow what @merrymercy suggested to upgrade both the model weights and your FastChat version.

That's a good suggestion, but it is hard to increase the limit because of the significant increase in memory and compute it would require. We'll try to investigate, though.