Hao Zhang comments

Results: 174 comments of Hao Zhang

I am very eager to know: what's the meaning of L^2 error? Thanks for you

Are you saying L2 error? Hao
On Sun, May 21, 2017 at 4:31 AM, ningning32 wrote:
> How to calculate L^2 error?

Deepspeed support and config file?

It seems the training speed with Deepspeed isn't great. We'll add some better model-parallel training support soon. Closing this ticket.

According to data_cleaning.md, how do I first get sharegpt_20230322_html.json?

Unfortunately, we're unable to help with this issue. @eeric, maybe try searching on Hugging Face?

Missing import from train/train.py for LORA training

Is the issue resolved?

How long does it take for the llama team to respond to the weight request form?

It can take from 1-2 days to 2 months, based on my experience.

Awesome job, guys - got it to work in Docker; it's fast on my 3060 and even faster on my 3080

You can write a simple throughput calculator here https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/inference.py#L98 and estimate the throughput (e.g., words/s) right? Contributions are welcome.
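A minimal sketch of what such a throughput calculator might look like, assuming generation can be wrapped in a plain function that takes a prompt and returns text. The names `measure_throughput` and `fake_generate` are hypothetical and are not part of FastChat; the stand-in generator only simulates latency so the example runs on its own.

```python
import time
from typing import Callable, Tuple


def measure_throughput(generate: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Time one call to `generate` and return (output, words per second).

    `generate` is a hypothetical wrapper around whatever produces the
    model's reply; it is an assumption, not a FastChat function.
    """
    start = time.perf_counter()
    output = generate(prompt)
    elapsed = time.perf_counter() - start
    # Whitespace-separated words are a rough proxy; tokens/s would need a tokenizer.
    n_words = len(output.split())
    # Guard against a zero-length interval on very fast calls.
    words_per_s = n_words / elapsed if elapsed > 0 else float("inf")
    return output, words_per_s


def fake_generate(prompt: str) -> str:
    """Stand-in generator (hypothetical): sleeps briefly, returns 20 words."""
    time.sleep(0.05)
    return "hello " * 20


if __name__ == "__main__":
    text, wps = measure_throughput(fake_generate, "Hi there")
    print(f"{len(text.split())} words, {wps:.1f} words/s")
```

To use it with a real model, replace `fake_generate` with a function that runs the model on the prompt and returns the decoded reply. Averaging over several prompts should give a steadier estimate than a single call.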

Awesome job, guys - got it to work in Docker; it's fast on my 3060 and even faster on my 3080

It seems this issue has been resolved -- closing.

Out of GPU memory using 4xA100 40G

Closing, as the issue has been resolved.

fastchat.model.apply_delta error

Please use the Vicuna 1.1 new weight delta and new apply_delta script, which shouldn't have any issue. Feel free to re-open if you find any issue!

Any evaluation comparing 7B and 13B?

Yes, 7B is worse than 13B.