deadlykitten4
Hi, do you understand what train/8 means? I am also confused about this.
I used `pip install vllm` to install it, and I found that if I run the script inside the vllm directory, the error doesn't show up. I don't know why...
Hi, I am still confused about the throughput evaluation, because the result I got (as you can see in the picture) is quite different from the result in...
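For what it's worth, this is roughly how I time raw generation throughput (tokens per second). The checkpoint, batch size, and generation length below are placeholders, not the setup from the paper, so absolute numbers won't be comparable:

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint -- swap in the compressed / original model being compared.
model_id = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).cuda().eval()

prompts = ["Tell me a short story about a robot."] * 8  # toy batch, only for timing
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")

torch.cuda.synchronize()
start = time.time()
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
torch.cuda.synchronize()
elapsed = time.time() - start

# Rough count: only the newly generated positions, prompt tokens excluded.
new_tokens = (out.shape[1] - inputs["input_ids"].shape[1]) * out.shape[0]
print(f"throughput: {new_tokens / elapsed:.1f} tokens/s")
```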
@ldengjie Nope, I still can't figure out what the problem is. Most of my results show no speedup compared to the original model. And...
Hi @dellixx, may I ask how you ran this code on LLaMA3? I upgraded the transformers version and modified the SVD_LlamaAttention class, but I obtained an extremely bad...
Did you use DeepSpeed? I don't see it in the script you provided. Try adding DeepSpeed and see.
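In case it helps, a minimal sketch of what wiring DeepSpeed in could look like (the config values are placeholders and the toy `nn.Linear` stands in for the real model; launch with the `deepspeed` launcher rather than plain `python`):

```python
import torch
import torch.nn as nn
import deepspeed

# Toy model just to illustrate the wiring; replace with the real model.
model = nn.Linear(512, 512)

# Placeholder config -- tune micro-batch size, ZeRO stage, and LR for your setup.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
}

# deepspeed.initialize returns an engine that handles ZeRO sharding,
# gradient accumulation, and mixed precision.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

x = torch.randn(4, 512, device=engine.device, dtype=torch.half)
loss = engine(x).float().pow(2).mean()
engine.backward(loss)   # use the engine's backward, not loss.backward()
engine.step()
```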
Very useful solution!! Thanks @pennyLuo-hub
I want to evaluate the perplexity (PPL) of LLaDA, but the results from my implementation are extremely bad. I think there may be some issues I haven't figured out....
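For reference, the standard sliding-window perplexity loop for an autoregressive LM looks roughly like the sketch below (gpt2 and WikiText-2 are only stand-ins). Since LLaDA is a masked diffusion model, its likelihood probably has to be estimated differently, which might be where my numbers go wrong:

```python
import math
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

# gpt2 is only a stand-in to illustrate the loop; swap in the model under test.
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).cuda().eval()

# WikiText-2 (raw) test split is the usual PPL benchmark in these repos.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids.cuda()

window = 1024  # model context length
nll_sum, token_count = 0.0, 0
for start in range(0, ids.size(1) - 1, window):
    chunk = ids[:, start : start + window + 1]
    with torch.no_grad():
        logits = model(chunk[:, :-1]).logits
        # Shifted cross-entropy: token t+1 is predicted from tokens <= t.
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            chunk[:, 1:].reshape(-1),
            reduction="sum",
        )
    nll_sum += loss.item()
    token_count += chunk.size(1) - 1

print(f"perplexity: {math.exp(nll_sum / token_count):.2f}")
```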
Okay, thanks for your help!