Ma Xinyin

Results 58 comments of Ma Xinyin

I ran into the same problem as @xiepuzhao. Is this because the knowledge distillation step is not included in train_biencoder.py?

@leezythu In my previous experiment, I set gradient_accumulation_steps to 8. However, batch size is a particularly important hyper-parameter for this experiment, and if gradient accumulation is used, then the...
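A minimal sketch of what gradient_accumulation_steps=8 does, with a toy model and random data (all names here are illustrative; the project's actual training lives in train_biencoder.py):

```python
import torch

# Hypothetical toy setup, not the project's bi-encoder.
torch.manual_seed(0)
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()

gradient_accumulation_steps = 8  # the value used in the comment above
micro_batches = [(torch.randn(4, 16), torch.randint(0, 2, (4,)))
                 for _ in range(16)]

num_updates = 0
optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches, start=1):
    # Scale the loss so the accumulated gradient averages over micro-batches.
    loss = loss_fn(model(x), y) / gradient_accumulation_steps
    loss.backward()  # gradients sum into .grad across micro-batches
    if step % gradient_accumulation_steps == 0:
        optimizer.step()  # one update per 8 micro-batches (effective batch 32)
        optimizer.zero_grad()
        num_updates += 1
```

Note that for bi-encoder training with in-batch negatives, accumulation only enlarges the averaged gradient, not the pool of negatives each example sees (negatives still come from the small micro-batch), so it is not a drop-in substitute for a genuinely larger batch.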

Hi Sebastian, Thanks for trying our project for pruning LLaMA. After pruning a model, it is imperative to perform post-training before using it for any further application. This is because...
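A minimal sketch of why post-training matters, using PyTorch's built-in unstructured magnitude pruning on a toy MLP (illustrative only; this is not LLM-Pruner's structured method, and LLM-Pruner operates on LLaMA, not this model):

```python
import torch
import torch.nn.utils.prune as prune

# Hypothetical toy model standing in for a real network.
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(8, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2)
)

# Zero out 50% of the first layer's weights by L1 magnitude.
prune.l1_unstructured(model[0], name="weight", amount=0.5)

sparsity = (model[0].weight == 0).float().mean().item()
# Immediately after pruning, the surviving weights are unchanged while half
# the connections are gone, so the model's function is damaged; a recovery
# (post-training) fine-tune of the remaining weights is needed to restore
# accuracy before any downstream use.
```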

Hi Sebastian, Thanks for your advice! We have modified the README to make it clear. As for the plan, I have so many deadlines in the coming weeks. So it...

Hi all, We found a major bug in our pruning code and are working to identify the cause and a fix. The repo will...

We have updated the code in https://github.com/horseee/LLM-Pruner. Please refer to the new repo.

Hi. It's on our waitlist, but it requires a large amount of time and resources to conduct post-training on the pruned model. Otherwise, the pruned model functions poorly. We...

We have updated the evaluation results in https://github.com/horseee/LLM-Pruner. Please refer to the new repo.

Paper name/title: DeepCache: Accelerating Diffusion Models for Free Paper link: https://arxiv.org/abs/2312.00858 Code link: https://github.com/horseee/DeepCache

Hi. The new code will be released in around one week 🤠