FuseAI
FuseLLM & FuseChat Project
exec readme bash Pairwise Knowledge Fusion: `FuseLLM/FuseChat/train/trainer.py", line 121, in compute_loss: if self.args.distill_loss_type == "ce": loss_lm = cross_entropy(input=outputs["logits"].view(-1, vocab_size), target=target_dist.view(-1, vocab_size), reduction="none").view(batch_size, -1)  # (bs, seq_len)` — RuntimeError: shape '[-1, 151936]'...
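The failing line computes a token-level distillation loss: cross-entropy between the student's logits and a fused soft target distribution, flattened over the vocabulary. A `RuntimeError` on the `.view(-1, 151936)` call typically means the tensor's last dimension does not match the expected vocab size (e.g., the source and target models use different tokenizers/vocabularies). The sketch below is illustrative only — the shapes, variable names, and vocab size (151936) are assumptions, not the project's actual configuration:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes; 151936 is the vocab size from the error message.
# If logits.shape[-1] != vocab_size, the .view() below raises the
# RuntimeError reported in the issue.
batch_size, seq_len, vocab_size = 2, 8, 151936

logits = torch.randn(batch_size, seq_len, vocab_size)       # student logits
target_dist = torch.softmax(
    torch.randn(batch_size, seq_len, vocab_size), dim=-1)   # fused soft targets

# F.cross_entropy accepts class probabilities as the target (PyTorch >= 1.10).
loss_lm = F.cross_entropy(
    input=logits.view(-1, vocab_size),
    target=target_dist.view(-1, vocab_size),
    reduction="none",
).view(batch_size, -1)  # per-token loss, shape (batch_size, seq_len)

print(loss_lm.shape)  # torch.Size([2, 8])
```

A quick sanity check when hitting this error is to print `outputs["logits"].shape[-1]` and `target_dist.shape[-1]` and confirm both equal `vocab_size`.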
Hello, thanks for helping me out over the last few weeks. I have successfully blended the OrionStarAI model with beomi/OPEN-SOLAR-KO-10.7B and beomi/YI-KO-7B. The blended model has shown some progress over...
Within https://github.com/fanqiwan/FuseLLM/blob/main/assets/fig_4.png, it would be nice to see a comparison of LLaMA-2, LLaMA-2 CLM, and FuseLLM against a merge using just LLaMA-2 and LLaMA-2 CLM as well.
# Description I am currently attempting to reproduce the results of your excellent work, FuseLLM, following the documentation ([https://github.com/18907305772/FuseAI/blob/main/FuseLLM/README.md](https://github.com/18907305772/FuseAI/blob/main/FuseLLM/README.md)). During this process, I am encountering an Out of Memory (OOM)...
Hello, we encountered some issues while reproducing the test results in the paper. On AlpacaEval 2.0, we noticed that your GitHub page states that you followed the default settings...
Could you give more details about the ensemble baselines (i.e., Top1-LLM-Blender & Top1-PPL 162B)? Which large language models did you choose to compose the ensemble?
Hello, I'd like to ask: when evaluating on MT-Bench, were the reference answers obtained with `gen_api_answer.py --model gpt-4-0125-preview`? That generates 80 reference answers; should answers 100~130 among them then be replaced with the 30 corrected ones from the official commit https://github.com/lm-sys/FastChat/pull/3158? To summarize: the judge model is gpt-4-0125-preview, but how were the reference answers for the 80 questions obtained? And are the judge results reproducible, or do they fluctuate?