FuseAI
FuseLLM & FuseChat Project
exec readme bash Pairwise Knowledge Fusion: `FuseLLM/FuseChat/train/trainer.py", line 121, in compute_loss: if self.args.distill_loss_type == "ce": loss_lm = cross_entropy(input=outputs["logits"].view(-1, vocab_size), target=target_dist.view(-1, vocab_size), reduction="none").view(batch_size, -1)  # (bs, seq_len)` — RuntimeError: shape '[-1, 151936]'...
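The failing line computes a token-level distillation loss: cross-entropy between the student's logits and a fused soft target distribution, flattened over the vocabulary. A `RuntimeError` on the `.view(-1, 151936)` call typically means the tensor's last dimension does not match the expected vocab size (e.g., the source and target models use different tokenizers/vocabularies). The sketch below is illustrative only — the shapes, variable names, and vocab size (151936) are assumptions, not the project's actual configuration:

```python
import torch
import torch.nn.functional as F

# Illustrative sizes; 151936 is the vocab size from the error message.
# If logits.shape[-1] != vocab_size, the .view() below raises the
# RuntimeError reported in the issue.
batch_size, seq_len, vocab_size = 2, 8, 151936

logits = torch.randn(batch_size, seq_len, vocab_size)       # student logits
target_dist = torch.softmax(
    torch.randn(batch_size, seq_len, vocab_size), dim=-1)   # fused soft targets

# F.cross_entropy accepts class probabilities as the target (PyTorch >= 1.10).
loss_lm = F.cross_entropy(
    input=logits.view(-1, vocab_size),
    target=target_dist.view(-1, vocab_size),
    reduction="none",
).view(batch_size, -1)  # per-token loss, shape (batch_size, seq_len)

print(loss_lm.shape)  # torch.Size([2, 8])
```

A quick sanity check when hitting this error is to print `outputs["logits"].shape[-1]` and `target_dist.shape[-1]` and confirm both equal `vocab_size`.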
Hello, thanks for helping me out over the last few weeks. I have successfully blended the OrionStarAI model with beomi/OPEN-SOLAR-KO-10.7B and beomi/YI-KO-7B. The blended model has shown some progress over...
Within https://github.com/fanqiwan/FuseLLM/blob/main/assets/fig_4.png, it would be nice to see a comparison of LLaMA-2, LLaMA-2 CLM, and FuseLLM against a merge using just LLaMA-2 and LLaMA-2 CLM as well.
# Description I am currently attempting to reproduce the results of your excellent work, FuseLLM, following the documentation ([https://github.com/18907305772/FuseAI/blob/main/FuseLLM/README.md](https://github.com/18907305772/FuseAI/blob/main/FuseLLM/README.md)). During this process, I am encountering an Out of Memory (OOM)...
Hello, we encountered some issues while reproducing the test results in the paper. On AlpacaEval 2.0, we noticed that your GitHub page states that you followed the default settings...
Could you give more details about the ensemble baselines (i.e., Top1-LLM-Blender & Top1-PPL 162B)? Which large language models did you choose to compose the ensemble?
Hello, I'd like to ask: when evaluating on MT-Bench, were the reference answers obtained with `gen_api_answer.py --model gpt-4-0125-preview`? That generates 80 reference answers; should answers 100~130 among them then be replaced with the 30 corrected ones from the official commit https://github.com/lm-sys/FastChat/pull/3158? To summarize: the judge model is gpt-4-0125-preview, but how were the reference answers for the 80 questions obtained? And are the judge results reproducible, or do they fluctuate?