mergoo

A library for easily merging multiple LLM experts and efficiently training the merged LLM.

Results 7 mergoo issues

Hi, thanks for the library! When we try to compose LoRA experts that have k_proj, up_proj, down_proj in the target_modules, we face a shape mismatch error. Everything works fine when...

bug
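
For context on the issue above: LoRA adapter matrices take their shapes from the input and output sizes of the module they wrap, so adapters for `k_proj`, `up_proj`, and `down_proj` do not share a shape. The sketch below only illustrates that general fact. The dimensions, the `MODULE_DIMS` table, and the `lora_shapes` helper are hypothetical and are not taken from the issue or from mergoo, and the sketch does not claim to identify the cause of the reported error.

```python
# Hypothetical LLaMA-style dimensions, chosen for illustration only.
HIDDEN = 4096
INTERMEDIATE = 11008
NUM_KV_HEADS = 8
HEAD_DIM = 128
RANK = 16

# (in_features, out_features) of each linear projection.
MODULE_DIMS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_shapes(module, rank=RANK):
    """Return the weight shapes of the two LoRA matrices for a module.

    Uses the common convention: A is (rank, in_features) and
    B is (out_features, rank), so B @ A matches the base weight shape.
    """
    in_features, out_features = MODULE_DIMS[module]
    return {"lora_A": (rank, in_features), "lora_B": (out_features, rank)}


if __name__ == "__main__":
    for name in ("k_proj", "up_proj", "down_proj"):
        print(name, lora_shapes(name))
```

Under these assumed sizes, code that allocates expert weights using the shape of one projection would not fit the others, which is one general way a shape mismatch can arise.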

Before LLaMA came into view, BigScience released mT0 and T0, which are forerunners of LLMs, and at the end of 2022, some repos took T5-large, T5-3b, and T5-11b...

enhancement

Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.2 to 4.66.3. Release notes (sourced from tqdm's releases): tqdm v4.66.3 stable. cli: eval safety (fixes CVE-2024-34062, GHSA-g7vv-2v7x-gj9p). Commits: 4e613f8 Merge pull request from GHSA-g7vv-2v7x-gj9p; b53348c cli:...

dependencies

Hello, thanks for providing this amazing tool. Could mergoo support QWEN models?

From a design perspective, should we consider adding the original `x` to the hidden states of `down_proj` [here](https://github.com/Leeroo-AI/mergoo/blob/main/mergoo/models/modeling_llama.py#L242)?

enhancement

Hi there, thanks for mergoo, an amazing code base for MoE model construction. A crucial feature that may need to be implemented is that mergoo should let the user select the...

Attempted: https://github.com/Leeroo-AI/mergoo/blob/main/notebooks/integrate_phi3_experts.ipynb. When loading the model with: `import torch`, `from mergoo.models.modeling_phi3 import Phi3ForCausalLM`, `model_id = 'data/checkpoint_demo'`, `# Define the device (use cuda:0 or another GPU if necessary)`, `device = torch.device('cuda:0'`...