mergoo
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Hi, thanks for the library! When we try to compose LoRA experts that have k_proj, up_proj, down_proj in the target_modules, we face a shape mismatch error. Everything works fine when...
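The report above does not state the cause of the mismatch. As a hedged aid for diagnosing it, the sketch below lists the LoRA factor shapes one would expect per target module, using the usual convention that `lora_A` has shape `(rank, in_features)` and `lora_B` has shape `(out_features, rank)`. The model dimensions (`hidden_size`, `intermediate_size`, head counts) and the helper `lora_shapes` are illustrative assumptions, not values or functions taken from mergoo.

```python
# Hypothetical helper: expected LoRA weight shapes for a linear layer.
def lora_shapes(in_features, out_features, rank):
    return {
        "lora_A": (rank, in_features),   # down-projection into the rank-r space
        "lora_B": (out_features, rank),  # up-projection back to the layer output
    }

# Assumed, illustrative dimensions for a grouped-query-attention model.
hidden_size = 4096
intermediate_size = 14336
num_heads = 32
num_kv_heads = 8
head_dim = hidden_size // num_heads
rank = 8

# (in_features, out_features) of each base linear layer.
module_dims = {
    "q_proj": (hidden_size, num_heads * head_dim),
    "k_proj": (hidden_size, num_kv_heads * head_dim),
    "up_proj": (hidden_size, intermediate_size),
    "down_proj": (intermediate_size, hidden_size),
}

shapes = {
    name: lora_shapes(n_in, n_out, rank)
    for name, (n_in, n_out) in module_dims.items()
}

for name, s in shapes.items():
    print(name, s)
```

Under these assumptions only `q_proj` is square in `hidden_size`; `k_proj`, `up_proj`, and `down_proj` each have a different input or output width, so code that assumes one shape for all adapters would fail on exactly these modules.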
Before LLaMA appeared, BigScience released mT0 and T0, forerunners of today's LLMs, and at the end of 2022 some repos took T5-large, T5-3b, T5-11b...
Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.2 to 4.66.3. Release notes (sourced from tqdm's releases): tqdm v4.66.3 stable; cli: eval safety (fixes CVE-2024-34062, GHSA-g7vv-2v7x-gj9p). Commits: 4e613f8 Merge pull request from GHSA-g7vv-2v7x-gj9p; b53348c cli:...
Hello, thanks for providing this amazing tool. Could mergoo support QWEN models?
From a design perspective, [here](https://github.com/Leeroo-AI/mergoo/blob/main/mergoo/models/modeling_llama.py#L242), should we consider adding the original `x` to the hidden states of `down_proj`?
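To make the question concrete, here is a minimal pure-Python sketch of a gated MLP with an optional residual term. It is not mergoo's code: the function names, the `add_residual` flag, and the toy weights are all assumptions for illustration.

```python
import math

def silu(v):
    # SiLU activation: v * sigmoid(v)
    return v / (1.0 + math.exp(-v))

def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gated_mlp(x, W_gate, W_up, W_down, add_residual=False):
    # down_proj(act(gate_proj(x)) * up_proj(x)), optionally plus x.
    gate = [silu(g) for g in matvec(W_gate, x)]
    up = matvec(W_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    out = matvec(W_down, hidden)
    if add_residual:
        out = [o + xi for o, xi in zip(out, x)]
    return out

# Toy weights: hidden size 2, intermediate size 3 (illustrative only).
x = [1.0, -2.0]
W_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_down = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

plain = gated_mlp(x, W_gate, W_up, W_down)
with_residual = gated_mlp(x, W_gate, W_up, W_down, add_residual=True)
```

One design consideration: in the Hugging Face Llama decoder layer, the residual is added around the MLP by the decoder layer itself. If the linked file follows the same layout, adding `x` inside the MLP as well would count the residual twice.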
Hi there, thanks for mergoo, an amazing code base for MoE model construction. A crucial feature that may need to be implemented is that mergoo should let the user select the...
Attempted: https://github.com/Leeroo-AI/mergoo/blob/main/notebooks/integrate_phi3_experts.ipynb

When loading the model with:

    import torch
    from mergoo.models.modeling_phi3 import Phi3ForCausalLM

    model_id = 'data/checkpoint_demo'
    # Define the device (use cuda:0 or another GPU if necessary)
    device = torch.device('cuda:0'...