mergekit
Tools for merging pretrained large language models.
Hi, sorry for asking so many questions, but do you know if it's possible to "unmerge" a MoE model and extract each expert as a separate model? For example, could...
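A minimal sketch of one way to approach this, assuming the standard Hugging Face parameter names for Mixtral-style MoE checkpoints (`block_sparse_moe.experts.{j}.w1/w2/w3`) and for dense Mistral MLPs (`gate_proj`/`up_proj`/`down_proj`); this is not a built-in mergekit feature. The idea is to copy one expert's MLP weights onto the dense names, keep the shared attention/norm/embedding tensors, and drop the router and the remaining experts.

```python
import re
import torch

# Mixtral's per-expert projections vs. the dense Mistral MLP names they map to
# (assumed standard Hugging Face naming).
PROJ_MAP = {"w1": "gate_proj", "w2": "down_proj", "w3": "up_proj"}

def extract_expert(state_dict: dict[str, torch.Tensor], expert: int) -> dict[str, torch.Tensor]:
    """Build a dense Mistral-style state dict containing only the chosen expert."""
    dense = {}
    pattern = re.compile(
        rf"^model\.layers\.(\d+)\.block_sparse_moe\.experts\.{expert}\.(w[123])\.weight$"
    )
    for name, tensor in state_dict.items():
        match = pattern.match(name)
        if match:
            layer, proj = match.groups()
            dense[f"model.layers.{layer}.mlp.{PROJ_MAP[proj]}.weight"] = tensor
        elif "block_sparse_moe" in name:
            continue  # router and the other experts are dropped
        else:
            dense[name] = tensor  # attention, norms, embeddings are shared
    return dense
```

To turn the result into a loadable model you would still need a Mistral config whose `intermediate_size` matches the expert MLP, and since the experts were trained jointly with the router, an extracted expert is not guaranteed to be useful on its own.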
Are there any plans to add the PESC method described in this [paper](https://arxiv.org/abs/2401.02731), which gave birth to these models: [Camelidae-8x7B](https://huggingface.co/hywu/Camelidae-8x7B), [Camelidae-8x13B](https://huggingface.co/hywu/Camelidae-8x13B), and [Camelidae-8x34B](https://huggingface.co/hywu/Camelidae-8x34B)? Check their repo [here](https://github.com/wuhy68/Parameter-Efficient-MoE/tree/master).
Whenever I make a Mistral model using two Llama-2 13B models, I get the following error message: Traceback (most recent call last): File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code,...
Hi there, a question about the process of merging different LLMs into an MoE. For mergekit-moe, if we use the 'hidden' gate method, we have to provide at least one positive...
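For context, a rough conceptual sketch of what hidden-state gate initialization is usually described as doing: embed each expert's positive prompts with the base model and use the averaged hidden states as that expert's router row. The model name and prompts below are placeholders, and this is an illustration rather than mergekit's actual implementation (which also supports negative prompts).

```python
import torch
from transformers import AutoModel, AutoTokenizer

BASE = "mistralai/Mistral-7B-v0.1"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModel.from_pretrained(BASE, torch_dtype=torch.bfloat16)
model.eval()

def prompt_embedding(prompts: list[str]) -> torch.Tensor:
    """Mean of last-layer hidden states over all tokens of all prompts."""
    vecs = []
    for text in prompts:
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        vecs.append(out.last_hidden_state.squeeze(0).mean(dim=0))
    return torch.stack(vecs).mean(dim=0)

# One list of positive prompts per expert (illustrative prompts).
positive_prompts = [
    ["Write a Python function that", "Fix the bug in this code"],   # expert 0
    ["Tell me a short story about", "Write a poem about the sea"],  # expert 1
]

# Router weight of shape (num_experts, hidden_size): one row per expert.
router_weight = torch.stack([prompt_embedding(p) for p in positive_prompts])
```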
I know that Phi uses different names for its parameters, but the architecture is still a transformer containing self-attention and an MLP. If we rename the parameters to follow the Llama/Mistral format, is...
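Mechanically, renaming is just a key-mapping pass over the state dict; a sketch is below, with a hypothetical and incomplete rule table that would need to be filled in from the actual key names of both checkpoints.

```python
import re
import torch

# Hypothetical (pattern, replacement) pairs -- illustrative only; fill these in
# after inspecting `model.state_dict().keys()` for both architectures.
RENAME_RULES = [
    (r"^transformer\.h\.(\d+)\.", r"model.layers.\1."),
    (r"\.mixer\.out_proj\.", ".self_attn.o_proj."),
]

def rename_keys(state_dict: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
    """Apply every rename rule, in order, to each checkpoint key."""
    renamed = {}
    for key, tensor in state_dict.items():
        new_key = key
        for pattern, repl in RENAME_RULES:
            new_key = re.sub(pattern, repl, new_key)
        renamed[new_key] = tensor
    return renamed
```

Renaming alone doesn't change the architecture, though: Phi's feed-forward block is a plain fc1/fc2 MLP with no gate projection, so there is no tensor to map onto Llama/Mistral's gate_proj, and the remaining module shapes still have to line up.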
Hi, I'm trying to add Qwen-MoE support to mixtral_moe.py and have made some modifications, but now I'm running into some problems. I think it is wrong, because auto_map...
``` ===================================BUG REPORT=================================== Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues ================================================================================ CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64... C:\Users\irene\AppData\Local\Programs\Python\Python310\lib\site-packages\bitsandbytes\cuda_setup\main.py:136: UserWarning:...
Getting this error: ``` ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. lida...
Hi @cg123, would it be feasible to merge models from different pre-trained backbones? For example, can we merge a model fine-tuned on Mistral-7B with a model fine-tuned on Llama-2-7B? Or...
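As a quick sanity check, most weight-space merge methods operate tensor-by-tensor and need the two models to have identical parameter names and shapes, so one way to see whether two backbones are even candidates is to compare their configs. A small sketch (the repo IDs are just examples):

```python
from transformers import AutoConfig

# Fields that must match for the tensors to line up one-to-one.
SHAPE_FIELDS = [
    "hidden_size", "num_hidden_layers", "num_attention_heads",
    "num_key_value_heads", "intermediate_size", "vocab_size",
]

def compatible(repo_a: str, repo_b: str) -> bool:
    a = AutoConfig.from_pretrained(repo_a)
    b = AutoConfig.from_pretrained(repo_b)
    return all(getattr(a, f, None) == getattr(b, f, None) for f in SHAPE_FIELDS)

# Example (the Llama-2 repo is gated, so this needs authenticated Hub access):
# compatible("mistralai/Mistral-7B-v0.1", "meta-llama/Llama-2-7b-hf")
# -> False: they differ in intermediate_size and the number of KV heads.
```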
This could be due to my own mistake, but I can't for the life of me figure out why the merge works and then the tokenizer merge fails. ``` Traceback...