Model Wishlist

Open LaurentMazare opened this issue 2 years ago • 99 comments

This issue aims at keeping track of the models that would be interesting to get added to candle. Feel free to make a comment to mention a new model, or vote for a model already in the list.

Added recently:

  • JinaBert, an embeddings model with an Apache 2 license and a larger context (it uses ALiBi biases rather than RoPE embeddings).
  • Marian-MT, a neural machine translation model.
  • Yi-6b / Yi-34b, bilingual (English/Chinese) LLM.

LaurentMazare avatar Oct 25 '23 19:10 LaurentMazare
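
For readers unfamiliar with the ALiBi biases mentioned in the JinaBert entry above, here is a minimal sketch of the idea in plain Python. It assumes a power-of-two number of heads and the symmetric `|i - j|` distance that suits a bidirectional encoder; the function names are hypothetical and this is not taken from candle's implementation, which may differ in detail.

```python
def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8 / num_heads)."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def symmetric_alibi_bias(seq_len, slope):
    """Bias added to one head's attention scores: -slope * |i - j|."""
    return [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


slopes = alibi_slopes(8)                   # one slope per attention head
bias = symmetric_alibi_bias(4, slopes[0])  # 4x4 bias matrix for the first head
```

The bias depends only on the distance between positions, so no position embedding has to be added to the token embeddings.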

Decent images with 4 steps of inference

Apache licensed and has a fairly large community. Perhaps a minimal port as an example.

StyleTTS2 MIT licensed (Possibly better and faster than tortoise tts)

phudtran avatar Oct 26 '23 03:10 phudtran

The new JinaBert embeddings model is small and has an Apache license

radames avatar Oct 26 '23 04:10 radames

The new JinaBert embeddings model is small and has an Apache license

This looks like one I'd love to help migrate. @LaurentMazare, can I create an issue and get started on this?

ToluClassics avatar Oct 26 '23 14:10 ToluClassics

The new JinaBert embeddings model is small and has an Apache license

This looks like one I'd love to help migrate. @LaurentMazare, can I create an issue and get started on this?

Ah sorry, actually I'm mostly done with it, see #1187 (I still have to line up the embedding values properly, but all the rest is in place). I had a couple of hours on the train this afternoon :)

LaurentMazare avatar Oct 26 '23 14:10 LaurentMazare

The new JinaBert embeddings model is small and has an Apache license

This looks like one I'd love to help migrate. @LaurentMazare, can I create an issue and get started on this?

Ah sorry, actually I'm mostly done with it, see #1187 (I still have to line up the embedding values properly, but all the rest is in place). I had a couple of hours on the train this afternoon :)

Late to the party 😅, I'll keep an eye on this list then.

ToluClassics avatar Oct 26 '23 14:10 ToluClassics

@radames the jina-bert bits have been merged, and I checked on some examples that the generated embeddings line up properly with the Python version, so it should be all good. I will just polish the example a bit so that it downloads the tokenizer and weight files from the hub if needed. @ToluClassics besides the list, if there are some models you are interested in, porting one of them is certainly a great way to get started.

LaurentMazare avatar Oct 26 '23 17:10 LaurentMazare
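
As an illustration of what "lining up" embeddings against the Python version can mean in practice, here is a minimal sketch of an element-wise comparison. The helper names, the tolerance, and the sample vectors are all hypothetical; the thread does not say which check or threshold was actually used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def embeddings_line_up(a, b, atol=1e-4):
    """True when both vectors have the same length and agree within atol."""
    return len(a) == len(b) and max_abs_diff(a, b) <= atol


# Made-up values standing in for the Rust and Python outputs on one sentence.
candle_vec = [0.12, -0.48, 0.33]
python_vec = [0.12001, -0.47999, 0.33002]
matches = embeddings_line_up(candle_vec, python_vec)
```

An element-wise tolerance is stricter than cosine similarity alone, which would not notice a uniform scaling error between the two implementations.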

Can candle support https://github.com/infinitered/nsfwjs or https://github.com/bhky/opennsfw2 for NSFW detection?

Liangdi avatar Oct 28 '23 09:10 Liangdi

Would it be possible to show how to use a Marian translation model in candle?

There is already an example in rust-bert:

  • https://github.com/guillaume-be/rust-bert/blob/main/examples/translation_marian.rs
  • https://github.com/guillaume-be/rust-bert/blob/dc99a30204ffcef98aee3f697ac90513ad67773d/src/models/marian/marian_model.rs#L4

The modeling_marian.py modeling file is already available in transformers:

  • https://github.com/huggingface/transformers/blob/main/src/transformers/models/marian/modeling_marian.py

Marian translation models are lighter than their counterparts, so they are well suited to serverless applications. Since candle is lighter than rust-bert and relies less on tch-rs, I expect this would lighten and ease the whole build process.

flutter-painter avatar Oct 28 '23 17:10 flutter-painter

@flutter-painter the marian-mt model should now be available in candle, we have an example that uses it to translate from french to english which you can find here.

LaurentMazare avatar Oct 30 '23 17:10 LaurentMazare

Thanks @LaurentMazare, I just tested and it works. You are blazingly fast!

flutter-painter avatar Oct 30 '23 21:10 flutter-painter

@flutter-painter the marian-mt model should now be available in candle, we have an example that uses it to translate from french to english which you can find here.

Thanks for the amazing work, but could you please tell me how to get tokenizer-marian-{lang}.json? I tried to get it from Python code:

tokenizer = MarianTokenizer.from_pretrained(model_name)
tokenizer.save_pretrained("./out")

But it does not work. I noticed that there is a file "convert_slow_tokenizer.py"; do I need to use a function from this file? How do I use it? Thanks very much!

Liangdi avatar Oct 31 '23 00:10 Liangdi

Thanks for the amazing work, but could you please tell me how to get tokenizer-marian-{lang}.json? I tried to get it from Python code:

tokenizer = MarianTokenizer.from_pretrained(model_name)
tokenizer.save_pretrained("./out")

But it does not work. I noticed that there is a file "convert_slow_tokenizer.py"; do I need to use a function from this file? How do I use it? Thanks very much!

Ah, that part is indeed very hacky at the moment. You have to download the convert_slow_tokenizer.py script that you discovered, and running the following Python code from the same directory should produce the two tokenizer files.

from convert_slow_tokenizer import MarianConverter
from transformers import AutoTokenizer

# Load the slow (sentencepiece-based) tokenizer for the fr -> en model.
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-fr-en", use_fast=False)
# index=0 converts the source-language (French) side.
fast_tokenizer = MarianConverter(tokenizer, index=0).converted()
fast_tokenizer.save("tokenizer-marian-base-fr.json")
# index=1 converts the target-language (English) side.
fast_tokenizer = MarianConverter(tokenizer, index=1).converted()
fast_tokenizer.save("tokenizer-marian-base-en.json")

LaurentMazare avatar Oct 31 '23 05:10 LaurentMazare
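
The snippet above hard-codes the fr/en pair. A possible generalisation to other Helsinki-NLP `opus-mt-{src}-{tgt}` models is sketched below. The helper names are hypothetical, and the `tokenizer-marian-base-{lang}.json` naming for other pairs is an assumption extrapolated from the fr/en example rather than something the candle example is confirmed to expect. Only `MarianConverter(tokenizer, index=...)` and `AutoTokenizer.from_pretrained(..., use_fast=False)` come from the comment above.

```python
def marian_tokenizer_filenames(model_id):
    """Derive the two output names from an id like 'Helsinki-NLP/opus-mt-fr-en'."""
    name = model_id.split("/")[-1]
    prefix = "opus-mt-"
    if not name.startswith(prefix):
        raise ValueError(f"unexpected model id: {model_id}")
    langs = name[len(prefix):].split("-")
    if len(langs) != 2:
        raise ValueError(f"cannot infer a language pair from: {model_id}")
    src, tgt = langs
    return (f"tokenizer-marian-base-{src}.json", f"tokenizer-marian-base-{tgt}.json")


def convert_marian_tokenizers(model_id):
    """Write both tokenizer files; must run next to convert_slow_tokenizer.py."""
    # Imported lazily so the naming helper above works without these packages.
    from convert_slow_tokenizer import MarianConverter
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
    # index 0 is the source-language side, index 1 the target-language side.
    for index, filename in enumerate(marian_tokenizer_filenames(model_id)):
        MarianConverter(tokenizer, index=index).converted().save(filename)
```

Calling `convert_marian_tokenizers("Helsinki-NLP/opus-mt-fr-en")` should then be equivalent to the original snippet, provided transformers is installed and the model can be downloaded.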

ChatGLM3 huggingface : https://huggingface.co/THUDM/chatglm3-6b

Liangdi avatar Nov 01 '23 02:11 Liangdi

ChatGLM3 huggingface : https://huggingface.co/THUDM/chatglm3-6b

That sounds like a great model to have, happy to prioritize it. Do you know if the tokenizer config can be easily converted to a working tokenizer.json or maybe there is already such a config somewhere? (similar to what was done for marian but hopefully less flaky)

LaurentMazare avatar Nov 01 '23 19:11 LaurentMazare

How are we doing on TTS? Would be nice to have e.g. Bark.

Also, +1 on nougat

EmilLindfors avatar Nov 03 '23 07:11 EmilLindfors

Another embeddings model: https://huggingface.co/moka-ai/m3e-large

Liangdi avatar Nov 06 '23 01:11 Liangdi

Another embeddings model: https://huggingface.co/moka-ai/m3e-large

I think this one may just work out of the box as it uses a standard bert model which already has been added. You could try it out with the following:

cargo run --example bert -- --model-id moka-ai/m3e-large --revision refs/pr/5

LaurentMazare avatar Nov 06 '23 07:11 LaurentMazare

Another embeddings model: https://huggingface.co/moka-ai/m3e-large

I think this one may just work out of the box as it uses a standard bert model which already has been added. You could try it out with the following:

cargo run --example bert -- --model-id moka-ai/m3e-large --revision refs/pr/5

It works, thank you!

Liangdi avatar Nov 06 '23 08:11 Liangdi

@LaurentMazare another LLM: https://github.com/01-ai/Yi. They converted the tokenizer.json and I tested it with Tokenizers (https://github.com/01-ai/Yi/issues/24#issuecomment-1801680600). Can candle support this model?

Liangdi avatar Nov 08 '23 14:11 Liangdi

@LaurentMazare another LLM: https://github.com/01-ai/Yi. They converted the tokenizer.json and I tested it with Tokenizers (01-ai/Yi#24 (comment)). Can candle support this model?

Do you know if this is the same tokenizer as for ChatGLM3? If that's the case I would prefer pushing on this one first as I have a PR that should be mostly done with the implementation and only requires lining up the logits once we have a proper tokenizer config.

(edit: I misremembered the PR, it's not mostly complete and requires a bit of work implementing the forward passes but this should be pretty quick to do once we have a tokenizer to test out)

LaurentMazare avatar Nov 09 '23 08:11 LaurentMazare

@LaurentMazare another LLM: https://github.com/01-ai/Yi. They converted the tokenizer.json and I tested it with Tokenizers (01-ai/Yi#24 (comment)). Can candle support this model?

Just merged support for the yi-6b and yi-34b variants in #1320. I haven't tested them much though, as my current computer is very slow even on the 6b, and I'm not sure how much of that is expected. It would certainly be great if you could give these a spin and let me know how it goes.

LaurentMazare avatar Nov 11 '23 11:11 LaurentMazare

@LaurentMazare another LLM: https://github.com/01-ai/Yi. They converted the tokenizer.json and I tested it with Tokenizers (01-ai/Yi#24 (comment)). Can candle support this model?

Just merged support for the yi-6b and yi-34b variants in #1320. I haven't tested them much though, as my current computer is very slow even on the 6b, and I'm not sure how much of that is expected. It would certainly be great if you could give these a spin and let me know how it goes.

Thank you very much, I'll do some testing, and I'll follow your PR to try to convert other LLMs.

Liangdi avatar Nov 11 '23 11:11 Liangdi

Just out of curiosity, is there any plan for supporting more embedding models such as deberta or BAAI/bge-large-en-v1.5?

YeonwooSung avatar Nov 14 '23 06:11 YeonwooSung

Hey 👋, is someone working on porting LCM to Candle ?

julien-blanchon avatar Nov 14 '23 10:11 julien-blanchon

Just out of curiosity, is there any plan for supporting more embedding models such as deberta or BAAI/bge-large-en-v1.5?

As this seems to just be a bert variant, this could work directly with the current bert model provided by candle.

cargo run --example bert --release -- --model-id BAAI/bge-large-en-v1.5 --revision refs/pr/5

LaurentMazare avatar Nov 14 '23 14:11 LaurentMazare

Hey 👋, is someone working on porting LCM to Candle ?

Could you provide some links/details on what LCM is?

LaurentMazare avatar Nov 14 '23 14:11 LaurentMazare

Hey 👋, is someone working on porting LCM to Candle ?

Could you provide some links/details on what LCM is?

Yep, I opened a dedicated issue here

julien-blanchon avatar Nov 14 '23 18:11 julien-blanchon

EfficientSAM: https://github.com/xetdata/EfficientSAM

radames avatar Dec 06 '23 22:12 radames

Re text-to-speech, we've just added an early version of metavoice, you can try it out via this example.

LaurentMazare avatar Mar 02 '24 20:03 LaurentMazare

Would anyone be interested in running moondream2 if it was added to candle? It looks like a small and efficient model that wouldn't be too hard to add.

LaurentMazare avatar Mar 12 '24 11:03 LaurentMazare