Pedro Cuenca

331 comments by Pedro Cuenca

`text-to-image` is currently CV as well. There have been various discussions about what "multimodal" should include; the latest update was made [in this PR](https://github.com/huggingface/huggingface.js/pull/477), where some comments were shared. cc...

@SwayStar123 I can't, let's wait for a maintainer to do it.

I took the liberty of resolving the conflict with `main`.

I could take a look tomorrow, if that works.

> while swift has this (without the image tokens injected yet):
>
> ```
> user
> Describe the image in English
> model
> ```

^ This version of...

I suspect the attention scale; testing

It works. Pushing in a sec. We also need to use `` as a terminator, otherwise it will keep generating those tokens non-stop.

Here it is @DePasqualeOrg https://github.com/DePasqualeOrg/mlx-swift-examples/pull/1 🤗

Regarding text-only mode not working, this is also happening in the Python version now - not sure what happened yet, will look into it later.

Hello @hybotix! I'd recommend you follow [these download instructions](https://github.com/meta-llama/llama-models/tree/main?tab=readme-ov-file#download), or [these ones](https://github.com/meta-llama/llama-models/tree/main?tab=readme-ov-file#access-to-hugging-face) to download from Hugging Face.