Hessa Alawwad
Thank you for the great model. I wonder how I can get the multimodal embedding of different inputs, like an image and its caption, using ImageBind? If I can get that...
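For questions like this one: ImageBind returns one embedding per modality in a shared space (1024-d for the huge variant), so a single joint vector for an image and its caption is typically built by fusing the per-modality embeddings afterwards. A minimal sketch, with random tensors standing in for real ImageBind outputs and averaged L2-normalized vectors as the fusion rule (that rule is a common choice, not part of ImageBind itself):

```python
import torch
import torch.nn.functional as F

# Stand-ins for per-modality ImageBind embeddings (imagebind_huge
# projects every modality into a shared 1024-d space).
image_emb = torch.randn(1, 1024)
text_emb = torch.randn(1, 1024)

def fuse(*embs):
    # L2-normalize each modality, average, then re-normalize so the
    # fused vector lives on the same unit sphere as the inputs.
    normed = [F.normalize(e, dim=-1) for e in embs]
    fused = torch.stack(normed).mean(dim=0)
    return F.normalize(fused, dim=-1)

joint = fuse(image_emb, text_emb)
print(joint.shape)  # one unit-norm multimodal vector per input pair
```

The same `fuse` call extends to more modalities (audio, depth, ...) since ImageBind places them all in the one space.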
Hello, thank you for the great work. I have been trying to experiment with the model and see how it works. My question is: can I use Llama 3.2 Vision to cover...
Hello, so I am trying to embed text using CLIP. I got an error that my text is too long, but from the Hugging Face documentation I see that I can fix...
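For context on this one: CLIP's text encoder has a 77-token position limit, and the usual fix in Hugging Face `transformers` is to let the processor truncate. A minimal sketch (the checkpoint name and caption below are illustrative, not from the original post):

```python
from transformers import CLIPProcessor

processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

long_text = "a very long caption " * 100  # well past CLIP's 77-token limit

# Without truncation=True, text longer than 77 tokens cannot be fed to the
# CLIP text encoder; with it, the tokenizer clips to max_length.
inputs = processor(
    text=[long_text],
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=77,
)
print(inputs["input_ids"].shape)
```

Truncation silently drops everything past the limit, so for long documents people often chunk the text and embed each chunk instead.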
Hello, I am trying the following code to test sending multiple images:

```python
import requests
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
#...
```
Hello, I am trying to SFT-train the Llama 3.2 11B Vision Instruct model on a dataset that answers a question about an image using a context (which could be more than one...
Hello, I was wondering if I would be able to use the DataCollatorForCompletionOnlyLM to train Llama 3.2 vision model on the generated prompts only? Something like passing a response template...
Hello, I am trying to do the following:

```python
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

def getEmbeddingVector(inputs):
    with torch.no_grad():
        embedding = imagebind_model(inputs)
    for key,...
```