Results: 25 comments of tsantra

@dosubot The example I am referring to, https://www.llamaindex.ai/blog/multimodal-rag-for-advanced-video-processing-with-llamaindex-lancedb-33be4804822e, uses images retrieved from the image vector store, together with the prompt, as input to the final multimodal LLM, as shown below: from...

@dosubot could you give the code snippet using HuggingFaceLLM that does the same thing as the snippet below: from llama_index.multi_modal_llms.openai import OpenAIMultiModal openai_mm_llm = OpenAIMultiModal( model="gpt-4-turbo", api_key="sk-***redacted***", max_new_tokens=1500 )...
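
For reference, a minimal sketch of the OpenAI-side pattern from that blog post (the image paths and query string are placeholders I've assumed, imports may differ slightly across llama-index versions, and the API key is read from the environment rather than hard-coded):

```python
from llama_index.core.schema import ImageDocument
from llama_index.multi_modal_llms.openai import OpenAIMultiModal

# Multimodal LLM that accepts a text prompt plus a list of image documents.
# OPENAI_API_KEY is expected to be set in the environment.
openai_mm_llm = OpenAIMultiModal(model="gpt-4-turbo", max_new_tokens=1500)

# Images retrieved from the image vector store (placeholder paths).
image_documents = [
    ImageDocument(image_path=p) for p in ["frame_001.png", "frame_002.png"]
]

response = openai_mm_llm.complete(
    prompt="Describe what happens across these video frames.",
    image_documents=image_documents,
)
print(response.text)
```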

@dosu, what is the code for the complete method of HuggingFaceLLM, if my input to the LLM is a text prompt and a list of images?

@dosu I do not want to convert the images into captions; rather, I want to send a list of images and a text prompt as input to the LLM using HuggingFaceLLM, just like...

@dosubot could you please provide the code for the placeholder part mentioned in your answer above for the complete function? The goal is to use both images and a text prompt...

@dosubot Please rewrite the code if my LLM is https://huggingface.co/llava-hf/llava-1.5-7b-hf, making the necessary changes.
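
For context, a minimal sketch of calling llava-hf/llava-1.5-7b-hf directly through `transformers` with a text prompt and an image (the image path and prompt text are placeholders; this uses the transformers API rather than llama-index's HuggingFaceLLM wrapper, which is text-only, and it assumes a reasonably recent transformers release):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# LLaVA-1.5 expects one <image> token per image in the prompt.
image = Image.open("retrieved_frame.png")  # placeholder path
prompt = "USER: <image>\nDescribe this video frame.\nASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```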

@ivy-lv11 thank you! This works! Finally, I want to use the LLM in the RAG pipeline. Using llm = TransformersLLM(model_id="", model=model, tokenizer=tokenizer) is not generating any output for me. My...
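
As a rough illustration only (the embedding model, documents, and chain type are my assumptions, `llm` is whatever TransformersLLM instance was built above, and faiss plus sentence-transformers need to be installed), a LangChain-style RAG pipeline typically wires the LLM in like this:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Hypothetical corpus; in practice these would be the video transcript chunks.
texts = ["chunk one of the transcript", "chunk two of the transcript"]
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_texts(texts, embeddings)

# `llm` is the TransformersLLM instance from the comment above.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
)
print(qa_chain.invoke({"query": "What is the video about?"}))
```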

Hi @lzivan, could you please let me know if you have any update? Thank you very much.

Hi @lzivan, could you please let me know if there is any update? Thanks a lot!

@lzivan LLaVA works fine with LangChain using Ollama, so LLaVA is compatible.
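
A minimal sketch of that LangChain-plus-Ollama check, assuming Ollama is running locally with the `llava` model already pulled (the image path is a placeholder):

```python
import base64
from langchain_community.llms import Ollama

# Requires `ollama pull llava` and a running Ollama server.
llm = Ollama(model="llava")

# Ollama accepts images as base64-encoded strings bound to the call.
with open("retrieved_frame.png", "rb") as f:  # placeholder path
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

llm_with_image = llm.bind(images=[image_b64])
print(llm_with_image.invoke("What is shown in this image?"))
```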