tsantra
@dosubot The example I am referring to, https://www.llamaindex.ai/blog/multimodal-rag-for-advanced-video-processing-with-llamaindex-lancedb-33be4804822e, uses the retrieved images from the image vector store, along with the prompt, as input to the final multimodal LLM, shown below: from...
@dosubot Could you give a code snippet using HuggingFaceLLM that does the same thing as the snippet below: from llama_index.multi_modal_llms.openai import OpenAIMultiModal openai_mm_llm = OpenAIMultiModal( model="gpt-4-turbo", api_key="<OPENAI_API_KEY>", max_new_tokens=1500 )...
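For context, a minimal sketch of the pattern the blog post ends with: retrieved image nodes plus the text query passed to `OpenAIMultiModal.complete(prompt=..., image_documents=...)`. The `retriever` is assumed to have been built over the LanceDB image/text stores in the earlier steps of that post, and the query string is illustrative:

```python
from llama_index.core.schema import ImageNode
from llama_index.multi_modal_llms.openai import OpenAIMultiModal

openai_mm_llm = OpenAIMultiModal(
    model="gpt-4-turbo",
    api_key="<OPENAI_API_KEY>",  # keep real keys out of issue reports
    max_new_tokens=1500,
)

# `retriever` is assumed to come from the blog post's earlier setup
# (a multimodal retriever over the LanceDB text + image stores).
query = "What is happening in the video?"
retrieved_nodes = retriever.retrieve(query)

# Keep only the image nodes so they can be fed to the multimodal LLM.
image_documents = [n.node for n in retrieved_nodes if isinstance(n.node, ImageNode)]

response = openai_mm_llm.complete(
    prompt=query,
    image_documents=image_documents,
)
print(response.text)
```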
@dosu What is the code for the `complete` method of HuggingFaceLLM if my input to the LLM is a text prompt paired with a list of images?
@dosu I do not want to convert the images into captions; rather, I want to send a list of images and a text prompt as input to the LLM using HuggingFaceLLM, just like...
@dosubot Could you please provide the code for the placeholder part of the `complete` function mentioned in your answer above? The goal is to use both images and a text prompt...
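A minimal sketch of what such a `complete`-style wrapper could look like, assuming a generic Hugging Face image-text-to-text checkpoint loaded through `AutoProcessor`/`AutoModelForVision2Seq`. This is not the actual `HuggingFaceLLM.complete` (that class is text-only); the class name `LocalMultiModalLLM` and its interface are illustrative, and the prompt format is model-specific:

```python
from typing import List

import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor


class LocalMultiModalLLM:
    """Illustrative wrapper: text prompt + list of image paths -> generated text."""

    def __init__(self, model_id: str, max_new_tokens: int = 512):
        self.processor = AutoProcessor.from_pretrained(model_id)
        self.model = AutoModelForVision2Seq.from_pretrained(
            model_id, torch_dtype=torch.float16, device_map="auto"
        )
        self.max_new_tokens = max_new_tokens

    def complete(self, prompt: str, image_paths: List[str]) -> str:
        # Load images and build a single batched input of text + images.
        images = [Image.open(p).convert("RGB") for p in image_paths]
        inputs = self.processor(
            text=prompt, images=images, return_tensors="pt"
        ).to(self.model.device)

        # Generate and decode only the text output.
        output_ids = self.model.generate(**inputs, max_new_tokens=self.max_new_tokens)
        return self.processor.batch_decode(output_ids, skip_special_tokens=True)[0]
```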
@dosubot Please rewrite the code for the case where my LLM is https://huggingface.co/llava-hf/llava-1.5-7b-hf, making the necessary changes.
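For llava-hf/llava-1.5-7b-hf specifically, a sketch using `LlavaForConditionalGeneration` from `transformers` directly (outside LlamaIndex). LLaVA-1.5 expects the `USER: <image>\n... ASSISTANT:` template with one `<image>` placeholder per image; the frame file names here are illustrative:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Illustrative retrieved video frames; one <image> token per image in the prompt.
image_paths = ["frame_001.png", "frame_002.png"]
images = [Image.open(p).convert("RGB") for p in image_paths]
prompt = (
    "USER: " + "<image>\n" * len(images)
    + "Describe what is happening across these video frames. ASSISTANT:"
)

inputs = processor(text=prompt, images=images, return_tensors="pt").to(
    model.device, torch.float16
)
output_ids = model.generate(**inputs, max_new_tokens=1500)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```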
@ivy-lv11 Thank you! This works! Finally, I want to use the LLM in the RAG pipeline. Using llm = TransformersLLM(model_id="", model=model, tokenizer=tokenizer) is not generating any output for me. My...
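For the RAG wiring itself, a minimal sketch of plugging a local `llm` into a LlamaIndex query pipeline via `Settings`. It assumes `llm` is the TransformersLLM instance built above (or any LlamaIndex-compatible LLM), a local `data/` directory with documents, and a local Hugging Face embedding model; it does not by itself explain why TransformersLLM would return empty output:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Use the locally constructed LLM for response synthesis and a local
# embedding model for retrieval, so no OpenAI key is required.
Settings.llm = llm  # e.g. the TransformersLLM built above
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What does the document say about X?")
print(response)
```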
Hi @lzivan, could you please let me know if you have any update? Thank you very much.
Hi @lzivan, could you please let me know if there is any update? Thanks a lot!
@lzivan LLaVA works fine with LangChain using Ollama, so LLaVA itself is compatible.
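For completeness, roughly the LangChain + Ollama pattern being referred to, following LangChain's multimodal Ollama examples: the LLaVA model is pulled in Ollama (`ollama pull llava`) and images are passed as base64 strings bound onto the LLM; the image path is illustrative:

```python
import base64

from langchain_community.llms import Ollama

# Assumes `ollama pull llava` has been run and the Ollama server is running.
llm = Ollama(model="llava")

with open("frame_001.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Bind the image(s) to the call, then invoke with the text prompt.
llm_with_image = llm.bind(images=[image_b64])
print(llm_with_image.invoke("Describe what is happening in this frame."))
```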