
How can I use my model to predict the answer?

KAMIENDER opened this issue · 6 comments

How can I use my model to predict the answer? I want to use my LLaMA model for prediction instead of relying on OpenAI's API. It seems like all I need to do is implement LLMPredictor. Are there any other simple ways to use my own model for predictions besides this?

KAMIENDER · Mar 12 '23 09:03

You are correct, you just need to implement LLMPredictor 💪

At that point though, it's up to you to handle tokenization, model inference, etc.

Here's a small example that uses Flan-T5 and Hugging Face code: https://github.com/jerryjliu/gpt_index/issues/544
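For illustration (this is not the code from that issue, and the model name and generation settings below are just placeholder assumptions), here is a rough sketch of what "handling tokenization and model inference yourself" can look like when wrapping a local Hugging Face seq2seq model in a LangChain LLM so LLMPredictor can drive it; the full pipeline-based example from #544 is posted further down in this thread:

```python
# Rough, untested sketch: wrap a local seq2seq model in a LangChain LLM so that
# LLMPredictor can call it. Model name and max_new_tokens are placeholders.
from typing import Any, List, Mapping, Optional

import torch
from langchain.llms.base import LLM
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


class LocalSeq2SeqLLM(LLM):
    model_name = "google/flan-t5-small"
    tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        # Tokenize the prompt, run inference locally, and decode the generated ids.
        inputs = self.tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            output_ids = self.model.generate(**inputs, max_new_tokens=256)
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"
```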

logan-markewich avatar Mar 13 '23 05:03 logan-markewich

While running the example in issue #544 (after changing gpt_index to llama_index and using flan-t5-small) I get the following error: `ValidationError: 1 validation error for OpenAI root Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)`

Edit: it seems that one needs to pass llm_predictor to GPTListIndex.load_from_disk to fix this.
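Concretely (using the variable names from the full example below), the relevant change on reload is just:

```python
# Pass the local predictor and embedding model again when reloading, so the index
# does not fall back to the default OpenAI LLM (which is what asks for OPENAI_API_KEY).
new_index = GPTListIndex.load_from_disk(
    'index.json', embed_model=embed_model, llm_predictor=llm_predictor
)
```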

thohag · Mar 13 '23 15:03

I'm posting the updated example from issue #544 here for completeness:

from llama_index import SimpleDirectoryReader, LangchainEmbedding, GPTListIndex
from llama_index import LLMPredictor
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.llms.base import LLM
from transformers import pipeline
import torch
from typing import Any, List, Mapping, Optional


class FlanLLM(LLM):
    # Local Flan-T5 model served through a Hugging Face text2text-generation pipeline.
    model_name = "google/flan-t5-small"
    pipeline = pipeline("text2text-generation", model=model_name, device="cpu", model_kwargs={"torch_dtype": torch.bfloat16})

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        # Run the prompt through the local pipeline and return only the generated text.
        return self.pipeline(prompt, max_length=9999)[0]["generated_text"]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"


# Wrap the local LLM and a local embedding model so no OpenAI key is required.
llm_predictor = LLMPredictor(llm=FlanLLM())
hfemb = HuggingFaceEmbeddings()
embed_model = LangchainEmbedding(hfemb)


# Build the index over the documents in ./data using the local models.
documents = SimpleDirectoryReader('data').load_data()
index = GPTListIndex(documents, embed_model=embed_model, llm_predictor=llm_predictor)

index.save_to_disk('index.json')

# Pass llm_predictor (and embed_model) again when reloading, otherwise the index
# falls back to the default OpenAI LLM and complains about OPENAI_API_KEY.
new_index = GPTListIndex.load_from_disk('index.json', embed_model=embed_model, llm_predictor=llm_predictor)

response = new_index.query(
    "What did the author do growing up?",
    mode="embedding",
    embed_model=embed_model,
    llm_predictor=llm_predictor,
)
print(response)

thohag · Mar 13 '23 15:03

> While running the example in issue #544 (after changing gpt_index to llama_index and using flan-t5-small) I get the following error: `ValidationError: 1 validation error for OpenAI root Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)`
>
> Edit: it seems that one needs to pass llm_predictor to GPTListIndex.load_from_disk to fix this.

Yes... I also have this problem.

KAMIENDER · Mar 13 '23 16:03

@KAMIENDER It seems that you do not need the API key when passing llm_predictor to GPTListIndex.load_from_disk?

thohag · Mar 13 '23 16:03

> @KAMIENDER It seems that you do not need the API key when passing llm_predictor to GPTListIndex.load_from_disk?
>
> No. I use GPTTreeIndex, it also needs the key.

Does passing llm_predictor to GPTTreeIndex.load_from_disk solve the problem?
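Untested sketch of what I mean, mirroring the list-index example above (whether GPTTreeIndex accepts these keyword arguments in exactly the same way is an assumption):

```python
# Hypothetical: same pattern as the GPTListIndex example above, but with GPTTreeIndex.
# The tree index also calls the LLM while building the tree, so the local predictor
# is supplied at construction, on reload, and at query time.
from llama_index import SimpleDirectoryReader, LLMPredictor, GPTTreeIndex

documents = SimpleDirectoryReader('data').load_data()
llm_predictor = LLMPredictor(llm=FlanLLM())  # FlanLLM as defined earlier in this thread

index = GPTTreeIndex(documents, llm_predictor=llm_predictor)
index.save_to_disk('tree_index.json')

new_index = GPTTreeIndex.load_from_disk('tree_index.json', llm_predictor=llm_predictor)
response = new_index.query("What did the author do growing up?", llm_predictor=llm_predictor)
print(response)
```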

thohag · Mar 13 '23 16:03

Hey everyone, closing this issue since the original question about how to use a custom LLM has been addressed.

Please direct further discussion to Discord, where you'll get much better support from the community: https://discord.com/invite/dGcwcsnxhU

Disiok · Mar 18 '23 17:03