[Bug]: Error with AzureChatOpenAI using langchain - [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: token_type_ids for the following indices index: 1 Got: 1772 Expected: 512 Please fix either the inputs or the model.

Open UmerQam opened this issue 2 years ago • 3 comments

Current Behavior

I am getting this error when using AzureChatOpenAI from LangChain.

I tried implementing the GPTCache similarity cache described on the LangChain page (https://python.langchain.com/docs/integrations/llms/llm_caching), but I get the error below.

Please fix the error below.

LangChain version: 0.0.304
openai version: 0.28.1
gptcache version: 0.1.42

[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: token_type_ids for the following indices index: 1 Got: 1772 Expected: 512 Please fix either the inputs or the model.

Expected Behavior

Error should not be there

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

UmerQam avatar Sep 29 '23 10:09 UmerQam

You can try to clean the cache dir. When using cache, please keep the same embedding. If you change the embedding method, you need to delete the previous cache directory.

SimFG avatar Sep 30 '23 02:09 SimFG
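
For reference, clearing the cache directory could be sketched as follows. This assumes the directory follows the `similar_cache_<sha256 of the llm string>` naming from the LangChain example; `clear_similar_cache` and its `base_dir` parameter are hypothetical helpers for illustration, not part of GPTCache.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def get_hashed_name(name: str) -> str:
    """Same hashing scheme as the LangChain example: SHA-256 hex digest."""
    return hashlib.sha256(name.encode()).hexdigest()


def clear_similar_cache(llm: str, base_dir: Path) -> bool:
    """Delete the cache directory for `llm`; return True if one was removed.

    Hypothetical helper: assumes the directory is named
    similar_cache_<hash> and lives directly under base_dir.
    """
    cache_dir = base_dir / f"similar_cache_{get_hashed_name(llm)}"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False


# Demonstration in a throwaway directory so no real cache is touched.
base = Path(tempfile.mkdtemp())
demo_dir = base / f"similar_cache_{get_hashed_name('demo-llm')}"
demo_dir.mkdir()

removed_first = clear_similar_cache("demo-llm", base)   # directory existed
removed_second = clear_similar_cache("demo-llm", base)  # already gone

shutil.rmtree(base)
```

With a real cache, `base_dir` would be the working directory the application was started from, since the example passes a relative `data_dir`.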

Hi @SimFG, I deleted the cache multiple times, but it's not helping.

About the embedding: I am using the code below. Do I need to change anything here? I created Vertex AI embeddings for my PDFs and Word files, with a dimension of 786, and stored them in Matching Engine / Vector Store.

I tried the code below, taken from the official LangChain page (https://python.langchain.com/docs/integrations/llms/llm_caching):

import hashlib

import langchain  # needed for the langchain.llm_cache assignment below
from gptcache import Cache
from gptcache.adapter.api import init_similar_cache
from langchain.cache import GPTCache


def get_hashed_name(name):
    return hashlib.sha256(name.encode()).hexdigest()


def init_gptcache(cache_obj: Cache, llm: str):
    hashed_llm = get_hashed_name(llm)
    init_similar_cache(cache_obj=cache_obj, data_dir=f"similar_cache_{hashed_llm}")


langchain.llm_cache = GPTCache(init_gptcache)

UmerQam avatar Oct 04 '23 12:10 UmerQam

You need to confirm where the 1772-dimensional vector comes from

SimFG avatar Oct 07 '23 02:10 SimFG
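
One possible reading of the error message: `token_type_ids` has shape (batch, sequence length), so index 1 is the sequence length. Under that reading, 1772 would be the number of tokens in the prompt rather than an embedding dimension, and 512 the longest sequence the ONNX model accepts. A minimal sketch of truncating a tokenized prompt to that limit is shown below. `build_model_inputs` and `MAX_TOKENS` are hypothetical names for illustration, not part of GPTCache or onnxruntime, and the token ids are stand-ins for real tokenizer output.

```python
from typing import Dict, List

# Sequence length the model in the error message expects (assumption
# taken from "Expected: 512").
MAX_TOKENS = 512


def build_model_inputs(
    token_ids: List[int], max_tokens: int = MAX_TOKENS
) -> Dict[str, List[List[int]]]:
    """Truncate one tokenized prompt and wrap it as a batch of size 1.

    Hypothetical helper: a real tokenizer would also keep special
    tokens such as [CLS] and [SEP] in place when truncating.
    """
    truncated = token_ids[:max_tokens]
    return {
        "input_ids": [truncated],
        "attention_mask": [[1] * len(truncated)],
        "token_type_ids": [[0] * len(truncated)],
    }


# A prompt of 1772 tokens, as in the error message.
inputs = build_model_inputs(list(range(1772)))
seq_len = len(inputs["token_type_ids"][0])
```

If this reading is right, the truncation would need to happen inside whichever embedding function the cache uses, or the prompt text would need to be shortened before it is embedded.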