Not able to load pretrained models
I am trying to use llmware's pretrained model 'industry-bert-contracts' in my code and am getting the following error: "AttributeError: 'HFEmbeddingModel' object has no attribute 'max_input_len'"
I tried loading the other models in the HFEmbeddingModel family listed in model_config.py and still get the same error.
My code works with other models, such as GGUFGenerativeModel's "bling-phi-3-gguf" and "llmware/bling-sheared-llama-1.3b-0.1".
@Arwin567 - thanks for sharing this and sorry that you ran into an issue. I suspect the problem is that "industry-bert-contracts" is an embedding model, not a generative model. So, if you are looking to run a prompt/LLM inference, then "bling-phi-3-gguf" and "llmware/bling-sheared-llama-1.3b-0.1" are great choices, as they are both "generative" models.
On the other hand, if you are interested in building a semantic embedding space for knowledge retrieval with a vector database, then "industry-bert-contracts" is a great choice. It outputs a 768-dimensional embedding vector, not generated text. You may want to check out some of the Embedding examples, which use that model.
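To make the difference concrete, here is a minimal sketch of what retrieval over embedding vectors looks like. It does not call llmware at all: the function names and the 3-dimensional toy vectors are made up for illustration (real vectors from "industry-bert-contracts" would have 768 dimensions), and a vector database would normally do this ranking for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored]


# Toy stand-ins for embedding vectors (hypothetical values, not model output).
query = [1.0, 0.0, 0.0]
passages = [
    [0.0, 1.0, 0.0],   # unrelated to the query
    [1.0, 0.1, 0.0],   # very close to the query
    [0.5, 0.5, 0.0],   # partially related
]

ranking = rank_passages(query, passages)
```

The embedding model only produces the vectors. Answering a question about the retrieved passage is still the job of a generative model.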
Hope this resolves the issue... please confirm back. (Really glad that you caught this - we will add a note to the documentation to clarify this for others too.)
Within the ModelCatalog, you can use the discovery methods to check which category a model belongs to: ModelCatalog().list_all_models(), ModelCatalog().list_generative_models(), or ModelCatalog().list_embedding_models().
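A rough sketch of how that check could look before loading a model. Only ModelCatalog().list_all_models() comes from this thread. Everything else is an assumption for illustration: the helper names, the `from llmware.models import ModelCatalog` import path, the idea that each entry is a dict with "model_name" and "model_category" keys, and the hand-written sample cards below (they are not copied from the library).

```python
def group_by_category(model_cards):
    """Group model names by their 'model_category' field (assumed key name)."""
    groups = {}
    for card in model_cards:
        category = card.get("model_category", "unknown")
        groups.setdefault(category, []).append(card.get("model_name", "<unnamed>"))
    return groups


def is_embedding_model(model_name, model_cards):
    """True if the named model is listed with category 'embedding' (assumed value)."""
    return any(
        card.get("model_name") == model_name
        and card.get("model_category") == "embedding"
        for card in model_cards
    )


def load_cards_from_llmware():
    """Fetch the catalog from llmware; assumes it returns a list of dicts."""
    from llmware.models import ModelCatalog  # assumed import path
    return ModelCatalog().list_all_models()


# Hand-written sample cards for illustration only (hypothetical field values).
SAMPLE_CARDS = [
    {"model_name": "industry-bert-contracts", "model_category": "embedding"},
    {"model_name": "bling-phi-3-gguf", "model_category": "generative_local"},
]

groups = group_by_category(SAMPLE_CARDS)
```

With llmware installed, you could pass load_cards_from_llmware() in place of SAMPLE_CARDS and call is_embedding_model("industry-bert-contracts", cards) before deciding whether to use the model for prompting or for embeddings.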
Understood. Thank you for the clarification and the quick response.
I am trying to extract data from contracts into a CSV by asking 15 questions, using your contract_analysis_on_laptop code. Can I use the industry-bert-contracts model in this case?
@Arwin567 - yes, definitely, the industry-bert-contracts model is great for building semantic embeddings on contracts and other legal documents. Hope it is progressing well! Will close this thread.