text-generation-webui
Support for FLAN models
Hi, I'm new to this and not sure what would be involved, but does anyone know how to get FLAN-T5-large working? I get the following error after running python download-model.py google/flan-t5-large and then launching the web UI. Windows 10, RTX 2080 Super (8 GB VRAM).
Loading flan-t5-large...
Traceback (most recent call last):
File "C:\Users\---\text-generation-webui\server.py", line 188, in <module>
shared.model, shared.tokenizer = load_model(shared.model_name)
File "C:\Users\---\text-generation-webui\modules\models.py", line 49, in load_model
model = AutoModelForCausalLM.from_pretrained(Path(f"models/{shared.model_name}"), low_cpu_mem_usage=True, torch_dtype=torch.bfloat16 if shared.args.bf16 else torch.float16).cuda()
File "C:\Users\---\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 474, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.t5.configuration_t5.T5Config'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, CodeGenConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, GitConfig, GPT2Config, GPT2Config, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, MarianConfig, MBartConfig, MegatronBertConfig, MvpConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, Speech2Text2Config, TransfoXLConfig, TrOCRConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.
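This error is expected given how the model is loaded: FLAN-T5 is an encoder-decoder (seq2seq) model, and the loader in modules/models.py goes through AutoModelForCausalLM here, which does not accept a T5Config. As a quick sanity check outside the web UI, a minimal sketch like the one below (models/flan-t5-large is the path implied by the log above; adjust it to wherever download-model.py put the files) loads the model through AutoModelForSeq2SeqLM instead:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Path assumed from the "Loading flan-t5-large..." log above; adjust if yours differs.
model_path = "models/flan-t5-large"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# T5-family models are encoder-decoder, so the Seq2SeqLM auto class is the right loader.
model = AutoModelForSeq2SeqLM.from_pretrained(model_path).cuda()

inputs = tokenizer("Translate English to German: How old are you?", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))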
+1
Fun fact: Unlike LLaMA, FLAN is actually open source (Apache License).
People also mean flan-ul2, not just T5. I saw @zoidbb made 4-bit versions of LLaMA, and I guess the functionality to add them here will come shortly; in theory one could then run 30B-parameter models on 20 GB of VRAM. Since flan-ul2 is quite chunky even in int8, it would be very handy to have 4-bit versions of flan-ul2 as well! A sketch of int8 loading is below.
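For what it's worth, int8 loading already works for seq2seq models through bitsandbytes. A minimal sketch, assuming bitsandbytes and accelerate are installed (flan-t5-xl is used here as a smaller stand-in, since flan-ul2's roughly 20B parameters still need on the order of 20 GB even in int8):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# flan-t5-xl as a smaller stand-in; "google/flan-ul2" loads the same way but needs far more VRAM.
model_name = "google/flan-t5-xl"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# load_in_8bit requires the bitsandbytes package and a CUDA GPU;
# device_map="auto" lets accelerate decide where to place the weights.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, device_map="auto", load_in_8bit=True)

inputs = tokenizer("Summarize: FLAN models are instruction-tuned T5 variants.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0], skip_special_tokens=True))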
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
Is it possible to download the google/flan-t5-large model to a local path and use it from there? I tried that, but it gives an error.
download.py

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Specify the model name
model_name = "google/flan-t5-large"

# Load the model and tokenizer
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Save the model to a local path (raw string so the backslashes are not treated as escapes)
model.save_pretrained(r"D:\gpt4free\chat-pdf-hugginface\model")
tokenizer.save_pretrained(r"D:\gpt4free\chat-pdf-hugginface\model")
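Once that script has run, loading back from the local folder should just be a matter of pointing from_pretrained at the directory. A quick standalone check, using the same path as above:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

local_path = r"D:\gpt4free\chat-pdf-hugginface\model"
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModelForSeq2SeqLM.from_pretrained(local_path)

inputs = tokenizer("Answer the question: what colour is the sky?", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))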
main.py
from dotenv import load_dotenv
import os
import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers import __version__ as transformers_version

print(f"Transformers version: {transformers_version}")
def main():
    load_dotenv()
    st.set_page_config(page_title="Ask your PDF")
    st.header("Ask Your PDF")

    pdf = st.file_uploader("Upload your pdf", type="pdf")
    if pdf is not None:
        pdf_reader = PdfReader(pdf)
        text = ""
        for page in pdf_reader.pages:
            text += page.extract_text()

        # split into chunks
        text_splitter = CharacterTextSplitter(
            separator="\n",
            chunk_size=1000,
            chunk_overlap=200,
            length_function=len
        )
        chunks = text_splitter.split_text(text)

        # create embedding
        embeddings = HuggingFaceEmbeddings()
        knowledge_base = FAISS.from_texts(chunks, embeddings)

        user_question = st.text_input("Ask Question about your PDF:")
        if user_question:
            docs = knowledge_base.similarity_search(user_question)

            # Load the model using the Transformers library
            model_path = "D:/gpt4free/chat-pdf-hugginface/model"
            tokenizer = AutoTokenizer.from_pretrained(model_path)
            model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

            # chain = load_qa_chain(model, tokenizer)
            chain = load_qa_chain(model, tokenizer, chain_type="map_reduce")
            response = chain.run(input_documents=docs, question=user_question)
            st.write(response)

if __name__ == '__main__':
    main()
error:
TypeError: load_qa_chain() got multiple values for argument 'chain_type'
Traceback:
File "C:\Users\N_B\Miniconda3\envs\huggin\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 535, in _run_script
exec(code, module.__dict__)
File "D:\gpt4free\chat-pdf-hugginface\app.py", line 120, in