cdQA icon indicating copy to clipboard operation
cdQA copied to clipboard

NotFittedError: BM25Vectorizer - Vocabulary wasn't fitted.

Open fin-amal-joseph opened this issue 5 years ago • 3 comments

I 've got an issue when predicting adding sample code and error below

cdqa_pipeline = QAPipeline(reader='bert_models/bert_qa.joblib') cdqa_pipeline.fit_reader('bert_models/SQuAD_1.1/train-v1.1.json') cdqa_pipeline = QAPipeline(reader='bert_out.joblib') cdqa_pipeline.predict(query="Who is chaplin?")


NotFittedError Traceback (most recent call last) in () ----> 1 cdqa_pipeline.predict(query="Who is chaplin?")

/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_is_fitted(estimator, attributes, msg, all_or_any) 912 913 if not all_or_any([hasattr(estimator, attr) for attr in attributes]): --> 914 raise NotFittedError(msg % {'name': type(estimator).name}) 915 916

NotFittedError: BM25Vectorizer - Vocabulary wasn't fitted.

fin-amal-joseph avatar Dec 19 '19 09:12 fin-amal-joseph

I experience the same problem when I load my new dataset. Is this fix?

tianpaul01 avatar Jan 10 '20 01:01 tianpaul01

You have to fit the pipeline retriever to the dataframe with the documents before calling the predict() function:

cdqa_pipeline.fit_retriever(df=df)

fmikaelian avatar Jan 12 '20 12:01 fmikaelian

I had a similar problem because I thought cdqa_pipeline.dump_reader() would dump the trained retriever too.

File A

from cdqa.pipeline import QAPipeline

cdqa_pipeline = QAPipeline(reader='./models/distilbert_qa.joblib')
cdqa_pipeline.fit_retriever(df=df)

cdqa_pipeline.dump_reader('./models/distilbert_qa_fine_tuned.joblib')

File B

from cdqa.pipeline import QAPipeline

cdqa_pipeline = QAPipeline(reader='./models/distilbert_qa_fine_tuned.joblib')
cdqa_pipeline.fit_retriever(df=df)

# fix:
# cdqa_pipeline = QAPipeline(reader='./models/distilbert_qa.joblib')
# cdqa_pipeline = cdqa_pipeline.fit_retriever(df=df)

while True:
    q = input('Question: ')
    answer = cdqa_pipeline.predict(query=q)
    print(answer)

sftblw avatar Mar 16 '20 02:03 sftblw