question_generation
Answer-Aware Question Generation with Pretrained Model
Fantastic job!
I am wondering how to generate questions with the answer already given, using the pretrained model (either the prepend or the highlight format), i.e.,
nlp = pipeline("question-generation", model="valhalla/t5-small-qg-prepend", qg_format="prepend")
nlp("Two <sep> cans of chili with beans made by Chili Man.")
or
nlp = pipeline("question-generation", model="valhalla/t5-small-qg-hl", qg_format="highlight")
nlp("Two cans of chili with beans made by <hl> Chili Man <hl>.")
From my point of view, I need to disable the answer generation step, since I already have the answer from the sentence. I tried commenting out the line https://github.com/patil-suraj/question_generation/blob/bffa0a51e3ecba3922cafd13f424521135677303/pipelines.py#L51, but I am not sure whether that is reasonable. I would appreciate any advice.
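(Note: pipeline in the snippets above is the factory defined in this repo's pipelines.py, not the one in transformers, so they must be run from a clone of the repo. A minimal setup sketch, assuming the repo's requirements:

git clone https://github.com/patil-suraj/question_generation.git
cd question_generation
pip install -r requirements.txt
python -m nltk.downloader punkt

and then, following the README:

from pipelines import pipeline
nlp = pipeline("question-generation", model="valhalla/t5-small-qg-hl", qg_format="highlight")
print(nlp("42 is the answer to life, the universe and everything."))

As far as one can tell from pipelines.py, the pipeline runs its own answer extraction on plain input text rather than reading <sep> or <hl> markers from it, which is why the calls above do not behave as hoped.)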
@HenryJunW did you find a solution?
Good point! The thing is that sometimes the existing model is not able to generate a question for the answer I want, since the answer is normally extracted from a text span.
You can generate it this way:

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
# assuming the highlight-format checkpoint; the original comment did not show the loading step
tokenizer = T5Tokenizer.from_pretrained("valhalla/t5-small-qg-hl")
t5_model = T5ForConditionalGeneration.from_pretrained("valhalla/t5-small-qg-hl").to(device)

# wrap the desired answer span in <hl> tokens
gen_tok = ["generate question: <hl> Python <hl> is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace."]
tokenized = tokenizer(gen_tok, return_tensors="pt")
input_ids, attention_mask = tokenized["input_ids"], tokenized["attention_mask"]
pred = t5_model.generate(
    input_ids=input_ids.to(device),
    attention_mask=attention_mask.to(device),
    max_length=32,
    num_beams=4,
)
print(tokenizer.decode(pred[0], skip_special_tokens=True))
The output was: What is an interpreted, high-level, general-purpose programming language?
The highlighted portion, between the <hl> tokens, is the answer.
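For the prepend-style checkpoint the same trick should work. Here is a sketch, assuming the "answer: ... context: ..." input format that pipelines.py builds for qg_format="prepend" (the exact prefix is an assumption; check _prepare_inputs_for_qg_from_answers_prepend in the repo):

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = T5Tokenizer.from_pretrained("valhalla/t5-small-qg-prepend")
model = T5ForConditionalGeneration.from_pretrained("valhalla/t5-small-qg-prepend").to(device)

# assumed prepend format: the desired answer first, then the context
gen_tok = ["answer: Python context: Python is an interpreted, high-level, general-purpose programming language."]
tokenized = tokenizer(gen_tok, return_tensors="pt")
pred = model.generate(
    input_ids=tokenized["input_ids"].to(device),
    attention_mask=tokenized["attention_mask"].to(device),
    max_length=32,
    num_beams=4,
)
print(tokenizer.decode(pred[0], skip_special_tokens=True))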