ConvLab-2

RULEDST evaluation

Open IreneSucameli opened this issue 2 years ago • 9 comments

Hi, could you please provide more information on how the Rule DST module is evaluated? Thanks

IreneSucameli avatar Mar 01 '22 09:03 IreneSucameli

We did not evaluate the rule DST on its own, since it needs dialog acts as input. If you want to compare the rule DST with other DST models, you can use the golden dialog acts as input, or use an NLU model such as BERTNLU to parse both user and system acts.

zqwerty avatar Mar 10 '22 06:03 zqwerty

I would like to use the output of BERTNLU as the input for the DST; however, it is not clear to me how to pass the data from one module to the other, and so far I haven't found any code for that in ConvLab.

Could you kindly link the ConvLab page where this is described, or give me more information about this process?

IreneSucameli avatar Mar 21 '22 09:03 IreneSucameli

You can refer to the Colab tutorial or the interface classes for NLU and DST. See PipelineAgent for how to build an agent from modules. Example usage: https://github.com/thu-coai/ConvLab-2/blob/master/tests/test_BERTNLU-RuleDST-RulePolicy-TemplateNLG.py

zqwerty avatar Mar 21 '22 12:03 zqwerty

Thank you for the info. However, the Colab tutorial covers an overall evaluation (NLU + DST + NLG). What if I want to evaluate NLU + DST only, in order to check whether the defined rules are fine or need improvement? Is that possible? Thanks again

IreneSucameli avatar Mar 21 '22 14:03 IreneSucameli

Sure. Just feed the output of NLU to DST:

https://github.com/thu-coai/ConvLab-2/blob/ad32b76022fa29cbc2f24cbefbb855b60492985e/convlab2/dialog_agent/agent.py#L122-L132
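As a rough illustration of feeding NLU output to DST, here is a minimal, self-contained sketch. `ToyNLU` and `ToyRuleDST` are hypothetical stand-ins written for this example, not ConvLab-2 classes; the method names `predict(utterance, context)`, `init_session()`, and `update(user_act)`, and the `[intent, domain, slot, value]` act format, are assumptions about the shape of the interfaces and should be checked against the linked code.

```python
from copy import deepcopy


class ToyNLU:
    """Hypothetical stand-in for an NLU module (not the real BERTNLU)."""

    def predict(self, utterance, context=None):
        # Return dialog acts as [intent, domain, slot, value] lists (assumed format).
        acts = []
        text = utterance.lower()
        if "cheap" in text:
            acts.append(["Inform", "Hotel", "Price", "cheap"])
        if "north" in text:
            acts.append(["Inform", "Hotel", "Area", "north"])
        return acts


class ToyRuleDST:
    """Hypothetical stand-in for a rule-based tracker (not the real RuleDST)."""

    def __init__(self):
        self.init_session()

    def init_session(self):
        # Reset the tracker at the start of each dialog.
        self.state = {"user_action": [], "belief_state": {}}

    def update(self, user_act):
        # Single rule: every Inform act writes its value into the belief state.
        for intent, domain, slot, value in user_act:
            if intent == "Inform":
                domain_state = self.state["belief_state"].setdefault(domain.lower(), {})
                domain_state[slot.lower()] = value
        return self.state


nlu = ToyNLU()
dst = ToyRuleDST()
history = []

for utterance in ["I need a cheap hotel", "It should be in the north"]:
    user_act = nlu.predict(utterance, context=history)  # NLU: text -> dialog acts
    dst.state["user_action"] = deepcopy(user_act)        # record the parsed acts
    state = dst.update(user_act)                         # DST: dialog acts -> state
    history.append(utterance)

print(state["belief_state"])
```

With real modules, the two toy classes would be replaced by the NLU and DST objects used in the linked test file; the loop itself would stay the same under the assumptions above.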

zqwerty avatar Mar 21 '22 14:03 zqwerty

From the code you posted it doesn't seem that the module is evaluated with F1 scores or a similar measure... perhaps I don't understand your point...

IreneSucameli avatar Mar 22 '22 14:03 IreneSucameli

Sorry, I thought you needed instructions on how to pass the output of NLU to DST. If you want to evaluate NLU+DST, you can write a script to: 1) read the original data; 2) pass utterances to NLU to get the user dialog acts; 3) pass the user dialog acts to RuleDST to get the predicted state; 4) compare the predictions with the references.
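The four steps could be sketched roughly as follows. This is only an illustration under stated assumptions: the data layout (a list of dialogs, each a list of turns with `utterance` and `state` keys), the `toy_track_dialog` function standing in for NLU+RuleDST, and the two metrics (joint accuracy per turn and slot-level F1) are all hypothetical choices made for this example, not ConvLab-2's own evaluation code.

```python
from copy import deepcopy


def flatten(belief_state):
    """Turn {domain: {slot: value}} into {(domain, slot): value}, skipping empty values."""
    return {
        (domain, slot): value
        for domain, slots in belief_state.items()
        for slot, value in slots.items()
        if value
    }


def evaluate(dialogs, track_dialog):
    """Compare predicted states with reference states, turn by turn."""
    joint_hits = turns = 0
    tp = fp = fn = 0
    for dialog in dialogs:
        utterances = [turn["utterance"] for turn in dialog]   # step 1: read the data
        predictions = track_dialog(utterances)                # steps 2-3: NLU, then DST
        for turn, predicted in zip(dialog, predictions):      # step 4: compare
            pred_items = set(flatten(predicted).items())
            ref_items = set(flatten(turn["state"]).items())
            joint_hits += pred_items == ref_items
            turns += 1
            tp += len(pred_items & ref_items)
            fp += len(pred_items - ref_items)
            fn += len(ref_items - pred_items)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"joint_acc": joint_hits / turns if turns else 0.0, "slot_f1": f1}


def toy_track_dialog(utterances):
    """Hypothetical NLU+DST stand-in that only recognizes the word 'cheap'."""
    state, states = {}, []
    for utterance in utterances:
        if "cheap" in utterance.lower():
            state.setdefault("hotel", {})["price"] = "cheap"
        states.append(deepcopy(state))  # snapshot the state after each turn
    return states


# Hypothetical reference data: one dialog with two user turns.
dialogs = [[
    {"utterance": "I need a cheap hotel",
     "state": {"hotel": {"price": "cheap"}}},
    {"utterance": "It should be in the north",
     "state": {"hotel": {"price": "cheap", "area": "north"}}},
]]

scores = evaluate(dialogs, toy_track_dialog)
print(scores)
```

Here the toy tracker misses the area slot in the second turn, so only one of the two turns matches the reference exactly. Swapping `toy_track_dialog` for a function that runs the real NLU and RuleDST over each dialog would give the NLU+DST numbers, provided the predicted and reference states are first brought into the same format.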

zqwerty avatar Mar 23 '22 09:03 zqwerty

Refer to https://github.com/thu-coai/ConvLab-2/blob/master/convlab2/dst/evaluate.py for the DST metrics.

zqwerty avatar Mar 23 '22 09:03 zqwerty

OK, thanks, I'll try it this way!

IreneSucameli avatar Mar 23 '22 14:03 IreneSucameli