genie-toolkit Add evaluation metric for accuracy except device IDs

Open gcampax opened this issue 2 years ago • 0 comments

As discussed in the meeting with Kevin: for most skills (practically, all but IoT ones), device IDs don't need to be correct at parse time, because they can be added automatically in postprocessing.

We still want to count them in "exact match accuracy" because they are part of the target program. Pre-normalization, token-by-token exact match accuracy should remain the target metric because of how seq2seq works.

To account for this, and to remove some less relevant errors from the error analysis of devices, we should introduce a new partial accuracy metric, "ok_without_device_id". It would be implemented in SentenceEvaluator and in the associated command-line code. We can implement it either with token manipulation (recognizing the token sequence "id = GENERIC_ENTITY_*" and removing it) or with a proper NodeVisitor that visits all DeviceSelectors and sets the id to null.
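
A rough sketch of the token-manipulation variant, to make the idea concrete. The names `stripDeviceIds` and `okWithoutDeviceId` are hypothetical and not existing genie-toolkit functions, and the token layout in the example (including how commas around the `id` attribute are handled) is an assumption about the tokenized program, not something taken from the code base.

```typescript
// Hypothetical helper, not part of genie-toolkit: drop every
// "id = GENERIC_ENTITY_*" token triple from a tokenized program.
function stripDeviceIds(tokens: readonly string[]): string[] {
    const out: string[] = [];
    let i = 0;
    while (i < tokens.length) {
        const value = tokens[i + 2];
        const isDeviceId =
            tokens[i] === 'id' &&
            tokens[i + 1] === '=' &&
            value !== undefined &&
            value.startsWith('GENERIC_ENTITY_');
        if (!isDeviceId) {
            out.push(tokens[i]);
            i += 1;
            continue;
        }
        i += 3; // skip "id", "=", and the entity token
        // Assumption: attributes are comma-separated, so also drop one
        // adjacent comma to keep the remaining attribute list well-formed.
        if (tokens[i] === ',')
            i += 1;
        else if (out[out.length - 1] === ',')
            out.pop();
    }
    return out;
}

// Hypothetical metric: token-by-token exact match after device IDs are removed.
function okWithoutDeviceId(predicted: readonly string[], target: readonly string[]): boolean {
    return stripDeviceIds(predicted).join(' ') === stripDeviceIds(target).join(' ');
}

// Illustrative token sequences; the exact syntax is assumed.
const target = '@com.twitter ( id = GENERIC_ENTITY_tt:device_id_0 ) . post ( status = QUOTED_STRING_0 ) ;'.split(' ');
const predicted = '@com.twitter ( id = GENERIC_ENTITY_tt:device_id_1 ) . post ( status = QUOTED_STRING_0 ) ;'.split(' ');
console.log(okWithoutDeviceId(predicted, target));
```

Under this sketch, a prediction that omits the device selector entirely would still differ from the target, because the stripped target keeps the now-empty parentheses. The NodeVisitor variant would avoid that kind of syntactic residue, since it works on the parsed program rather than on tokens.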

Aug 26 '21 09:08 gcampax
