Sem-Dialogue
Data preprocessing
Hi, I am trying to reproduce the preprocessing pipeline in #1, and I have a few questions:
- For neural coreference, what settings were used? Were the speaker names (speaker1, speaker2) included in the text? Using this demo, I seem to get a different result from the one in the paper: https://huggingface.co/coref/?text=Could%20I%20have%20my%20bill%2C%20please%3F%20Certainly%2C%20sir.%20I%E2%80%99m%20afraid%20there%20has%20been%20a%20mistake.What%20could%20it%20be%3F
- For step 6, how were the edges added? Were they added directly to the AMR, or encoded in some other way?
- For the AMR simplifier, what's the difference between the file posted in the link and the neural-amr repo (https://github.com/sinantie/NeuralAmr)? Thanks!
Hi,
Thanks for your interest. For coreference, we prepend the speaker name to each utterance before applying the neural coreference tool. To ensure quality, we set the greedyness hyper-parameter to 0.3 (you may need to try other values on other datasets). For the second question, we apply the edges directly to the AMR graph. For the third question, the tool we provide is adapted from neural-amr, with some example scripts for AMR simplification. You can use the original neural-amr if you are familiar with it.
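In case it helps anyone else reproducing this, here is a rough sketch of the coreference step as described above. It is not the authors' script: the "speaker: utterance" prefix format, the helper names prepend_speakers and resolve_coreference, and the en_core_web_sm model are my assumptions. Only the greedyness value of 0.3 comes from the reply. The second function needs spaCy 2.x and neuralcoref installed.

```python
from typing import List, Tuple


def prepend_speakers(turns: List[Tuple[str, str]]) -> str:
    """Join dialogue turns into one string, prefixing each utterance with its speaker.

    The "speaker: utterance" format is an assumption; the reply only says the
    speaker name goes at the beginning of each utterance.
    """
    return " ".join(f"{speaker}: {utterance.strip()}" for speaker, utterance in turns)


def resolve_coreference(text: str, greedyness: float = 0.3):
    """Run neuralcoref over the text and return (main mention, all mentions) per cluster.

    Imports are local so the helper above works without spaCy/neuralcoref installed.
    The model name is an assumption; any English spaCy 2.x model should do.
    """
    import spacy
    import neuralcoref

    nlp = spacy.load("en_core_web_sm")
    neuralcoref.add_to_pipe(nlp, greedyness=greedyness)
    doc = nlp(text)
    return [
        (cluster.main.text, [mention.text for mention in cluster.mentions])
        for cluster in doc._.coref_clusters
    ]


turns = [
    ("speaker1", "Could I have my bill, please?"),
    ("speaker2", "Certainly, sir."),
    ("speaker1", "I'm afraid there has been a mistake."),
    ("speaker2", "What could it be?"),
]
text = prepend_speakers(turns)
print(text)
# clusters = resolve_coreference(text)  # uncomment once neuralcoref is installed
```

Since the reply notes that 0.3 may not suit other datasets, it is probably worth calling resolve_coreference with a few different greedyness values and inspecting the clusters by hand.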