Trailing curly brace being stripped from slot value
I have some values in slots that are surrounded by curly braces and are meant to be returned as is. Instead, the trailing brace is being stripped. "${website}" becomes "${website". I have training examples where the whole "${website}" is included. Is there a way to change this behavior?
@Shotgun167, this is indeed a limitation of the current tokenization, which strips some punctuation. However, the full "${website}" value should still be available in the resolved value field:
{
  "input": "go to ${website}",
  "intent": {
    "intentName": "go_to_url",
    "probability": 1.0
  },
  "slots": [
    {
      "entity": "url",
      "range": {
        "end": 15,
        "start": 6
      },
      "rawValue": "${website",  # TRUNCATED VALUE HERE
      "slotName": "url",
      "value": {
        "kind": "Custom",
        "value": "${website}"  # FULL VALUE HERE
      }
    }
  ]
}
The mid-term plan is to have a tokenizer component that can be customized through the NLU configuration.
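In the meantime, reading the resolved value instead of rawValue might look like the sketch below. It assumes the parse result is a plain dict shaped like the output above; the helper name resolved_slot_values is made up for illustration and is not part of the library.

```python
def resolved_slot_values(parse_result):
    """Map each slot name to its resolved value, falling back to rawValue."""
    values = {}
    for slot in parse_result.get("slots") or []:
        resolved = slot.get("value") or {}
        # Prefer the resolved value; rawValue may have lost trailing punctuation
        values[slot["slotName"]] = resolved.get("value", slot.get("rawValue"))
    return values


# Trimmed copy of the parse result shown above
result = {
    "input": "go to ${website}",
    "slots": [
        {
            "entity": "url",
            "range": {"start": 6, "end": 15},
            "rawValue": "${website",
            "slotName": "url",
            "value": {"kind": "Custom", "value": "${website}"},
        }
    ],
}

print(resolved_slot_values(result))  # {'url': '${website}'}
```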
I am working around it right now. I substitute in a crazy string for the trailing punctuation before parsing, and then swap it back out of the response. This is a nasty, ugly hack that makes code reviewers cry. So, I look forward to the customizable parser.
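For anyone who needs the same stopgap, a rough sketch of that kind of substitution follows. Everything in it is an assumption rather than the actual code in use: the placeholder string, the helper names, and the idea that parse is any callable that takes the input text and returns a dict (for example, the parse method of your NLU engine). Whether the slot is still recognized with the placeholder in place depends on your training data.

```python
import re

# Hypothetical marker; pick something that never occurs in real input
PLACEHOLDER = "zzRBRACEzz"

# Matches "${name}" and captures everything except the closing brace
_TEMPLATE = re.compile(r"(\$\{[^{}\s]+)\}")


def protect(text):
    """Swap the closing brace of each ${...} for the placeholder."""
    return _TEMPLATE.sub(lambda m: m.group(1) + PLACEHOLDER, text)


def restore(obj):
    """Recursively put the closing brace back in every string of the response."""
    if isinstance(obj, str):
        return obj.replace(PLACEHOLDER, "}")
    if isinstance(obj, list):
        return [restore(item) for item in obj]
    if isinstance(obj, dict):
        return {key: restore(value) for key, value in obj.items()}
    return obj


def parse_with_workaround(parse, text):
    # Note: "range" offsets in the response refer to the substituted text,
    # not the original input, so they are not corrected here.
    return restore(parse(protect(text)))


# Stand-in parser for demonstration only: treats the third word as the slot
def fake_parse(text):
    raw = text.split(" ", 2)[2]
    return {"input": text, "slots": [{"rawValue": raw, "slotName": "url"}]}


print(parse_with_workaround(fake_parse, "go to ${website}"))
```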
Is it possible to guesstimate a timeframe? And, yes, I do have time to offer help, though I will not swear that I have the relevant expertise.
It is not prioritized yet so I can't give you a good ETA, but I think this could be done within the next 3 months.