natural
Bug report in SequenceTokenizerNew
SequenceTokenizerNew fails on the following call:
sentenceTokenizer.tokenize('"All ticketed passengers should now be in the Blue Concourse sleep lounge. Make sure your validation papers are in order. Thank you". The upstairs lounge was not at all grungy.')
(The input is a quote from "The Jaunt" by Stephen King.)
The call fails with the following message:
{
  "message": "Expected [ \\t\\n\\r.?!] or [)\\]}\"'`’] but \"M\" found.",
  "expected": [
    {
      "type": "class",
      "parts": [
        " ",
        "\t",
        "\n",
        "\r",
        ".",
        "?",
        "!"
      ],
      "inverted": false,
      "ignoreCase": false
    },
    {
      "type": "class",
      "parts": [
        ")",
        "]",
        "}",
        "\"",
        "'",
        "`",
        "’"
      ],
      "inverted": false,
      "ignoreCase": false
    }
  ],
  "found": "M",
  "location": {
    "start": {
      "offset": 75,
      "line": 1,
      "column": 76
    },
    "end": {
      "offset": 76,
      "line": 1,
      "column": 77
    }
  },
  "name": "SyntaxError"
}
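To see where in the input the parser gave up, the offsets from the "location" field can be mapped back onto the string. The snippet below is only a sketch for inspecting the report: it does not call natural at all, and describeLocation is a hypothetical helper written for this illustration, not part of the library.

```javascript
// Input string copied verbatim from the failing call in this report.
const input = '"All ticketed passengers should now be in the Blue Concourse sleep lounge. Make sure your validation papers are in order. Thank you". The upstairs lounge was not at all grungy.';

// Offsets copied from the "location" field of the error message.
const start = 75;
const end = 76;

// Hypothetical helper (not part of natural): returns the text covered by
// the reported range plus a little context on either side.
function describeLocation(text, from, to, context = 20) {
  return {
    found: text.slice(from, to),
    before: text.slice(Math.max(0, from - context), from),
    after: text.slice(to, to + context),
  };
}

const result = describeLocation(input, start, end);
console.log(result);
```

The character at the reported offset is the "M" of "Make", which is the first letter of the second sentence inside the quoted passage, so the failure occurs right after the first sentence-ending period within the quotation.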