annodoc
Tokenize SD input?
The SD parser currently only separates tokens by whitespace, so that e.g. the last token of
~~~ sdparse
foo bar.
dep(foo, bar)
~~~
is `bar.`, making the above break as the system can't find the token `bar` (without terminal dot). This appears to be a common source of error in manually entered SD analyses.
The possibility of applying e.g. PTB-like tokenization to the input should at least be considered.
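
To illustrate the difference, here is a minimal sketch in Python of what splitting punctuation off the text line could look like. The function name `tokenize_sd_text` and the regular expression are hypothetical and not part of annodoc. The result is PTB-like in spirit only: it separates punctuation from words but does not implement the full PTB conventions (e.g. clitic splitting).

```python
import re

# Hypothetical pattern (an assumption, not annodoc's behaviour):
# a token is either a word, optionally with internal hyphens, apostrophes
# or dots (e.g. "well-known"), or a single punctuation character.
_TOKEN_RE = re.compile(r"\w+(?:[-'.]\w+)*|[^\w\s]")


def tokenize_sd_text(line):
    """Split a text line into tokens, separating punctuation from words."""
    return _TOKEN_RE.findall(line)


# Whitespace-only splitting keeps the terminal dot attached to the last token.
whitespace_tokens = "foo bar.".split()

# Punctuation-aware splitting yields "bar" as its own token,
# so dep(foo, bar) could be resolved.
ptb_like_tokens = tokenize_sd_text("foo bar.")
```

With this kind of splitting, `dep(foo, bar)` would refer to an existing token, while the terminal dot would become a separate token of its own.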