preprocess
Truecaser not identical to Perl script
On input `->` the Moses truecase script outputs `- >`, but the C++ outputs `->`. The additional space seems to appear regardless of what is before `>`.
But the tokenizer is supposed to change those to `&lt;` and `&gt;`, so it probably doesn't matter. (XML support is out of scope for the C++ version.)
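
For illustration, here is a minimal sketch of the kind of escaping the tokenizer is expected to apply before truecasing. It is not the Moses code: `escape_angle_brackets` is a hypothetical helper, and it covers only `&`, `<` and `>`, on the assumption that these are the characters relevant to this report.

```python
def escape_angle_brackets(text: str) -> str:
    """Replace &, < and > with XML entities.

    Hypothetical helper for illustration only; not taken from Moses.
    """
    # Escape "&" first so the entities produced below are not escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text


if __name__ == "__main__":
    # After escaping, no bare ">" is left for the truecaser to treat specially.
    print(escape_angle_brackets("->"))
```

If the input has already been through this kind of escaping, the truecaser never sees a literal `>`, which would be why the difference should not show up in a normal pipeline.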