Lukas Schmelzeisen
As it turns out, we sometimes return individual probabilities greater than one, which is obviously garbage. I did some calculations by hand and the error does not seem to be...
NGramTimes for sequences like `000` or `xxxx` are always zero. Why?
In the `bachelor-thesis` branch, the estimator `INTERPOL_ABS_DISCOUNT_MLE_SKP_NOREC` fails the `EstimatorTest` because `Estimator#getRequiredCache()` does not always return the right results, most likely due to unlucky randomness.
Currently `ProbMode` is hardcoded to `ProbMode.MARG` in `GlmtkExecutable`. This should be configurable via the command-line API. However, currently only `ProbMode.MARG` returns results that pass the tests, so this only makes sense once Estimators...
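A minimal sketch of how a command-line string could be mapped onto an enum constant such as `ProbMode`. The class and method names (`EnumOptionSketch`, `parseEnumOption`) are hypothetical and not part of the code base; only the standard `Enum.valueOf` call is used, so no assumption is made about which constants `ProbMode` defines.

```java
import java.util.Locale;

final class EnumOptionSketch {
    /**
     * Maps a command-line value onto a constant of the given enum type,
     * ignoring case and surrounding whitespace.
     * Hypothetical helper, not an existing method of the project.
     */
    static <E extends Enum<E>> E parseEnumOption(Class<E> type, String value) {
        try {
            return Enum.valueOf(type, value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Re-throw with a message that names the offending value and the enum.
            throw new IllegalArgumentException(
                "Unknown value '" + value + "' for " + type.getSimpleName(), e);
        }
    }
}
```

With the project's enum this would be called as `parseEnumOption(ProbMode.class, arg)`. What the option is called on the command line is left open here.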
You told me that `FalsemaximumLikelihood` estimation (namely computing `c(a b c)/c(a b)` instead of `c(a b c)/c(a b _)`) should work in the marginal case. It doesn't, though. We should research why.
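To make the difference between the two quotients concrete, here is a small self-contained sketch on a toy corpus. The class `MleSketch` and its methods are hypothetical illustrations, not the project's estimator classes, and `_` is treated as a wildcard for exactly one token.

```java
import java.util.Arrays;
import java.util.List;

final class MleSketch {
    /** Counts occurrences of pattern in tokens; "_" matches any single token. */
    static long count(List<String> tokens, List<String> pattern) {
        long hits = 0;
        for (int i = 0; i + pattern.size() <= tokens.size(); i++) {
            boolean match = true;
            for (int j = 0; j < pattern.size(); j++) {
                String p = pattern.get(j);
                if (!p.equals("_") && !p.equals(tokens.get(i + j))) {
                    match = false;
                    break;
                }
            }
            if (match) {
                hits++;
            }
        }
        return hits;
    }

    /** c(a b c) / c(a b): the history is counted even where nothing follows it. */
    static double falseMle(List<String> tokens, String a, String b, String c) {
        return (double) count(tokens, Arrays.asList(a, b, c))
            / count(tokens, Arrays.asList(a, b));
    }

    /** c(a b c) / c(a b _): the history is counted only where a token follows it. */
    static double mle(List<String> tokens, String a, String b, String c) {
        return (double) count(tokens, Arrays.asList(a, b, c))
            / count(tokens, Arrays.asList(a, b, "_"));
    }
}
```

On the corpus `a b c a b d a b`, `c(a b c) = 1`, `c(a b) = 3`, and `c(a b _) = 2`, because the final `a b` has no successor. So `falseMle` yields 1/3 for `c` given `a b`, while `mle` yields 1/2. Summed over the continuations `c` and `d`, the first quotient gives 2/3 and the second gives 1. Whether this end-of-corpus effect is related to the failure in the marginal case is not established here.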
To save memory and have cleaner code.
Currently the reserved symbols are `_` (absolute skip), `%` (continuation skip), and `/` (token-pos-separator). IIRC the program fails if any of these occur in training or querying files. How do we...
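One possible building block, sketched under the assumption that tokens are available as plain strings: a check that reports whether a token contains any of the three reserved symbols, so that input can be rejected or escaped before counting. `ReservedSymbolCheck` and `containsReserved` are hypothetical names, not existing project code.

```java
final class ReservedSymbolCheck {
    /** The three reserved symbols listed above: "_", "%", and "/". */
    static final String RESERVED = "_%/";

    /** Returns true if the token contains at least one reserved symbol. */
    static boolean containsReserved(String token) {
        for (int i = 0; i < token.length(); i++) {
            if (RESERVED.indexOf(token.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

For example, `containsReserved("50%")` and `containsReserved("and/or")` return `true`, while `containsReserved("hello")` returns `false`. What to do with an offending token (fail early, escape, or replace) is a separate design decision.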