Niko Partanen

20 comments by Niko Partanen

Thanks a lot, @oadams, this helped us tremendously! We have now got very good results with several languages, and still have a few mystery cases where things don't work, but the...

Hi again! Now I do have a very urgent question. @oadams, how can we easily get the test set LER? We get validation LER during training, of course, but what is...
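
For readers unfamiliar with the metric: a label error rate (LER) is commonly computed as the total edit distance between reference and predicted label sequences, divided by the total number of reference labels. The sketch below assumes that corpus-level definition and is not the project's own evaluation code; the function names `edit_distance` and `label_error_rate` are hypothetical, and a toolkit may instead average the rate per utterance.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference label
                            curr[j - 1] + 1,      # insert a hypothesis label
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def label_error_rate(refs: Sequence[Sequence[str]],
                     hyps: Sequence[Sequence[str]]) -> float:
    """Corpus-level LER: total edits divided by total reference labels."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_labels = sum(len(r) for r in refs)
    if total_labels == 0:
        raise ValueError("reference set contains no labels")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_labels
```

Running this over held-out test utterances, with the model's decoded output as `hyps`, would give a test set figure comparable to a validation LER computed the same way.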

Great, I'll test it right away! Thanks a lot! We just had a deadline, but we did find a solution to report some scores that made sense. Now I'll do...

Hi! Thanks for the reply! I see, I'll keep following the project and mention it to colleagues in Helsinki who work with similar topics. We have quite a few books that should...

I got hocr-proofreader to display my files very nicely, and I'll still experiment with it quite a bit. Great work! The bounding box problems seem common to all editors, but...

I can comment that as far as I know the UD corpus should be manually corrected. I think it was converted to the UD format from something else, in which...

Thank you @adbar! We made a small test file for Northern Sámi. The accuracy is around 75%, although the text also had some Finnish words and names. The file is...

I added a version here that contains the Simplemma predictions in the third row, so it is easier to measure the accuracy and evaluate the current result. https://gist.github.com/nikopartanen/b32f17a6e85dd8ebd02ad24968783a21
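
Measuring accuracy over a file like this could look roughly as follows. This is a minimal sketch that assumes each line holds three tab-separated fields (token, gold lemma, predicted lemma); the actual layout of the gist may differ, and the function name `lemma_accuracy` is hypothetical.

```python
def lemma_accuracy(tsv_text: str) -> float:
    """Share of lines whose predicted lemma equals the gold lemma.

    Assumed layout per line: token<TAB>gold lemma<TAB>predicted lemma.
    Lines with fewer than three fields are skipped.
    """
    total = 0
    correct = 0
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        gold, predicted = fields[1].strip(), fields[2].strip()
        total += 1
        if gold == predicted:
            correct += 1
    if total == 0:
        raise ValueError("no scorable lines found")
    return correct / total
```

Comparing exact strings is the strictest reading; a case-insensitive comparison would be a reasonable variant if capitalization of names should not count as an error.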

I think the current behaviour is about as good as we can get with the current materials. If there are more lemmatized materials somewhere, then training the system with extended...

Certainly! This one breaks because of the North Sámi label: http://www.yso.fi/onto/yso/p3448 This breaks due to the label that marks outdated usage: http://www.yso.fi/onto/yso/p17004 These are two major categories for which we...