Osma Suominen

374 comments by Osma Suominen

Hi @thomaslow, thank you for the issue report. You're right that the tfidf backend builds a model with all the subjects, even those not referenced in training data. The...

Thanks, I understand. Good to hear that you're also experimenting with other backends. I recommend taking a close look at Omikuji, since at least for us it has consistently achieved...

Thanks @thomaslow! There is a method for cross-validation in the [Maui Server REST API](https://github.com/TopQuadrant/MauiServer/blob/master/API.md#resource-tagger-cross-validation):

> URL pattern: /{tagger-id}/xvalidate
>
> This works similar to training, but instead of training and...

> I'm not sure how Maui Server splits the data - is it done intelligently, trying to ensure that rare labels are evenly split, or just randomly.

Responding to myself:...

Thank you for the references to interesting papers, @mfakaehler! Currently the splitting of data sets is always performed outside Annif. I think it could be useful to provide...

We spent some time debugging this with @miguelahonen. The problem seems to be in jena-text - it doesn't properly support languages with subtags. For the alphabetical index, with content...

> it shows a label for a datatype only in the default language of your configuration.

More specifically, the code looks for a label in the current content language, but...

Makes sense. It's an accident of history that the vocabulary page has a different mechanism than the concept page.

This PR is a bit old; I think it would be best to merge the changes from `master` and possibly rebase as well.

> I couldn't load AGROVOC to test reified properties. @osma would you have another smaller example dataset that I can load to test it, please?

Can you use the snippet...