entity-fishing
Support for Swedish language
Hi!
In the list of languages I don't see Swedish. It's a small language, but it has a very big Wikipedia with ~2.5M articles. Can entity-fishing be trained on Swedish, or is there some deeper reason that it's not included?
Hi @EmilStenstrom !
Thank you for the request. Swedish should indeed work well given the size of its Wikipedia. I think it's the largest one not yet supported by entity-fishing, along with Dutch. I will try to include it in the next batch of supported languages.
That sounds awesome! Looking forward to testing it! :)
Nice! Happy to see it disambiguate Swedish. Looking at that specific example, the things it mentions are not entities, but they are “concepts”. Translated: “year”, “consumption”, “health”. Is that intentional?
Yes, that's the goal: every Wikidata entity is disambiguated, based on the Wikipedia anchors. Wikidata calls both the concepts and their instances "entities". We can then refine based on the P279 (subclass of) and P31 (instance of) statements to select what's wanted for a given task or application. One more example:
Awesome! Using Wikidata statements to select what you want is super powerful. Eager to try this out when 0.0.6 is released! :)
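For anyone reading along, the statement-based filtering described above could look roughly like the sketch below. It is not entity-fishing's API: the data shape (entity ID mapped to property ID mapped to value IDs), the `TOY_ENTITIES` table, and both helper functions are hypothetical, and the statement values are hand-written for illustration rather than fetched from Wikidata. Only the meaning of P31 (instance of) and P279 (subclass of) is taken from Wikidata itself.

```python
from typing import Dict, List, Set

# Hypothetical shape: property ID -> list of value IDs for one entity.
Statements = Dict[str, List[str]]

# Hand-written toy data, NOT fetched from Wikidata or entity-fishing.
TOY_ENTITIES: Dict[str, Statements] = {
    "Q34": {"P31": ["Q6256"]},        # Sweden, instance of: country
    "Q1754": {"P31": ["Q515"]},       # Stockholm, instance of: city
    "Q577": {"P279": ["Q1790144"]},   # year, illustrative subclass-of value
}


def instances_of(entities: Dict[str, Statements], wanted: Set[str]) -> List[str]:
    """Keep entities whose P31 (instance of) values overlap `wanted`."""
    return [
        entity_id
        for entity_id, statements in entities.items()
        if wanted.intersection(statements.get("P31", []))
    ]


def concepts_only(entities: Dict[str, Statements]) -> List[str]:
    """Keep entities that have P279 (subclass of) but no P31 statement."""
    return [
        entity_id
        for entity_id, statements in entities.items()
        if statements.get("P279") and not statements.get("P31")
    ]


if __name__ == "__main__":
    print(instances_of(TOY_ENTITIES, {"Q6256", "Q515"}))
    print(concepts_only(TOY_ENTITIES))
```

With a filter like this, an application interested only in places would keep Sweden and Stockholm, while one interested in general concepts would keep "year" and drop the rest.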