Osma Suominen

374 comments by Osma Suominen

Reported the huge memory usage in Simplemma as https://github.com/adbar/simplemma/issues/19

> @osma My library is slower because it is written in pure Python. pycld3 is written in C++ and simplemma uses [mypyc](https://github.com/mypyc/mypyc) to compile the Python modules to C extensions....

I realized that I can just run the Omikuji evaluation part again with Lingua 1.1.3, without redoing the whole benchmark. Hang on...

@pemistahl I upgraded to Lingua 1.1.3 and reran the Omikuji and MLLM evaluations. The Omikuji evaluation runtime decreased from 935 to 856 seconds and the MLLM runtime from 1210 to...

I finished the (partial) benchmark of Lingua in high-accuracy mode and edited the results table above accordingly. The runtime was at least an order of magnitude larger than in low-accuracy...

Thanks for the tip @adbar, I wasn't aware of hyperfine. Though it seems to me it will only measure execution time, not memory usage.
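As a rough illustration of measuring both at once, here is a minimal sketch using only the Python standard library. The `measure` helper and the `build_list` workload are hypothetical names made up for this example, not part of any benchmark discussed here. Note that `tracemalloc` only sees allocations made through Python's own allocator, so memory allocated directly by native code in compiled extensions may not be counted.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes).

    Hypothetical helper: peak_bytes is the peak size of memory blocks
    traced by tracemalloc while func was running.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def build_list(n):
    # Hypothetical workload standing in for a language detection call.
    return list(range(n))


if __name__ == "__main__":
    _, seconds, peak = measure(build_list, 100_000)
    print(f"elapsed: {seconds:.4f} s, peak traced memory: {peak / 1024:.0f} KiB")
```

Tracing allocations slows the workload down, so the timing from a run like this is best treated as indicative only; a separate untraced run gives a cleaner runtime figure.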

> This is absolutely reasonable. Then Lingua is simply not the right tool for your job. That's ok. Luckily, there are enough language detectors to choose from, especially in the...

Interesting idea. I'm a bit torn on this. I've never thought of the Annif web UI as a separate project, more of an administrative interface for testing models. Very similar...

Thank you for the suggestion. It shouldn't be too hard to implement, but I wonder how this would be triggered. It should happen through the REST API, I think, because...

After switching to sparse vectors (#379) in the PAV backend the RAM usage is now much lower, so this is not so crucial anymore.