pypi-support
File Limit Request: lima-python - 450 MB
Project URL
https://pypi.org/project/aymara
Does this project already exist?
- [X] Yes
New Limit
450
Update issue title
- [X] I have updated the title.
Which indexes
PyPI
About the project
The LIMA linguistic analyzer was started in 2002 and has been free software since 2014. It was originally rule-based and now also includes deep learning-based modules. It is developed in C++. Its Python binding, this project, was initially just a wrapper executing the native C++ executable and was later changed to use a Docker container. This new version is a real Python binding based on manylinux_2_28. But as LIMA depends on several external libraries (Qt, boost, icu, tensorflow and now libtorch), it must ship binary versions of all of them. These account for a large part of the current 407 MB wheel (1.2 GB unzipped). In fact, the newly added libtorch_cpu alone is 492 MB unzipped. The remaining unzipped content is mainly compiled linguistic resources and configuration files that can hardly be distributed separately. The main deep learning models are downloaded using an integrated script.
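For anyone who wants to check the size figures above, here is a minimal sketch that lists the largest files in a wheel by uncompressed size, using only the standard `zipfile` module. The helper name `largest_members` is illustrative and not part of LIMA; it assumes the wheel has already been downloaded locally.

```python
import zipfile


def largest_members(wheel_path, top=5):
    """Return (total_uncompressed_bytes, [(name, bytes), ...]) for a wheel.

    A wheel is a plain zip archive, so the central directory already
    records the uncompressed size of every member; nothing is extracted.
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        infos = wheel.infolist()
    total = sum(info.file_size for info in infos)
    # Sort by uncompressed size, largest first, and keep the top entries.
    biggest = sorted(infos, key=lambda info: info.file_size, reverse=True)[:top]
    return total, [(info.filename, info.file_size) for info in biggest]


if __name__ == "__main__":
    import sys

    # Usage: python wheel_sizes.py path/to/some.whl
    total, biggest = largest_members(sys.argv[1])
    print(f"total unzipped: {total / 1e6:.0f} MB")
    for name, size in biggest:
        print(f"{size / 1e6:8.1f} MB  {name}")
```

Run against the wheel in question, this should show how much of the unzipped total comes from the bundled shared libraries.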
Reasons for the request
I see no other reasonable way to distribute a working tool than to include all the necessary libraries. This ensures that it will run smoothly. In the previous size request (300 MB at that time), I wrote that "it should not increase too much in size as only the libtorch library should be added soon. In fact, some of the old rule-based libraries and resources will become obsolete and it should go down in size in the future." Well, libtorch has now been added, and it is large. It is not currently possible to remove the old features, as the libtorch-based version is not yet on par with them. I hope to be able to remove them within the current year.
All the Dockerfiles needed to build this project, and thus other projects based on Qt, boost, icu, etc., are available here: https://github.com/aymara/lima/tree/master/continuous_integration and here: https://github.com/aymara/lima-python
Code of Conduct
- [X] I agree to follow the PSF Code of Conduct