gensim

Added flsamodel.py and added a dependency

Open ERijck opened this issue 3 years ago • 2 comments

FLSA-W is a new, state-of-the-art topic modeling algorithm (https://ieeexplore.ieee.org/abstract/document/9660139). I have added flsamodel.py, a module that calls the FuzzyTM package with parameters and methods similar to those of ldamodel.py. I have also added FuzzyTM as a dependency in setup.py.

ERijck avatar Aug 19 '22 13:08 ERijck

Given that all the real work is done in the FuzzyTM module, and that it looks pretty easy for any user to import and use directly, I don't see much value in this small FlsaModel wrapper class, but it does add extra costs/risks via coupling and extra dependencies.

To the extent the corpus-handling helper method(s) might make it easier for people using Gensim models to also compare results from FLSA or FLSA_W:

  • could the helper (mainly _convert_bow()) be a generic helper function that doesn't require importing anything from FuzzyTM?
  • could any classes/functions live in FuzzyTM package instead, possibly even providing a Gensim-shaped workalike model from there (without even explicitly importing Gensim code)?
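
As a rough illustration of the first point, a dependency-free conversion helper might look like the sketch below. It is not the PR's _convert_bow(); the name bow_to_tokens, its signature, and the assumption that the downstream library wants tokenized documents (lists of strings) are all hypothetical. It relies only on the Gensim bag-of-words convention of (token_id, count) pairs plus an id-to-token mapping.

```python
from typing import Iterable, List, Mapping, Tuple

# A bag-of-words document in the Gensim convention: (token_id, count) pairs.
BowDocument = Iterable[Tuple[int, float]]


def bow_to_tokens(corpus: Iterable[BowDocument],
                  id2word: Mapping[int, str]) -> List[List[str]]:
    """Expand a bag-of-words corpus into lists of token strings.

    Hypothetical helper: imports neither Gensim nor FuzzyTM. Word order
    is not recoverable from bag-of-words, so tokens appear in id order
    as stored, each repeated `count` times.
    """
    documents = []
    for bow in corpus:
        tokens = []
        for token_id, count in bow:
            # Counts may be stored as floats; truncate to whole repeats.
            tokens.extend([id2word[token_id]] * int(count))
        documents.append(tokens)
    return documents


# Toy data standing in for a Gensim dictionary and corpus.
id2word = {0: "fuzzy", 1: "topic", 2: "model"}
corpus = [[(0, 1), (1, 2)], [(2, 1)]]
print(bow_to_tokens(corpus, id2word))
# prints [['fuzzy', 'topic', 'topic'], ['model']]
```

Because the helper only indexes id2word by integer id, a plain dict works here, and any object supporting that lookup should work the same way.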

gojomo avatar Sep 02 '22 19:09 gojomo

On top of what gojomo said, FuzzyTM is also very new: https://github.com/ERijck/FuzzyTM

So I think this PR is at best premature.

mpenkov avatar Sep 05 '22 07:09 mpenkov

I have a new version with an extra model (FLSA-W) that does not depend on FuzzyTM or sparsesvd. Hence, I will close this pull request and open a new one.

ERijck avatar Oct 28 '22 12:10 ERijck