TopicNet
Interface for easier topic modelling.
The method is too slow! Do we really need `dask.dataframe`? Maybe better to store documents on disk as single files (and not as one big .csv)? References: * How one...
It seems more natural for a model to fit on a `Dataset`. Maybe better to use `Union[artm.BatchVectorizer, topicnet.cooking_machine.Dataset]` instead of just `artm.BatchVectorizer` (the `Union` is for compatibility)?
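A minimal sketch of what accepting either input type could look like. The two classes below are stand-ins defined only for illustration, not the real `artm.BatchVectorizer` or `topicnet.cooking_machine.Dataset`, and both `get_batch_vectorizer` on the stand-in and the helper `as_batch_vectorizer` are assumed names.

```python
from typing import Union


class BatchVectorizer:
    """Stand-in for artm.BatchVectorizer (illustration only)."""


class Dataset:
    """Stand-in for topicnet.cooking_machine.Dataset (illustration only)."""

    def __init__(self, batch_vectorizer: BatchVectorizer):
        self._batch_vectorizer = batch_vectorizer

    def get_batch_vectorizer(self) -> BatchVectorizer:
        # Assumed accessor name; the real class may expose this differently.
        return self._batch_vectorizer


def as_batch_vectorizer(data: Union[BatchVectorizer, Dataset]) -> BatchVectorizer:
    """Normalize either input type to a BatchVectorizer.

    Existing callers that already pass a BatchVectorizer keep working,
    which is the compatibility the Union is meant to provide.
    """
    if isinstance(data, Dataset):
        return data.get_batch_vectorizer()
    if isinstance(data, BatchVectorizer):
        return data
    raise TypeError(f"Unsupported input type: {type(data).__name__}")
```

A fitting method could call such a helper once at the top and leave the rest of its body unchanged.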
How, and to whom, should one ask a question? Maybe add info about the library's Slack channel?
We have a number of dependencies which `setuptools` calls "extras": packages that are only needed for unlocking additional functionality but are not used otherwise. Ideally, we should not require the user to...
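One common pattern for optional dependencies is to import them lazily and fail with an actionable message. The sketch below illustrates that pattern only; the helper name `require_extra` and the extra name passed to it are hypothetical, and nothing here reflects how the library actually declares its extras.

```python
import importlib
from types import ModuleType


def require_extra(module_name: str, extra_name: str) -> ModuleType:
    """Import an optional module, or explain how to install it.

    `extra_name` is a hypothetical extras group, used only to build
    the hint in the error message.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Try installing the optional '{extra_name}' dependencies."
        ) from error
```

Code that needs the optional package would call `require_extra(...)` inside the function that uses it, so a plain import of the library never touches the extra.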
Ideally, each dataset should be available in both forms.
Something along the lines of "convert between `Counter` and `vowpal_wabbit`" would be very helpful. Also, maybe we need to store more metadata (such as main modality and co-occurrences). Related code:...
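A rough sketch of such a conversion for the simplest case: one document, one modality. The function names, the default modality name `@text`, and the assumption that tokens contain no spaces or `|` characters are all illustrative; this is not existing library code.

```python
from collections import Counter
from typing import Tuple


def counter_to_vw(doc_id: str, counts: Counter, modality: str = "@text") -> str:
    """Render token counts as one Vowpal Wabbit style line.

    Tokens are sorted for a stable result; a count of 1 is left implicit.
    """
    tokens = " ".join(
        token if count == 1 else f"{token}:{count}"
        for token, count in sorted(counts.items())
    )
    return f"{doc_id} |{modality} {tokens}"


def vw_to_counter(line: str) -> Tuple[str, str, Counter]:
    """Parse a single-modality line back into (doc_id, modality, counts)."""
    doc_id, _, rest = line.partition(" |")
    modality, *tokens = rest.split()
    counts: Counter = Counter()
    for token in tokens:
        name, separator, count = token.rpartition(":")
        if separator:
            counts[name] += int(count)
        else:
            counts[token] += 1
    return doc_id, modality, counts
```

For example, `counter_to_vw("doc1", Counter({"cat": 2, "dog": 1}))` gives `doc1 |@text cat:2 dog`, and `vw_to_counter` recovers the same counts from that line. Documents with several modalities would need one `|modality` section each, which this sketch does not handle.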
* What is it for?
* Which parameters can one vary, or is one encouraged to vary, using this cube (e.g. the number of topics may be better found using the [OptimalNumberOfTopics](https://github.com/machine-intelligence-laboratory/OptimalNumberOfTopics) repository :slightly_smiling_face:)?
* how...
It is worth noting that relative weights are useful and that the library provides simple ways to use them.
* More intuitive way to choose modality weights (not just some random values virtually...
Currently `CubeCreator` supports only absolute weights (am I right :slightly_smiling_face: ?). It seems that relative weights are more useful (especially given that `init_simple_default_model` requires relative weights as input). +...
* What is `strategy`? Is it important, or is the default value always OK? How does one pick the right one?
* What is `tracked_score_function`? is it important or default value...