
Dataset's dictionary not updated if one changes the collection dynamically

Open Alvant opened this issue 5 years ago • 2 comments

  • Create a dataset
  • Call dataset.get_dictionary()
  • Change the dataset's _data by renaming one of the modalities (e.g. lemmatized -> new_lemmatized)
  • Try to build a topic model using the dataset

Result: old modality in the model's Phi

Expected: new modality in Phi

P.S. One should also check that dataset._modalities is up-to-date
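
The steps above can be illustrated with a toy sketch. `ToyDataset` is a hypothetical stand-in written for this example, not TopicNet's actual `Dataset`; it only assumes that the dictionary is built once on the first `get_dictionary()` call and then reused without looking at `_data` again.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset that caches derived state."""

    def __init__(self, docs):
        # docs: mapping of document id -> {modality name: list of tokens}
        self._data = docs
        self._cached_dictionary = None

    def get_dictionary(self):
        # Built once on the first call, then reused without re-reading _data
        if self._cached_dictionary is None:
            self._cached_dictionary = {
                (modality, token)
                for doc in self._data.values()
                for modality, tokens in doc.items()
                for token in tokens
            }
        return self._cached_dictionary


dataset = ToyDataset({"doc_1": {"lemmatized": ["topic", "model"]}})
dataset.get_dictionary()

# Rename the modality behind the dataset's back
for doc in dataset._data.values():
    doc["new_lemmatized"] = doc.pop("lemmatized")

# The cached dictionary still refers to the old modality name
stale_modalities = {modality for modality, _ in dataset.get_dictionary()}
print(stale_modalities)  # {'lemmatized'}
```

In this sketch, anything built from the cached dictionary keeps seeing the old modality name, which matches the symptom reported above.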

Alvant, May 12 '20 07:05

See? Exactly: if you change the underscore-prefixed (_) variables or methods, we don't guarantee proper functionality. You should know what you are doing when using those.

Evgeny-Egorov-Projects, May 25 '20 10:05

Well... yees... Ok. Then it should be clearly stated in the docstring that Dataset currently provides no way to modify the contents of a text collection (changing a document, renaming a document, adding a modality, ...). If you want to change something, use text editors, pandas, csv or something else.
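
A rough sketch of the "edit the file instead" route, using only the standard csv module. It assumes the collection is a CSV file with a `vw_text` column whose modalities are marked as `|@modality_name`; that layout, the column name, and the helpers `rename_modality` and `rewrite_collection` are assumptions made for illustration, not part of TopicNet.

```python
import csv
import io
import re


def rename_modality(vw_text, old, new):
    # Match the marker only as a whole token, so "|@lemmatized_bigram" is untouched
    pattern = r"\|@" + re.escape(old) + r"(?=\s|$)"
    return re.sub(pattern, lambda _: "|@" + new, vw_text)


def rewrite_collection(src, dst, old, new, column="vw_text"):
    # Copy the collection row by row, renaming the modality in one column
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row[column] = rename_modality(row[column], old, new)
        writer.writerow(row)


# In-memory stand-ins for the original and the rewritten collection files
source = io.StringIO(
    "id,vw_text\n"
    "doc_1,doc_1 |@lemmatized topic model |@lemmatized_bigram topic_model\n"
)
target = io.StringIO()
rewrite_collection(source, target, "lemmatized", "new_lemmatized")
rows = list(csv.DictReader(io.StringIO(target.getvalue())))
print(rows[0]["vw_text"])
```

After rewriting the file, one would create a fresh dataset from it rather than touching `_data` on an existing one, so no cached state is carried over.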

Alvant, May 25 '20 14:05