
CSV file as a set of documents?

Open: StephenQuirolgico opened this issue 4 years ago • 1 comment

Is it possible to use a CSV file as the set of input documents (i.e., where each row in the CSV file represents a different document)? We have a dataset containing thousands of documents and it's not practical to have each of these as a separate text file.

StephenQuirolgico, Aug 24 '20 13:08

This is a feature of MALLET, and it used to be available in the TMT, but it proved difficult to maintain the tool while supporting both modes of input. However, we've done some refactoring since then, and it might be easier now. I've been thinking about this for a while and will look into it. Thanks for the suggestion!

senderle, Sep 18 '20 19:09
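
Until the tool accepts CSV input directly, one possible workaround is to convert the CSV into the one-document-per-line text format that MALLET itself can import. The sketch below is only an illustration, not part of the TMT or MALLET: the function name `csv_to_mallet_lines`, the column names `id` and `text`, and the placeholder label `none` are all assumptions. It writes each row as `name<TAB>label<TAB>text`, which should match MALLET's default name/label/data line layout.

```python
import csv
import io


def csv_to_mallet_lines(csv_file, text_column, out_file, id_column=None):
    """Write each CSV row as one 'name<TAB>label<TAB>text' line; return the row count.

    Hypothetical helper: the column names are whatever your CSV uses.
    """
    reader = csv.DictReader(csv_file)
    count = 0
    for index, row in enumerate(reader):
        # Collapse newlines and runs of whitespace so each document stays on one line.
        text = " ".join((row.get(text_column) or "").split())
        if not text:
            continue  # skip rows that have no text
        raw_name = (row.get(id_column) or "") if id_column else ""
        # Document names must not contain whitespace; fall back to a row-based name.
        name = "_".join(raw_name.split()) or f"doc{index}"
        # "none" is a placeholder label; topic modeling does not use it.
        out_file.write(f"{name}\tnone\t{text}\n")
        count += 1
    return count


# Small in-memory example; with real files, open the CSV with newline="".
sample = io.StringIO('id,text\na1,"First document,\nwith a line break."\na2,Second document.\n')
out = io.StringIO()
written = csv_to_mallet_lines(sample, "text", out, id_column="id")
print(out.getvalue(), end="")
```

The resulting file could then be imported with MALLET on the command line, for example `bin/mallet import-file --input docs.txt --output docs.mallet --keep-sequence`, where `docs.txt` and `docs.mallet` are placeholder file names. Whether the TMT can consume that output directly is not something this thread confirms.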