Silvia Terragni

27 comments by Silvia Terragni

Hi Luke, currently I don't have much time to dedicate to this project. Since this is a super busy period for me, I'm focusing more on the maintenance (as much...

Hi! Thank you :) Yes, we can definitely think of a way to integrate D-ETM as well. We have already integrated ETM, so I think it shouldn't be that hard....

Hi Mariana! Thanks for reporting this issue. I tried to reproduce the error using your code and some other data, but the error doesn't occur. Can you please share your...

Hi A11en0, can you please share your code, the version of the library, your Python version, and your operating system? I'd be happy to help solve the issue.

Hello @alyrazik, could you send me the dataset (if possible) by email? I would really like to replicate this error but it has never happened with my data. So I...

Thanks for opening a specific issue, because I had lost track of the question. Yes, I confirm that there's currently no way to load the unpreprocessed corpus. As mentioned before, this would...

Thanks Roberta! :) yes, that is correct. My suggestion is to first try hyperparameter configurations that "usually" work well. You can find some reference values in these papers: - https://aclanthology.org/2021.acl-short.96/...

Hello, approximately how many words does the larger vocabulary contain? We integrated ETM in OCTIS but we kept the original implementation, which is not optimized for large corpora. My suggestion...

Hi! Lemmatization is definitely the biggest bottleneck in preprocessing. I didn't know about spaCy pipes. They seem like the right solution for us, since we already rely on spaCy for the lemmatization....
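
For readers unfamiliar with the idea, here is a minimal sketch of batched lemmatization with spaCy's `nlp.pipe`. It is not the OCTIS preprocessing code: the function name `lemmatize_corpus`, the default batch size, and the token filter (alphabetic, non-stopword) are assumptions made for illustration only.

```python
from typing import Iterable, List


def lemmatize_corpus(nlp, texts: Iterable[str], batch_size: int = 256) -> List[List[str]]:
    """Lemmatize documents in batches instead of calling nlp(text) once per document.

    `nlp` is expected to be a loaded spaCy pipeline (hypothetical setup, see the
    note below); any object with a compatible `pipe` method works as well.
    """
    lemmatized = []
    # nlp.pipe streams the texts through the pipeline in batches, which is
    # usually faster than processing each document with a separate call.
    for doc in nlp.pipe(texts, batch_size=batch_size):
        # Illustrative filter (an assumption): keep lowercased lemmas of
        # alphabetic tokens that are not stopwords.
        lemmatized.append(
            [tok.lemma_.lower() for tok in doc if tok.is_alpha and not tok.is_stop]
        )
    return lemmatized
```

With spaCy and an English model installed, the pipeline could be loaded with something like `spacy.load("en_core_web_sm", disable=["parser", "ner"])` and passed as the first argument; disabling components that lemmatization does not need typically reduces the cost further.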

Thank you! Let me know if you have any questions. Silvia