TopClus
Request for using TopClus on different pretrained language models
Hi,
I've read your paper and I like this approach. Thank you for sharing the code. I have one question regarding the pretrained language models (PLMs) you use for getting the contextualized word representations. I saw in the source code that the model is fixed to the classic 'bert-base-uncased':
https://github.com/yumeng5/TopClus/blob/01e22fb73262bc45d361ec9165bdadbd929ac9a5/src/trainer.py#L22
Suppose I'm interested in using this method on a corpus of Italian texts. In that case, would it be possible to change this model and use bert-base-multilingual-uncased instead?
If that's possible, could `pretrained_lm` be made a parameter of the `TopClusTrainer`?
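To illustrate what I have in mind, here is a minimal sketch of the change, assuming a constructor parameter named `pretrained_lm` (the class and parameter names below are just my suggestion, not the actual TopClus API), with the current 'bert-base-uncased' kept as the default:

```python
from dataclasses import dataclass


# Hypothetical sketch: the PLM name becomes a constructor argument instead of
# the hard-coded 'bert-base-uncased' in src/trainer.py.
@dataclass
class TopClusTrainer:
    pretrained_lm: str = "bert-base-uncased"  # default keeps current behavior

    def load_model(self):
        # Deferred import so the sketch itself runs without downloading a
        # model; in practice this would replace the fixed name in trainer.py.
        from transformers import AutoModel, AutoTokenizer
        tokenizer = AutoTokenizer.from_pretrained(self.pretrained_lm)
        model = AutoModel.from_pretrained(self.pretrained_lm)
        return tokenizer, model


# For an Italian corpus one could then pass a multilingual checkpoint:
trainer = TopClusTrainer(pretrained_lm="bert-base-multilingual-uncased")
print(trainer.pretrained_lm)
```

Since multilingual BERT shares the same architecture and tokenizer interface, I would expect the rest of the pipeline to work unchanged, though I may be missing something model-specific.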
Thank you.