
Is this topic model suitable for modeling a Chinese corpus?

Open Decade-rider opened this issue 1 year ago • 4 comments

Decade-rider avatar Aug 29 '24 13:08 Decade-rider

Sorry for the late reply! I'd definitely say the library is usable for Chinese. You will, however, need a Chinese tokenizer to segment texts in a meaningful way. You can, for instance, use the jieba library. Otherwise I see no obstacle that would prevent you from applying it to Chinese texts.

x-tabdeveloping avatar Mar 15 '25 14:03 x-tabdeveloping
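
The tokenizer suggestion above can be sketched roughly as follows. This is only an illustration, not the maintainers' documented recipe: it assumes the topic model is fitted on a bag-of-words matrix produced by a scikit-learn `CountVectorizer`, and the helper names `jieba_tokenize` and `make_vectorizer` are made up for this example. Check the tweetopic documentation for how the vectorizer is passed to the model.

```python
from sklearn.feature_extraction.text import CountVectorizer


def jieba_tokenize(text):
    """Segment Chinese text into words with jieba (hypothetical helper)."""
    # Imported lazily so this module still loads when another tokenizer
    # is supplied and jieba is not installed (pip install jieba).
    import jieba

    # jieba.lcut returns a list of tokens; drop whitespace-only ones.
    return [token for token in jieba.lcut(text) if token.strip()]


def make_vectorizer(tokenize=jieba_tokenize):
    """Build a bag-of-words vectorizer that uses a custom tokenizer.

    The default regex token pattern of CountVectorizer is designed for
    whitespace-delimited languages, so it is disabled here and the
    supplied callable does the segmentation instead.
    """
    return CountVectorizer(tokenizer=tokenize, token_pattern=None)
```

Calling `make_vectorizer().fit_transform(docs)` on a list of Chinese strings should then yield a document-term matrix with one row per document, which is the kind of input a bag-of-words topic model expects. Any other segmenter can be swapped in by passing a different callable as `tokenize`.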


Thanks for your patient explanation. I will consider using this topic model in my future research.

Decade-rider avatar Mar 15 '25 14:03 Decade-rider

I have, in fact, written a whole Medium article about topic modelling in Chinese; you should consider reading it :D It uses a novel topic model of ours, called KeyNMF, and uses representations from transformer models.

Contextual Topic Modelling in Chinese Corpora with KeyNMF

x-tabdeveloping avatar Mar 15 '25 14:03 x-tabdeveloping

That's great. I will learn about this model. It's a great honour to learn about your research results.



Decade-rider avatar Mar 15 '25 15:03 Decade-rider