smbslt3 issues

Results: 5 issues of smbslt3

Problem with dynamic_weight in sent_to_word_contexts_matrix

```python
from soynlp.vectorizer import sent_to_word_contexts_matrix

x, idx2vocab = sent_to_word_contexts_matrix(
    corpus,
    windows=3,
    min_tf=10,
    tokenizer=tokenizer, # (default) lambda x:x.split(),
    dynamic_weight=False,
    verbose=True
)
```

I am training GloVe using the code above and the post at https://lovit.github.io/nlp/representation/2018/09/05/glove/ as references, and with dynamic_weight set to False...

Problem with emoticon_normalize

Normalizing `스토리ㅋㅋㅋㅋㅋ` with `emoticon_normalize` yields `스토ㅋㅋㅋ` instead of `스토리ㅋㅋㅋ`. Even for the input `스토맄ㅋㅋㅋ`, the result should arguably be `스토리ㅋㅋㅋ`, so this looks like a bug.
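The behaviour the report expects can be sketched in plain Python. This is not soynlp's implementation; `split_final` and `normalize_repeats` are hypothetical helpers written only to illustrate the expected outputs: a final consonant is detached from a syllable only when the same jamo follows it, and runs of a repeated jamo are then capped at `num_repeats`.

```python
import re

HANGUL_BASE = 0xAC00      # code point of the first precomposed syllable
HANGUL_COUNT = 11172      # number of precomposed Hangul syllables
# Final consonants (jongseong) as compatibility jamo, indexed by code % 28.
JONGSEONG = ['', 'ㄱ', 'ㄲ', 'ㄳ', 'ㄴ', 'ㄵ', 'ㄶ', 'ㄷ', 'ㄹ', 'ㄺ', 'ㄻ',
             'ㄼ', 'ㄽ', 'ㄾ', 'ㄿ', 'ㅀ', 'ㅁ', 'ㅂ', 'ㅄ', 'ㅅ', 'ㅆ', 'ㅇ',
             'ㅈ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ']


def split_final(ch):
    """Return (syllable without its final consonant, final consonant or '')."""
    code = ord(ch) - HANGUL_BASE
    if not 0 <= code < HANGUL_COUNT:
        return ch, ''          # not a precomposed syllable: nothing to split
    jong = code % 28
    return chr(ord(ch) - jong), JONGSEONG[jong]


def normalize_repeats(text, num_repeats=3):
    """Hypothetical helper: cap repeated jamo runs without eating syllables."""
    chars = list(text)
    out = []
    for i, ch in enumerate(chars):
        base, final = split_final(ch)
        nxt = chars[i + 1] if i + 1 < len(chars) else ''
        # Detach the final consonant only if the same jamo follows it.
        out.append(base + final if final and final == nxt else ch)
    joined = ''.join(out)
    # Collapse runs longer than num_repeats down to exactly num_repeats.
    pattern = r'([ㄱ-ㅎㅏ-ㅣ])\1{%d,}' % num_repeats
    return re.sub(pattern, lambda m: m.group(1) * num_repeats, joined)


print(normalize_repeats('스토리ㅋㅋㅋㅋㅋ'))
print(normalize_repeats('스토맄ㅋㅋㅋ'))
```

Both calls print `스토리ㅋㅋㅋ`, which is the output the report argues `emoticon_normalize` should produce.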

Tokenization issue

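Hello, I am using Mecab through eunjeon on Windows 10. I ran into an error while analyzing a corpus, so I am reporting it here. When the words `캘리` and `에듀` are followed by a space, they are tokenized together with that space. That is, `캘리`...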

Error of unknown cause during calculating JS divergence

**Describe the bug** When I calculate the Jensen-Shannon divergence with `dit.divergences.jensen_shannon_divergence([foo,bar])`, it keeps returning `TypeError: __str__ returned non-string (type InvalidNormalization)` even...
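The exception name `InvalidNormalization` hints that one of the distributions may not sum to 1, though the truncated report does not confirm this. Under that assumption, a quick way to cross-check the inputs is to compute the divergence by hand with explicit normalization. The sketch below uses only the standard library, not `dit`; `normalize`, `entropy`, and `js_divergence` are hypothetical helper names.

```python
import math


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def entropy(p):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two aligned pmfs."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same outcome list")
    p, q = normalize(p), normalize(q)
    m = [(a + b) / 2 for a, b in zip(p, q)]   # mixture distribution
    return entropy(m) - (entropy(p) + entropy(q)) / 2


# Disjoint distributions reach the maximum of 1 bit.
print(js_divergence([1, 0], [0, 1]))
# Unnormalized but proportional inputs are treated as identical.
print(js_divergence([0.2, 0.3], [2, 3]))
```

If the hand-computed value is fine after normalization, the inputs passed to `dit` are worth checking for sums that drift away from 1.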

Excessive Duplicated Sentences in LIME Text Output

I'm using LIME text to explain the results of sentiment analysis. When testing various sentences, I've noticed an excessive number of duplicated sentences being used as inputs for LIME text....