Jangwon Park comments

Results: 8 comments of Jangwon Park

cased issue on Huggingface transformers tokenizer

> Hey @monologg, can you try using `allenai/scibert_scivocab_uncased`? These two models actually have different vocabularies/weights, so it's not just a matter of a different tokenizer setting. ```python >>> from transformers import...

faster prediction using GoEmotions-pytorch models based on bert-mini, bert-small or bert-tiny

If you use the vocab from `prajjwal/bert-mini` when training, then you should also change the `tokenizer_name_or_path`
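As a minimal sketch of that advice: the snippet below keeps the model and tokenizer entries of a training config pointed at the same checkpoint. Only `tokenizer_name_or_path` and `prajjwal/bert-mini` come from the comment above; the `model_name_or_path` key, the starting values, and the `use_checkpoint` helper are assumptions made for illustration, not the repository's actual config.

```python
import json

# Hypothetical starting config. Only `tokenizer_name_or_path` is named in the
# comment above; `model_name_or_path` and both values are assumed.
config = {
    "model_name_or_path": "bert-base-cased",
    "tokenizer_name_or_path": "bert-base-cased",
}


def use_checkpoint(cfg, checkpoint):
    """Return a copy of cfg whose model and tokenizer use the same checkpoint."""
    updated = dict(cfg)
    updated["model_name_or_path"] = checkpoint
    # The vocab must match the weights, so the tokenizer entry changes too.
    updated["tokenizer_name_or_path"] = checkpoint
    return updated


new_config = use_checkpoint(config, "prajjwal/bert-mini")
print(json.dumps(new_config, indent=2))
```

Changing both keys in one place makes it harder to train with one vocabulary and tokenize with another.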

Finetune POS

> Hi @monologg, thanks for your great work! I was trying to play around with your model on huggingface but I got this error `Can't load config for 'monologg/koelectra-base-finetuned-naver-ner'. Make...

Finetune POS

> Also, I wanted to know if you were willing to collaborate on a finetuned pos model? My understanding is that we need a conllu dataset such as [UD_Korean-GSD](https://github.com/UniversalDependencies/UD_Korean-GSD) and...

Finetune POS

@sachaarbonel I'll take a look at [the dataset you shared](https://github.com/UniversalDependencies/UD_Korean-GSD) and let you know how to use it to make a fine-tuned model :)

Finetune POS

> @monologg sorry for the delay. I didn't use your package locally but through [huggingface hosted api](https://huggingface.co/monologg/koelectra-base-finetuned-naver-ner?text=%EB%AC%BC+%EC%A2%80+%EC%A4%98.). Sadly, this pipeline is only for local usage. > About "the directions of...

JupyterNotebook Example

- When the directory itself is given as the path
- When the file itself is given as the path (glob example)
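The two cases above could be illustrated roughly as follows. This is only a sketch using the standard library: `resolve_paths` is a hypothetical helper, not part of any package mentioned here, and the `.jsonl` file names are placeholders created in a temporary directory.

```python
import glob
import os
import tempfile


def resolve_paths(path):
    """Expand a directory, a single file, or a glob pattern into a sorted file list."""
    if os.path.isdir(path):
        # Directory given as the path: take every regular file directly inside it.
        return sorted(
            os.path.join(path, name)
            for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))
        )
    # File or glob pattern given as the path: let glob do the matching.
    return sorted(glob.glob(path))


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder files standing in for saved data shards.
    for name in ("a.jsonl", "b.jsonl"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("{}\n")

    from_dir = resolve_paths(tmp)
    from_glob = resolve_paths(os.path.join(tmp, "*.jsonl"))
    from_file = resolve_paths(os.path.join(tmp, "a.jsonl"))
```

A notebook example could show all three calls side by side so readers see that a directory and an equivalent glob pattern resolve to the same files.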

Make it possible to find the total number of data items when loading with kdlf.Reader

This must not conflict with data that was saved without this feature (`ko_lm_dataformat==0.1.0`).