Where is the CoNLL-2003 formatted Esperanto dataset referenced in the tutorial?
Using a dataset of annotated Esperanto POS tags formatted in the CoNLL-2003 format
Where is this dataset?
Thanks!
It's a synthetic (rule-based) one; we weren't able to find a canonical hand-annotated one. I'll try to upload it when I get a chance.
Thanks much!
On Wed, Mar 4, 2020, 8:25 AM Julien Chaumond wrote:
Here's the dataset https://s3.amazonaws.com/datasets.huggingface.co/EsperBERTo/data/pos-train.txt and the labels https://s3.amazonaws.com/datasets.huggingface.co/EsperBERTo/data/pos-labels.txt
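For anyone who wants to load these files, here is a minimal sketch of a reader. It assumes the usual CoNLL-style layout: one token per line, whitespace-separated columns with the tag in the last column, and a blank line between sentences. I haven't confirmed that `pos-train.txt` follows exactly this layout, so check a few lines of the file first. The helper names `parse_conll` and `fetch_lines` are hypothetical, not part of any library.

```python
from typing import Iterable, List, Tuple
from urllib.request import urlopen

# URLs as posted in this thread.
TRAIN_URL = "https://s3.amazonaws.com/datasets.huggingface.co/EsperBERTo/data/pos-train.txt"
LABELS_URL = "https://s3.amazonaws.com/datasets.huggingface.co/EsperBERTo/data/pos-labels.txt"

Sentence = List[Tuple[str, str]]


def parse_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group (token, tag) pairs into sentences.

    Assumption: first column is the token, last column is the tag,
    and blank lines (or -DOCSTART- lines) separate sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("-DOCSTART-"):
            # Sentence boundary: flush whatever has been collected so far.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:
        # Flush the last sentence if the input does not end with a blank line.
        sentences.append(current)
    return sentences


def fetch_lines(url: str) -> List[str]:
    """Download a text file and return its lines (requires network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8").splitlines()


# Possible usage (not run here because it needs network access):
# sentences = parse_conll(fetch_lines(TRAIN_URL))
# labels = [line.strip() for line in fetch_lines(LABELS_URL) if line.strip()]
```

Keeping the parser separate from the download step means it can be tried on a few hand-written lines before pointing it at the real file.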