Matyáš Kopp comments


RO Feedback

### U+0096 (SPA) Unicode Character - [ ] remove character This character is allowed in ParlaMint, but it causes problems in linguistic annotation. I suggest removing it from the text:...
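A minimal sketch of how the suggested clean-up could be scripted, assuming the text is available as a Python string. The helper names `find_spa` and `strip_spa` are hypothetical and are not part of any ParlaMint tooling.

```python
# Sketch only: locate and remove the U+0096 (SPA) control character.
# Function names are illustrative assumptions, not ParlaMint tooling.

SPA = "\u0096"  # the C1 control character discussed in the comment


def find_spa(text: str) -> list[int]:
    """Return the character offsets at which U+0096 occurs."""
    return [i for i, ch in enumerate(text) if ch == SPA]


def strip_spa(text: str, replacement: str = "") -> str:
    """Return a copy of text with every U+0096 replaced (removed by default)."""
    return text.replace(SPA, replacement)
```

Running `find_spa` first makes it possible to review each occurrence before deleting it; whether a replacement character is preferable to plain removal depends on how the character entered the source text.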

RO Feedback

### Named entities - [ ] named entities contain non-proper names I guess you are using a model that labels not only named entities from the PER/LOC/ORG/MISC set but also DATE...

### shifted NEs ? - [ ] shifted NEs In this paragraph (ParlaMint-RO_2000-10-24-id4980.u2.seg8.2), NEs seem to be shifted. https://raw.githubusercontent.com/clarin-eric/ParlaMint/3f2d0a820d31aa7e55b72156089a3450b303e3bc/Data/ParlaMint-RO/ParlaMint-RO_2000-10-24-id4980.ana.xml Reformatted, with token elements (`w` and `pc`) removed: ```XML atitudinea autorităţilor...

RO Feedback

### Voci din sală: in utterance - [ ] voice from the hall https://github.com/romanian-parlamint/ParlaMint/blob/a510c149ba04407fe6df77414b3a2aaec6f47022/Data/ParlaMint-RO/ParlaMint-RO_2000-10-24-id4980.xml#L408-L414 ```XML Domnul Vasile Lupu: Să vedem cine îl face. (Rumoare în partea stângă a sălii) Dar,...

RO Feedback

### person - affiliation - organization - [ ] parliamentary groups - [ ] only one virtual parliamentary group `Placeholder parliamentary group` - [ ] government I guess you are...

RO Feedback

### strange UPosTag `_` when `Mc-s-d` - [ ] UPosTag of digit tokens `Mc-s-d` Every token with `pos="Mc-s-d"` has the wrong value `msd="UPosTag=_"`. Sample: ```XML 1990 ``` You can fix this with...
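The condition described above can be checked mechanically. The following is a rough sketch that only locates affected tokens; it does not reproduce the fix the comment goes on to suggest. It assumes tokens are TEI `w` elements carrying `pos` and `msd` attributes; the function name and the `SAMPLE` fragment are hypothetical illustrations, not taken from the corpus.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by ParlaMint documents.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment for illustration only (not copied from the corpus).
SAMPLE = (
    '<s xmlns="http://www.tei-c.org/ns/1.0">'
    '<w pos="Mc-s-d" msd="UPosTag=_">1990</w>'
    '<w pos="Ncms-n" msd="UPosTag=NOUN">an</w>'
    "</s>"
)


def find_bad_digit_tokens(xml_text: str) -> list[str]:
    """Return the text of every w element with pos="Mc-s-d" and UPosTag=_."""
    root = ET.fromstring(xml_text)
    hits = []
    for w in root.iter(TEI_NS + "w"):
        # msd is a "|"-separated list of Feature=Value pairs.
        features = (w.get("msd") or "").split("|")
        if w.get("pos") == "Mc-s-d" and "UPosTag=_" in features:
            hits.append(w.text or "")
    return hits
```

Calling `find_bad_digit_tokens(SAMPLE)` lists the digit token from the hypothetical fragment; on a real `.ana.xml` file the same function could be applied to the file contents to count how many tokens are affected.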

PL: many transcriber comments not annotated

I probably don't understand the point of this issue... The PL corpus is correctly encoded: https://github.com/clarin-eric/ParlaMint/blob/3c8ad8aeab6d854cdd5e9113115b944e37d7e6d9/ParlaMint-PL/ParlaMint-PL_2018-09-27-senat-65-2.ana.xml#L396-L403 In this case, `kinesic` is within the `u` (speech element) because it happens during the...