udar
add alternative output formats
This may not be possible in every case, but where possible, add other common output formats:
- CoNLL (X/U)
- mystem
- Multext-East (Sharoff, et al.)
- etc.?
As for the CoNLL-U format, there does not appear to be any way to represent ambiguity, so the conversion would be lossy.
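To make the lossiness concrete, here is a rough sketch of what such a conversion might look like. It is not udar's API: the function name `to_conllu_line`, the `(lemma, tag)` reading pairs, and the choice to put the tag string in the XPOS column are all assumptions made for illustration. The only part taken from the CoNLL-U format itself is the ten tab-separated columns with `_` for empty fields.

```python
def to_conllu_line(idx, form, readings):
    """Build one CoNLL-U token line from a list of (lemma, tag) readings.

    Hypothetical helper: only the first reading is kept, because a token
    line has a single LEMMA column and a single XPOS column.
    Returns the line and the number of readings that were discarded.
    """
    lemma, tag = readings[0]
    # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
    cols = [str(idx), form, lemma, "_", tag, "_", "_", "_", "_", "_"]
    return "\t".join(cols), len(readings) - 1


line, dropped = to_conllu_line(
    5, "три", [("три", "NUM=им"), ("три", "NUM=вин,неод")]
)
print(line)
print(f"readings dropped: {dropped}")
```

Any reading after the first is simply discarded here, which is the information loss described above.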
mystem can have ambiguous readings separated by | in its output, even with the -d (disambiguate) flag:
$ echo "Мы уже работаем здесь три недели." | mystem3.1 -ind
Мы{мы=SPRO,мн,1-л=им}
уже{уже=ADV=}
работаем{работать=V,несов,нп=непрош,мн,изъяв,1-л}
здесь{здесь=ADVPRO=}
три{три=NUM=им|три=NUM=вин,неод}
недели{неделя=S,жен,неод=вин,мн|неделя=S,жен,неод=род,ед|неделя=S,жен,неод=им,мн}
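
For reference, the layout above can be read back with a few lines of Python. This is only a sketch that assumes every line has the shape `form{lemma=gram|lemma=gram|...}` seen in the sample; the name `parse_mystem_line` is made up for this example, and other mystem output variants (for instance, unknown-word markers) are not handled.

```python
import re

# One token per line: surface form, then the readings inside braces.
TOKEN_RE = re.compile(r"^(?P<form>[^{]+)\{(?P<body>.*)\}$")


def parse_mystem_line(line):
    """Split a line like 'три{три=NUM=им|три=NUM=вин,неод}' into
    (form, [(lemma, grammemes), ...]), keeping every ambiguous reading."""
    m = TOKEN_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unexpected line: {line!r}")
    readings = []
    for alt in m.group("body").split("|"):
        # The lemma ends at the first '='; the rest is the grammeme string.
        lemma, _, gram = alt.partition("=")
        readings.append((lemma, gram))
    return m.group("form"), readings


form, readings = parse_mystem_line("три{три=NUM=им|три=NUM=вин,неод}")
print(form, readings)
```

A writer for this format would be the reverse: join the `lemma=gram` pairs with `|` and wrap them in braces after the surface form.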