Nathan Schneider

95 comments by Nathan Schneider

> Also, MCKINLEY would not be correctly lowercased by this heuristic; that could be a somewhat common case. I went through the all-caps names in EMNLP 2019, and most were...

I don't know how that is checked but it should cover most of the cases. Maybe the rest should require a manual decision to whitelist or truecase.
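To make the idea concrete, here is a minimal sketch of what such a check might look like: lowercase all-caps names automatically, but flag the ambiguous ones for a manual decision. The function name `truecase_name`, the `keep_as_is` whitelist, and the `MC`/`MAC` prefix rule are all assumptions for illustration, not the heuristic actually used in the Anthology.

```python
import re

# Hypothetical prefixes where simple capitalization is likely wrong
# (e.g. MCKINLEY should probably become McKinley, not Mckinley).
AMBIGUOUS_PREFIXES = ("MC", "MAC")


def truecase_name(name, keep_as_is=frozenset()):
    """Return (suggested_name, needs_manual_review) for one name part."""
    # Whitelisted names were already checked by hand; leave them alone.
    if name in keep_as_is:
        return name, False

    letters = [c for c in name if c.isalpha()]
    # Only touch names that are entirely upper case and longer than one letter.
    if len(letters) < 2 or not all(c.isupper() for c in letters):
        return name, False

    # Capitalize each run of letters, so O'BRIEN -> O'Brien, DE LA CRUZ -> De La Cruz.
    suggestion = re.sub(r"[^\W\d_]+", lambda m: m.group(0).capitalize(), name)

    # The suggestion is only a guess for these; ask a human (or check the PDF).
    needs_review = name.startswith(AMBIGUOUS_PREFIXES)
    return suggestion, needs_review


if __name__ == "__main__":
    for raw in ["SCHNEIDER", "MCKINLEY", "O'BRIEN", "Schneider"]:
        print(raw, "->", truecase_name(raw))
```

Anything returned with `needs_manual_review` set would go to a person, who either corrects it or adds it to the whitelist.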

And the manual decision can usually be made by checking the PDF. Even better if we could scrape the author capitalization from the PDF, but that might be too hard.

Eh...it seems to me the status quo is (unintentionally) discriminatory against people whose surnames are sometimes entered in all-caps, because inconsistencies will make it harder to browse their work. And...

Nice! Could we run this periodically and record the exceptions as having been manually checked?

Manually fix https://www.aclweb.org/anthology/N19-1062/ for now? Also "Conference on Machine Translation": https://www.aclweb.org/anthology/W19-5301/

Not intimately familiar with this feature but I would have expected name_variants.yaml to contain all variants. Could a new variant specified in the XML trigger an error until it is...
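A rough sketch of the kind of check I have in mind, assuming the variants file has been loaded into a dict from canonical name to its listed variants. The data layout, `check_variants`, and `UnknownVariantError` are hypothetical; I have not checked how name_variants.yaml or the XML are actually structured.

```python
class UnknownVariantError(ValueError):
    """Raised when the XML uses a name variant that is not listed yet."""


def check_variants(xml_entries, variant_table):
    """Fail if any (canonical, written) pair uses an unlisted variant.

    xml_entries:   iterable of (canonical_name, name_as_written) pairs
    variant_table: dict mapping canonical_name -> collection of known variants
    """
    unknown = [
        (canonical, written)
        for canonical, written in xml_entries
        # The canonical form itself is always acceptable.
        if written != canonical
        and written not in variant_table.get(canonical, ())
    ]
    if unknown:
        details = "; ".join(f"{w!r} for {c!r}" for c, w in unknown)
        raise UnknownVariantError(f"variants not yet listed: {details}")


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    table = {"Jane Q. Doe": ["Jane Doe", "J. Q. Doe"]}
    check_variants([("Jane Q. Doe", "Jane Doe")], table)  # passes silently
    try:
        check_variants([("Jane Q. Doe", "JANE DOE")], table)
    except UnknownVariantError as err:
        print("error:", err)
```

The point is that the build fails loudly until someone adds the new variant to the file, so the file stays the single complete list.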

Thanks @mbollmann—I think horizontal scrolling is confusing, but something along the lines of Exhibit Two could work. It might also be nice to have a way to browse by decade.

Digging around in START a bit more I see there is a tool called "Title Case Formatter for Titles/Authors" which suggests capitalization fixes using heuristics. I suggest we recommend this...

> name formatting is drawn from the global profile

Only for registered authors. At SemEval we are finding that a lot of the papers have unregistered coauthors.