Marcel Bollmann
With the addition of preformatted citation strings (#1390), we're now using citeproc to generate a reference string in ACL bibliography format. We'd like to keep citeproc for now in order...
In paper titles with multiple sentences, the first word of non-initial sentences should probably be ``d. This happens e.g. here: [How Good is Your Tokenizer? On the Monolingual Performance of...
> I realise I have no power here, but it seems unintuitive for all datasets a paper uses & introduces to be listed on the anthology without distinction. Would a...
This thread is intended to collect all feedback, suggestions, bug reports, etc. for the new Anthology website in the [`static-rewrite`](https://github.com/acl-org/acl-anthology/tree/static-rewrite) branch. (Edit: [live demo here at http://aclweb.org/anthology](http://aclweb.org/anthology)) **If you do...
Many PACLIC proceedings have URLs in their `<doi>` entry in the XML, not DOIs. This fixes that. Technically, the current entries are _Handle_ URLs, not _DOI_ URLs, but from spot-checking...
The [data/yaml/joint.yaml](https://github.com/acl-org/acl-anthology/blob/master/data/yaml/joint.yaml) file is a repeated source of confusion. While @mjpost recently started a [wiki page](https://github.com/acl-org/acl-anthology/wiki/Venues,-Volumes,-and-Events) that (also) describes how it's used, I wonder if we shouldn't refactor this to...
**TL;DR:** If name variants are defined in the XML, whether a variant is considered part of the "canonical name" depends on the order in which the XML files are read....
There was some discussion on whether we should make [our `anthology` library](https://github.com/acl-org/acl-anthology/tree/master/bin/anthology) into a PyPI package. This would make it easier for people to use our Python interface to the...
I have crosschecked a full file list from the aclweb.org server (created by @mjpost on 29.03.2019) with what would be expected after parsing the Anthology XML. The result is a...
When computing `logsum_alt`, the frequency of a removed piece is re-assigned to alternatives: https://github.com/google/sentencepiece/blob/ba7e11a17f606327d0652528d58d2dd8cd265c6f/src/unigram_model_trainer.cc#L389-L394 But the code uses `alternatives.size()` which, if I'm not mistaken, is always equal to `sentencepieces.size()`. Don't...