acl-anthology
Ingestion: LREC 2024
Build successful. Some useful links:
- Complete site preview: https://preview.aclanthology.org/lrec-2024-ingestion
- Potential volumes of interest: 2024.bucc-1, 2024.cawl-1, 2024.cl4health-1, 2024.cogalex-1, 2024.delite-1, 2024.determit-1, 2024.dlnld-1, 2024.dmr-1, 2024.ecnlp-1, 2024.eurali-1, 2024.finnlp-1, 2024.games-1, 2024.htres-1, 2024.humeval-1, 2024.isa-1, 2024.ldl-1, 2024.legal-1, 2024.lrec-main, 2024.lrec-tutorials, 2024.lt4hala-1, 2024.mathnlp-1, 2024.mwe-1, 2024.neusymbridge-1, 2024.nlperspectives-1, 2024.osact-1, 2024.parlaclarin-1, 2024.politicalnlp-1, 2024.rail-1, 2024.rapid-1, 2024.readi-1, 2024.rfp-1, 2024.safety4convai-1, 2024.signlang-1, 2024.sigul-1, 2024.tdle-1, 2024.trac-1, 2024.unlp-1, 2024.wildre-1
This preview will be removed when the branch is merged.
Some of the papers do not give any information for the Copy Citation: More options…
https://preview.aclanthology.org/lrec-2024-ingestion/2024.eurali-1.4/
Some of the editors have LaTeX in their names
Silvie Cinkov\'{a}
Should we correct these in the CDROM tab?
Some of the papers do not give any information for the
Copy Citation: More options…
preview.aclanthology.org/lrec-2024-ingestion/2024.eurali-1.4
That is expected.
Some of the editors have LaTeX in their names
Silvie Cinkov\'{a}
Should we correct these in the CDROM tab?
Yes. "Theodorus Franseen" also appears to be a typo in that same volume (-> "Fransen").
I removed all LaTeX encoding of accents from the chairs' names. It may cause problems, but using UTF-8 encoding directly didn't work for me, so I just removed the accents.
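As an alternative to dropping the accents, LaTeX accent commands such as the one in "Cinkov\'{a}" could be converted to UTF-8 before ingestion. The following is only a minimal sketch: the function name `latex_to_unicode` and the accent table are hypothetical, it is not part of any Anthology or aclpub tooling, and it covers only a handful of common accent commands (it does not handle, for example, dotless i or nested braces).

```python
import re
import unicodedata

# Hypothetical table: LaTeX accent command -> Unicode combining character.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    "v": "\u030c",  # caron
    "c": "\u0327",  # cedilla
}

# Symbol accents, with or without braces: \'{a} or \'a
SYMBOL_ACCENT = re.compile(r"""\\(['`^"~])\{?([A-Za-z])\}?""")
# Letter-named accents, braces required: \v{c}, \c{c}
LETTER_ACCENT = re.compile(r"\\([vc])\{([A-Za-z])\}")


def _compose(match):
    """Combine the base letter with its accent into one precomposed character."""
    accent, letter = match.group(1), match.group(2)
    return unicodedata.normalize("NFC", letter + COMBINING[accent])


def latex_to_unicode(name):
    """Replace the supported LaTeX accent commands in a name with UTF-8 characters."""
    name = SYMBOL_ACCENT.sub(_compose, name)
    return LETTER_ACCENT.sub(_compose, name)


print(latex_to_unicode(r"Silvie Cinkov\'{a}"))  # Silvie Cinková
```

Names without LaTeX commands pass through unchanged, so the function could be applied to every editor name rather than only the flagged ones.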
https://softconf.com/lrec-coling2024/cawl2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/cl4health2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/cogalex2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/determit2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/dlnld2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/dmr2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/ecnlp-7/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/eurali2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/finnlp-kdf2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/gamesandnlp2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/htres2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/humeval2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/isa2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/ldl2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/legal2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/lt4hala2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/mathnlp2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/mwe-ud2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/neusymbridge2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/nlperspectives2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/osact2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/parlaclarin-iv/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/politicalnlp2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/rail2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/rapid2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/readi2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/reference-framing-perspective2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/safeconvai2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/sigul2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/tdle2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/trac2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/unlp2024/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/wildre-7/pub/aclpub/proceedings.tgz
As before, the links for the main conference papers and tutorials:
https://softconf.com/lrec-coling2024/tutorials/pub/aclpub/proceedings.tgz
https://softconf.com/lrec-coling2024/papers/pub/aclpub/proceedings.tgz
We are still waiting for the W04, W20 and W21 workshops.
Can @anthology-assist explain what is wrong in the meta files, as written in https://github.com/acl-org/acl-anthology/issues/3175#issuecomment-2106342255?
I think they were looking at the versions before we fixed the metadata, so some were very weird.
@fcbond @arademaker Yes, this is about what's wrong with several workshops' meta files. Below is an example (from W07):
abbrev myconference
title Games and Natural Language Processing 2024
url https://lrec-conf.org/proceedings/lrec2024/games/
booktitle Proceedings of the 10th Workshop on Games and Natural Language Processing @LREC-COLING-2024
shortbooktitle TOBEFILLED-Proceedings of WMT
volume 1
month TOBEFILLED-June
year TOBEFILLED-1
location TOBEFILLED-Ann Arbor, Michigan
publisher Association for Computational Linguistics
Several fields in the meta file still hold placeholder or incorrect values, e.g. abbrev, shortbooktitle, month, year and location.
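A check like this could be automated before ingestion. The sketch below is only an illustration, assuming the meta file is plain "key value" lines as in the W07 example above; the function names are hypothetical, and treating "myconference" as an unfilled default is an assumption based on that single example.

```python
# Assumption: unfilled template values start with this prefix, as in the W07 example.
PLACEHOLDER_PREFIX = "TOBEFILLED"
# Assumption: "myconference" is a template default rather than a real abbreviation.
DEFAULT_ABBREV = "myconference"

SAMPLE_META = """\
abbrev myconference
title Games and Natural Language Processing 2024
url https://lrec-conf.org/proceedings/lrec2024/games/
booktitle Proceedings of the 10th Workshop on Games and Natural Language Processing @LREC-COLING-2024
shortbooktitle TOBEFILLED-Proceedings of WMT
volume 1
month TOBEFILLED-June
year TOBEFILLED-1
location TOBEFILLED-Ann Arbor, Michigan
publisher Association for Computational Linguistics
"""


def parse_meta(text):
    """Split each non-empty 'key value' line at the first space."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value.strip()
    return fields


def find_unfilled(fields):
    """Return the keys whose values still look like template placeholders."""
    problems = []
    for key, value in fields.items():
        if value.startswith(PLACEHOLDER_PREFIX):
            problems.append(key)
        elif key == "abbrev" and value == DEFAULT_ABBREV:
            problems.append(key)
    return problems


print(find_unfilled(parse_meta(SAMPLE_META)))
# ['abbrev', 'shortbooktitle', 'month', 'year', 'location']
```

This only catches values that were never filled in; a filled-in but wrong value (such as a mismatched publisher) would need a separate comparison against the expected metadata.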
I just checked W07; it still looks wrong.
For fixing the author accents, instead of reingesting, can we mark the problematic editor names (see the comment above for an example)?
For all workshops, please use the tgz files from the softconf links I posted here.
Hi,
Everything should be good to go.
Just in case it wasn't clear, we did not upload the latest versions for all but three proceedings (4, 20 and 21) as we think it is quicker to just pull from softconf. The final three are still here (they have done extra work to e.g. keep the pdf links alive).
https://drive.google.com/drive/folders/1vBHJGWrgxWyAo5HgvzE6QGnbt1Xrt-ES?usp=sharing
Everything else is in softconf, at the same workshop links listed earlier in this thread. As before, the main conference papers and tutorials are at the two softconf links (tutorials and papers) given above.
Please try with these and let us know if there are any problems.
Yours,
-- Francis Bond https://fcbond.github.io/
Hi,
How is this going?
The conference starts in 5 days; it would be great if it could be there by then (or, even better, a day before).
@fcbond @arademaker I believe all updated workshops have been reingested and missing workshops have been ingested. If anything is missing please let us know. Waiting on PR approval.
Hi,
I am very sorry, but there are still issues with the following (mainly address and publisher). They are all fixed already, but they need to be re-ingested:
cawl cl4health cogalex determit dlnld dmr ecnlp eurali finnlpkdfeconlp humeval isa ldl legal lt4hala mweud neusymbridge parlaclariniv rail rapid readi safety4convai sigul trac tutorials unlp wildre
That should be it!
We thought you would just do them all, otherwise we would have given you the (very long) list earlier. Sorry again.
Is it mainly address and publisher, or only address and publisher? If the latter, it will be easier to just fix the files here in the import.
Also, this second change raises some uncertainty about the ingestion. How certain are you that everything is correct at this point?
Hi,
There were also some small changes to titles, mainly for uniformity, so we would prefer to reingest everything.
We are very confident that everything is good now.
Francis
In the "files changed" tab I couldn't find the tutorials. Is that right? By the way, I am travelling to Italy tomorrow to attend LREC in person, so I may have limited Internet access until Sunday.
@arademaker Can you clarify what you mean? There is a tutorials volume in data/xml/2024.lrec.xml.
So fine. Which XML file holds the main conference papers?
The same file, data/xml/2024.lrec.xml.
@arademaker If you search, there are two <volume> tags. The first is for main conference papers, the second for tutorials.
- [x] Make sure to jointly list the main conference volumes with COLING using an additional <venue> tag.
The main conference volumes (main and tutorials) have been ingested in #3304 and are live on the Anthology: https://aclanthology.org/events/lrec-2024/
Workshops will be ingested here.
Should be good to go.
Hi,
you did not reingest:
- tutorials
- cogalex
- mweud (which has split venues)
- finnlpkdfeconlp (which has split venues)
- parlaclariniv (which has a new venue)
These have the wrong address (tutorials and cogalex) and publisher (cogalex, mweud, finnlpkdfeconlp and parlaclariniv).
Should be ELRA and ICCL:
- Association for Computational Linguistics: 2024.parlaclariniv-1
- Association for Computational Linguistics: 2024.finnlpkdfeconlp-1
- European Language Resources Association: 2024.mweud-1
- Association for Computational Linguistics: 2024.cogalex-1
Should be Torino, Italy:
- Torino, Italia: 2024.lrec-tutorials.1
- Torino, Italia: 2024.finnlpkdfeconlp-1
- Torino, Italia: 2024.lrec-tutorials
- Turin, Italy: 2024.cogalex-1
I am afraid I am travelling now, so it is hard to check whether we changed anything else for them; it would be much safer just to reingest them (as I requested earlier).
Sorry again for the extra work,
Yours,
-- Francis Bond https://fcbond.github.io/
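A listing like the one above could be produced by auditing the ingested XML directly. The sketch below is illustrative only: it assumes each file has a collection root with an id attribute, containing volume elements whose meta child holds publisher and address, which should be checked against the actual files under data/xml. The function name and sample data are hypothetical, and the expected address is passed in as a parameter because the city spelling was still under discussion in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the assumed layout of a file in data/xml.
SAMPLE_XML = """\
<collection id="2024.cogalex">
  <volume id="1">
    <meta>
      <booktitle>Example booktitle</booktitle>
      <publisher>Association for Computational Linguistics</publisher>
      <address>Turin, Italy</address>
    </meta>
  </volume>
</collection>
"""

# Expected values as requested in the message above; adjust once the spelling is settled.
EXPECTED = {"publisher": "ELRA and ICCL", "address": "Torino, Italy"}


def audit_collection(xml_text, expected):
    """Return (volume_id, field, found_value) for every field that differs from expected."""
    root = ET.fromstring(xml_text)
    collection_id = root.get("id")
    mismatches = []
    for volume in root.iter("volume"):
        meta = volume.find("meta")
        if meta is None:
            continue
        volume_id = f"{collection_id}-{volume.get('id')}"
        for field, wanted in expected.items():
            found = (meta.findtext(field) or "").strip()
            if found != wanted:
                mismatches.append((volume_id, field, found))
    return mismatches


for volume_id, field, found in audit_collection(SAMPLE_XML, EXPECTED):
    print(f"{found}: {volume_id} ({field})")
```

Running this over every 2024 collection file would give one list of volumes still needing a fix, instead of relying on manual spot checks.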
Just to note that the city name in English should be Turin and not Torino (that’s the Italian name).
This applies to all LREC-COLING venues.
- M
- Joint events do not get a new venue ID: finnlpkdfeconlp and mweud will be associated with their respective venues.
- parlaclarin is the venue ID, not parlaclariniv
- A search-and-replace on the XML files will be faster for fixing the city name:
perl -pi -e "s/Torino, Italia/Turin, Italy/g" 2024.*.xml
- [x] The 2024.coling.xml file needs to have all the workshops copied into it, so that they appear on its event page.
What is strange to me is that the address and publisher should all be the same in the softconf links we shared. We used the address Torino, following the website. For the publisher, we used the two acronyms “ELRA and ICCL”. I hope we are really using the latest version.
The location is Torino in all the PDFs. Let's be consistent.
At some stage we discussed it and decided to go with Torino; sorry, I can't find the discussion now.
Five of the proceedings have not yet been reingested.