DocEE
Which path to follow
First of all, thank you for the great dataset. If I want to use it, should I use the data on Google Drive or the pickle file in the GitHub repo?
I found some inconsistencies between the two resources. For example, the first data point in the dev split on Google Drive is:
[{'start': 111, 'end': 120, 'type': 'Location/Hospital', 'text': 'Gaza City'}, {'start': 127, 'end': 134, 'type': 'Perpetrator', 'text': 'Israeli'}, {'start': 135, 'end': 149, 'type': 'Death Reason', 'text': 'missile strike'}, {'start': 20, 'end': 38, 'type': 'Deceased', 'text': 'Abdel Aziz Rantisi'}, {'start': 99, 'end': 107, 'type': 'Date', 'text': 'Saturday'}]]
Yet, in the pickle file (from your GitHub repo), we got:
[{"start": 20, "end": 37, "type": "People", "text": "Abdel Aziz Rantisi"}, {"start": 99, "end": 106, "type": "Data", "text": "Saturday"}, {"start": 111, "end": 119, "type": "Location/Hospital", "text": "Gaza City"}, {"start": 127, "end": 133, "type": "Perpetrator", "text": "Israeli"}, {"start": 135, "end": 148, "type": "Causes", "text": "missile strike"}]'
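The end offsets differ by one in every span (the Google Drive values are one larger), and three of the types differ ('Deceased' vs. 'People', 'Date' vs. 'Data', 'Death Reason' vs. 'Causes').
This leaves me a little confused about which version to use.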
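For anyone comparing the two files, here is a minimal sketch of the check, using only the two annotation lists quoted above. The names `drive_ann`, `pickle_ann`, and `compare` are my own, and loading the actual files is left out because it depends on how they are laid out. On this one example it suggests that the Google Drive offsets use an exclusive end and the pickle offsets an inclusive end; I have not verified that for the whole dataset.

```python
# Annotations copied from the two sources quoted above (first dev example only).
drive_ann = [
    {"start": 111, "end": 120, "type": "Location/Hospital", "text": "Gaza City"},
    {"start": 127, "end": 134, "type": "Perpetrator", "text": "Israeli"},
    {"start": 135, "end": 149, "type": "Death Reason", "text": "missile strike"},
    {"start": 20, "end": 38, "type": "Deceased", "text": "Abdel Aziz Rantisi"},
    {"start": 99, "end": 107, "type": "Date", "text": "Saturday"},
]
pickle_ann = [
    {"start": 20, "end": 37, "type": "People", "text": "Abdel Aziz Rantisi"},
    {"start": 99, "end": 106, "type": "Data", "text": "Saturday"},
    {"start": 111, "end": 119, "type": "Location/Hospital", "text": "Gaza City"},
    {"start": 127, "end": 133, "type": "Perpetrator", "text": "Israeli"},
    {"start": 135, "end": 148, "type": "Causes", "text": "missile strike"},
]


def compare(drive, pkl):
    """Pair spans by start offset and report end-offset and type differences."""
    pkl_by_start = {a["start"]: a for a in pkl}
    rows = []
    for a in drive:
        b = pkl_by_start[a["start"]]
        rows.append(
            {
                "text": a["text"],
                "end_delta": a["end"] - b["end"],
                "drive_type": a["type"],
                "pickle_type": b["type"],
            }
        )
    return rows


rows = compare(drive_ann, pickle_ann)
for row in rows:
    print(row)

# Exclusive end: end - start equals the length of the span text.
drive_is_exclusive = all(
    a["end"] - a["start"] == len(a["text"]) for a in drive_ann
)
# Inclusive end: end - start + 1 equals the length of the span text.
pickle_is_inclusive = all(
    a["end"] - a["start"] + 1 == len(a["text"]) for a in pickle_ann
)
print("Google Drive uses exclusive ends:", drive_is_exclusive)
print("Pickle uses inclusive ends:", pickle_is_inclusive)
```

With exclusive ends, `document_text[a["start"]:a["end"]]` should return the span text directly, which is one practical reason to check the convention before slicing.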
Thank you in advance.
Sorry for the confusion. Please use the DocEE data from Google Drive (the first one in your description).
Thank you.
One additional question: the Google Drive data was created following the latest event schema, right?
Yes.