Can the processed versions of these two datasets be used in the same way as the DocRED dataset?

Open wucui5 opened this issue 1 year ago • 4 comments

Hello, sorry to bother you. I would like to ask whether the CDR and GDA datasets produced by your data_processing program can be used with other document-level relation extraction models, in the same way as the publicly available DocRED dataset. After processing, I noticed that the files are in .data format, while the DocRED dataset is in .json format, so the two formats appear to differ. Any help would be greatly appreciated. [image] [image]

wucui5 · Jul 20 '23 12:07

[image] The data I have processed looks strange to me, and I'm not sure how to use it in other models.

wucui5 · Jul 21 '23 02:07

The data_processing script produces a specific data format used for this paper. You can convert the data to a format of your choice by editing the script, if you want compatibility with other datasets.

Hope that helps.

fenchri · Jul 21 '23 08:07

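[image] The data I have processed looks strange to me, and I'm not sure how to use it in other models.

Could you please provide the resulting dataset? I am reproducing this project.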

z20021009 · Dec 17 '23 14:12

Hello,

I translated your comment as "Can you provide the data set? I'm reproducing this project." As mentioned above, you can edit the process.py script (https://github.com/fenchri/edge-oriented-graph/blob/master/data_processing/process.py) to convert the data to a different format that suits you better. The current script produces the data required for this project.
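For anyone attempting such a conversion, here is a minimal sketch of what the output side might look like. It is not part of this repository and does not read the .data files: it assumes you have already parsed one document into token lists, entity mentions and relation triples, and it only shows how those could be arranged into a DocRED-style JSON record ("title", "sents", "vertexSet", "labels"). The function name to_docred, the input layout, and the ids and label in the toy example are all made up for illustration.

```python
import json


def to_docred(doc_id, sentences, entities, relations):
    """Arrange one already-parsed document as a DocRED-style dict.

    Hypothetical input layout (an assumption, not the .data format):
      sentences: list of token lists, one per sentence
      entities:  dict entity_id -> {"type": str,
                                    "mentions": [(sent_id, start, end), ...]}
                 where start/end are token offsets inside the sentence,
                 end exclusive
      relations: list of (head_entity_id, tail_entity_id, label)
    """
    # DocRED refers to entities by their position in vertexSet.
    index = {eid: i for i, eid in enumerate(entities)}

    vertex_set = []
    for ent in entities.values():
        mentions = []
        for sent_id, start, end in ent["mentions"]:
            mentions.append({
                "name": " ".join(sentences[sent_id][start:end]),
                "sent_id": sent_id,
                "pos": [start, end],
                "type": ent["type"],
            })
        vertex_set.append(mentions)

    # No evidence sentences are assumed to be available, so the list is empty.
    labels = [
        {"h": index[h], "t": index[t], "r": label, "evidence": []}
        for h, t, label in relations
    ]
    return {"title": doc_id, "sents": sentences,
            "vertexSet": vertex_set, "labels": labels}


# Toy document with placeholder ids and a placeholder relation label.
example = to_docred(
    "doc-1",
    [["Aspirin", "can", "cause", "gastric", "ulcers", "."]],
    {
        "C1": {"type": "Chemical", "mentions": [(0, 0, 1)]},
        "D1": {"type": "Disease", "mentions": [(0, 3, 5)]},
    },
    [("C1", "D1", "CID")],
)
print(json.dumps([example], indent=2))
```

A DocRED-style file is a JSON list of such records, so the remaining work would be to fill these inputs from whatever process.py computes and to map the relation labels to the names the target model expects.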

Hope this helps.

fenchri · Dec 18 '23 13:12