JedAIToolkit

Null pointer when trying to load data using latest release

Open yeikel opened this issue 5 years ago • 6 comments

I am using the following release

And I am trying jedaiDesktopApp-1.1.jar with the following datasets (from the samples):

abtBuyIdDuplicates (for D1) and abtBuyProfiles (for the ground-truth file)

image

But I get the following error:

image

I tried with CSV files and I get the same error.

yeikel, Dec 11 '18 21:12

Hi! The serialized datasets are incompatible with older versions, due to a change in some Java classes. Try using the latest version and let me know if the problem is fixed.

gpapadis, Dec 14 '18 03:12

Hi! I am using the latest release.

I also tried CSV files and received the same errors.

yeikel, Dec 20 '18 16:12

Hello yeikel,

Can you please tell me exactly which CSV files you tried, so I can test them?

In any case, the latest release we have on GitHub right now (the one you linked above) does not contain the latest version of the code, which is why @gpapadis refers to it as an older version.

The current version of the code is still a work in progress, which is why we haven't put up a full "release" on GitHub yet. However, you can find a build of it here: https://drive.google.com/open?id=1W-ffcQZWnw0MIWluaBzyApsa7nqq5wWB, or build it yourself from the repositories.

This version should read the serialized files directly. It also lets you configure how the CSV is read, including the delimiter, which could fix the problem you are having with the CSV files too.

leots, Dec 24 '18 19:12
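A wrong delimiter is a common reason for a CSV file failing to load, so it can help to check which one a file uses before setting that option. The sketch below is not JedAI code: `DelimiterGuess`, `guess` and the candidate list are hypothetical names chosen for illustration, and it only inspects a single header line.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JedAI: guesses the delimiter of a CSV header line.
class DelimiterGuess {

    // Candidate delimiters to try (an assumption; extend as needed).
    static final List<Character> CANDIDATES = Arrays.asList(',', ';', '\t', '|');

    // Returns the candidate that occurs most often outside double quotes.
    // Falls back to ',' when no candidate occurs at all.
    static char guess(String headerLine) {
        char best = ',';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int count = 0;
            boolean inQuotes = false;
            for (int i = 0; i < headerLine.length(); i++) {
                char c = headerLine.charAt(i);
                if (c == '"') {
                    inQuotes = !inQuotes; // toggle on every double quote
                } else if (c == candidate && !inQuotes) {
                    count++;
                }
            }
            if (count > bestCount) {
                bestCount = count;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up header line, used only to demonstrate the call.
        System.out.println("[" + guess("id;name;description;price") + "]");
    }
}
```

The guessed character can then be entered as the delimiter when configuring the CSV reader.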

@leots Where can I find sample CSV files? And their format?

Unless I am using the wrong files or configuration, the serialized samples included in the documentation fail when I try them:

image image

yeikel, Dec 30 '18 19:12

The problem with the serialized datasets is that you are using the ground-truth file (abtBuyIdDuplicates) in the place of "Entity Profiles D1" and the profiles file as the "Ground-truth file". It should be the other way round. CSV datasets are available here: https://dbs.uni-leipzig.de/en/research/projects/object_matching/fever/benchmark_datasets_for_entity_resolution

gpapadis, Dec 31 '18 05:12

One more thing I can see is that you are selecting clean-clean entity resolution but haven't selected a second entity profiles dataset, so make sure you either select dirty ER or add a second dataset.

leots, Dec 31 '18 08:12
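The two configuration mistakes identified in this thread (swapped input files, and clean-clean ER with only one dataset) can be expressed as a small pre-flight check. This is a minimal sketch and not JedAI code: `ErConfigCheck`, `ErMode` and `validate` are hypothetical names, and the file-name heuristic simply assumes the naming of the sample files mentioned above ("...Profiles" for entity profiles, "...Duplicates" for ground truth).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check; none of these names come from JedAI.
class ErConfigCheck {

    enum ErMode { DIRTY, CLEAN_CLEAN }

    // Returns human-readable problems; an empty list means the selection looks consistent.
    static List<String> validate(ErMode mode, String profilesD1, String profilesD2, String groundTruth) {
        List<String> problems = new ArrayList<>();
        if (isBlank(profilesD1)) {
            problems.add("Entity Profiles D1 is required.");
        }
        // Rule from the thread: clean-clean ER needs two entity profile datasets.
        if (mode == ErMode.CLEAN_CLEAN && isBlank(profilesD2)) {
            problems.add("Clean-clean ER needs a second dataset; select dirty ER or add one.");
        }
        // Heuristic only: relies on the sample file naming seen in this thread.
        if (!isBlank(profilesD1) && profilesD1.toLowerCase().contains("duplicates")) {
            problems.add("D1 looks like a ground-truth file: " + profilesD1);
        }
        if (!isBlank(groundTruth) && groundTruth.toLowerCase().contains("profiles")) {
            problems.add("Ground truth looks like an entity profiles file: " + groundTruth);
        }
        return problems;
    }

    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // The selection from the original report: files swapped, clean-clean ER, no second dataset.
        validate(ErMode.CLEAN_CLEAN, "abtBuyIdDuplicates", null, "abtBuyProfiles")
                .forEach(System.out::println);
    }
}
```

With the files swapped back and dirty ER selected, `validate(ErMode.DIRTY, "abtBuyProfiles", null, "abtBuyIdDuplicates")` returns an empty list.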