Duplicate check during import marks articles from collection as possible duplicates

Open jorgman1 opened this issue 2 years ago • 1 comments

JabRef version

Latest development branch build (please note build date below)

Operating system

GNU / Linux

Details on version and operating system

JabRef 5.7--2022-06-01--943489e || Linux 5.10.0-14-amd64 amd64 || Java 18.0.1 || JavaFX unknown

Checked with the latest development build

  • [X] I made a backup of my libraries before testing the latest development version.
  • [X] I have tested the latest development version and the problem persists

Steps to reproduce the behaviour

  1. Copy two articles (InCollection) from the same book to a new database
  2. The second entry that is copied is marked as a possible duplicate, although author, title, and citation key all differ.

Appendix

Import_erroneous_duplicates

jorgman1 avatar Jun 02 '22 13:06 jorgman1

I think that I experienced a similar problem in the past.

claell avatar Jun 13 '22 16:06 claell

Can you please test with the latest development version? We recently improved the merging and duplicate detection handling.

We would like to ask you to use a development build from https://builds.jabref.org/main and report back if it works for you. Please remember to make a backup of your library before trying out this version.

Siedlerchr avatar Apr 16 '23 17:04 Siedlerchr

I still have the problem with the latest development version.

JabRef 5.10--2023-04-16--d47ed31 Linux 4.4.0-53-generic amd64 Java 19.0.2 JavaFX 20+19

Screenshot 20230417_Screenshot

jorgman1 avatar Apr 17 '23 14:04 jorgman1

Would you mind sharing the bib entries for testing?

Siedlerchr avatar Apr 17 '23 15:04 Siedlerchr

I think I found a mistake in my database. The entries contained only the book's ISBN, so all of them had the same ISBN. Removing the ISBN from the entries solves the problem. Moreover, after adding a DOI to each entry, the entries are no longer recognized as duplicates. This only holds if both entries have DOIs; if one has a DOI and the other does not, they are still recognized as duplicates (due to the ISBN?).

Hence, I think it is my mistake. I don't think it is correct to have all entries with the same ISBN, which refers to the whole book.

These are five entries: Collection_test.txt

jorgman1 avatar Apr 17 '23 16:04 jorgman1

You could handle such cases with entry links (https://docs.jabref.org/advanced/entryeditor/entrylinks). For example, create one Book entry and link the InBook entries to it; then put the ISBN in the Book entry, and it will show up in the references for each InBook entry as well. That way you avoid duplicate information.
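
To illustrate the idea behind entry links, here is a minimal sketch of crossref-style field inheritance. It is not JabRef's implementation: the class `CrossrefSketch`, the method `resolveField`, the map-based entry model, the citation keys, and the placeholder ISBN are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model: an entry is a map from field name to value,
// and a library is a map from citation key to entry.
final class CrossrefSketch {

    /**
     * Returns the entry's own value for the field if present; otherwise the
     * value from the parent entry named in the entry's "crossref" field.
     */
    static Optional<String> resolveField(Map<String, String> entry,
                                         String field,
                                         Map<String, Map<String, String>> library) {
        String own = entry.get(field);
        if (own != null) {
            return Optional.of(own);
        }
        String parentKey = entry.get("crossref");
        if (parentKey == null) {
            return Optional.empty();
        }
        Map<String, String> parent = library.get(parentKey);
        if (parent == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(parent.get(field));
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> library = new HashMap<>();
        // The ISBN lives only in the book entry (placeholder value).
        library.put("book2020", Map.of("title", "Example Book", "isbn", "978-0-00-000000-2"));
        Map<String, String> chapter = Map.of("title", "Chapter One", "crossref", "book2020");
        library.put("chapter1", chapter);

        // The chapter has no ISBN of its own, so the book's ISBN is used.
        System.out.println(resolveField(chapter, "isbn", library).orElse("(none)"));
        // The chapter's own title takes precedence over the book's title.
        System.out.println(resolveField(chapter, "title", library).orElse("(none)"));
    }
}
```

With this layout, the chapter entries no longer carry an identical ISBN field themselves, so a comparison of the stored fields has nothing to match on.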

Siedlerchr avatar Apr 17 '23 17:04 Siedlerchr

> I think I found a mistake in my database. I only had the book's ISBN in the entries.

Thank you for the hint. IMHO this is a bug in the duplicate detection. I "unzipped" the explanation at https://github.com/JabRef/jabref/pull/9769#issuecomment-1512610441.

> Hence, I think it is my mistake. I don't think it is correct to have all entries with the same ISBN, which refers to the whole book.

This is what ISBNs are used for. Depending on the BibTeX style required by the publisher, either the ISBN or the DOI is printed, or maybe both. If only the ISBN is printed, the information is IMHO useful for finding the entry.

koppor avatar Apr 18 '23 07:04 koppor

@AbdAlRahmanGad Would it be theoretically and technically possible to only remove the ISBN duplicate detection for InBook or InCollection in your pull-request?

ThiloteE avatar Apr 16 '24 19:04 ThiloteE

> @AbdAlRahmanGad Would it be theoretically and technically possible to only remove the ISBN duplicate detection for InBook or InCollection in your pull-request?

I think it would be possible.

AbdAlRahmanGad avatar Apr 16 '24 19:04 AbdAlRahmanGad

@AbdAlRahmanGad For that, add a test case and then with trial and error fix the code ^^

koppor avatar Apr 16 '24 21:04 koppor
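
The rule discussed above (ignore a shared ISBN when either entry is an InBook or InCollection) could be sketched roughly as follows. This is only an illustration under stated assumptions, not JabRef's actual duplicate check: the class `IsbnDuplicateSketch`, its methods, and the map-based entry model are hypothetical, and only the identifier fields are compared here, whereas a real check would also weigh author, title, and other fields.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: identifier-based duplicate test that skips the ISBN
// for entry types whose ISBN identifies the containing book.
final class IsbnDuplicateSketch {

    private static final Set<String> PART_OF_BOOK_TYPES = Set.of("inbook", "incollection");

    /** A shared ISBN is decisive only if neither entry is a part of a book. */
    static boolean isbnIsDecisive(String typeA, String typeB) {
        return !PART_OF_BOOK_TYPES.contains(typeA.toLowerCase(Locale.ROOT))
                && !PART_OF_BOOK_TYPES.contains(typeB.toLowerCase(Locale.ROOT));
    }

    /** True if both entries have a non-blank value for the field and the values match. */
    static boolean sameNonEmpty(Map<String, String> a, Map<String, String> b, String field) {
        String x = a.get(field);
        String y = b.get(field);
        return x != null && y != null && !x.isBlank()
                && x.trim().equalsIgnoreCase(y.trim());
    }

    static boolean isDuplicateByIdentifier(String typeA, Map<String, String> a,
                                           String typeB, Map<String, String> b) {
        if (sameNonEmpty(a, b, "doi")) {
            return true;
        }
        return isbnIsDecisive(typeA, typeB) && sameNonEmpty(a, b, "isbn");
    }

    public static void main(String[] args) {
        // Two chapters of the same book: same (placeholder) ISBN, different titles.
        Map<String, String> first = Map.of("title", "Chapter A", "isbn", "978-0-00-000000-2");
        Map<String, String> second = Map.of("title", "Chapter B", "isbn", "978-0-00-000000-2");

        // Not flagged: the ISBN is ignored for InCollection entries.
        System.out.println(isDuplicateByIdentifier("InCollection", first, "InCollection", second));
        // Flagged: for Book entries the shared ISBN still counts.
        System.out.println(isDuplicateByIdentifier("Book", first, "Book", second));
    }
}
```

The two calls in `main` correspond to the kind of test case suggested above: the first models the scenario from this issue and should not report a duplicate, while the second guards against disabling the ISBN comparison for whole books.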