
Detect duplicate samples upon file upload

Open vdkkia opened this issue 3 years ago • 1 comments

  • Add the ability to detect duplicate samples within a project on upload, i.e. samples for which all sample_attributes are the same.
  • Let the user select what to do with the duplicate samples (skip, update).
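
A minimal sketch of the duplicate check described above, in Python purely for illustration. The `Sample` class, `attribute_key` and `split_duplicates` are hypothetical names, not part of SEEK. It assumes the existing samples have already been limited to one project and sample type, and that attribute values can be compared as strings:

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical stand-in for a registered sample; not SEEK's actual model.
    sample_id: int
    attributes: dict


def attribute_key(attributes):
    """Build an order-independent, hashable fingerprint of all attribute values.

    Values are compared as strings (an assumption), so 5 and "5" count as equal.
    """
    return tuple(sorted((name, str(value)) for name, value in attributes.items()))


def split_duplicates(existing, uploaded_rows):
    """Separate uploaded rows into new rows and rows that duplicate a sample.

    `existing` is assumed to hold only samples of one project and sample type.
    Returns (new_rows, duplicates); each duplicate is a (row, matching_sample)
    pair, so the caller can offer the user "skip" or "update" per the issue.
    """
    seen = {attribute_key(s.attributes): s for s in existing}
    new_rows, duplicates = [], []
    for row in uploaded_rows:
        match = seen.get(attribute_key(row))
        if match is None:
            new_rows.append(row)
        else:
            duplicates.append((row, match))
    return new_rows, duplicates
```

The function only reports duplicates; what to do with them is left to the caller, so the skip/update choice stays with the user.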

vdkkia avatar May 11 '21 12:05 vdkkia

The duplication would be checked within the context of the sample type and project.

stuzart avatar May 27 '21 13:05 stuzart

As part of Samples WG:

The scenario assumes a generally accepted “ID”. Whether it is a new UUID or the classical SEEK ID does not affect the scenario. Suggestion: ideally, the ID / UUID column in the Excel spreadsheet is locked so that it cannot be changed.

  1. Sample type with 1 Sample (Sample ID is 1) exists.

  2. Spreadsheet containing Sample of ID=1 is uploaded to the instance for Sample extraction. The spreadsheet has a column with the ID; for this example, the ID value is 1.

    • Spreadsheet is either downloaded from the instance (not possible currently, as a new feature for this is needed), or just a continuation of the offline filling of the downloaded spreadsheet by the User.
  3. Options

    1. ID column not changed. A row with ID=1 exists → Sample exists! To allow for updating of the Sample with ID=1, check the attribute values:
      • All values are the same: SEEK does nothing.
      • At least one attribute value changed (not the ID column): ask the user if Sample ID=1 should be updated. This is a batch operation: all changes or no changes.

    2. There is no value for ID → Sample does not exist: Create Sample. To avoid (erroneous) duplication of sample information by the User, an extra step can be added to check if that combination of values already exists in another registered Sample of that Sample Type:
      • Yes: “This Sample might already exist, are you sure you want to create it again?”
      • No: Create Sample.

    3. There is a value for ID that does not exist in the database → Sample does not exist. Throw error “This Sample does not exist”. A User is not allowed to set a desired value for ID.

Samples should be unique within a Sample Type across all items inside a Study.
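
The three options above can be sketched as a single decision function. This is only an illustration in Python under stated assumptions: `Action`, `classify_row`, the `"ID"` column name and the `registered` mapping (sample ID to attribute values, for one Sample Type) are all hypothetical and do not reflect SEEK's actual code:

```python
from enum import Enum


class Action(Enum):
    NOTHING = "nothing"                # option 1, all values unchanged
    ASK_UPDATE = "ask_update"          # option 1, at least one value changed
    CONFIRM_CREATE = "confirm_create"  # option 2, same values already registered
    CREATE = "create"                  # option 2, no matching values
    ERROR = "error"                    # option 3, unknown ID


def classify_row(row, registered, id_column="ID"):
    """Decide what to do with one uploaded spreadsheet row.

    `registered` maps sample ID -> attribute dict for a single Sample Type
    (hypothetical structure). `row` maps column name -> cell value.
    """
    raw_id = row.get(id_column)
    values = {k: v for k, v in row.items() if k != id_column}

    # Option 2: no ID given, so the sample is treated as new.
    if raw_id is None or raw_id == "":
        if any(values == attrs for attrs in registered.values()):
            return Action.CONFIRM_CREATE  # "This Sample might already exist..."
        return Action.CREATE

    # Option 3: users may not pick their own ID values.
    if raw_id not in registered:
        return Action.ERROR  # "This Sample does not exist"

    # Option 1: known ID; compare attribute values.
    if registered[raw_id] == values:
        return Action.NOTHING
    return Action.ASK_UPDATE
```

Because updates are meant to be a batch operation, a caller would presumably collect every `ASK_UPDATE` row first and apply all of them, or none, after a single confirmation.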

rabuono avatar Apr 11 '23 14:04 rabuono

Testing document: Samples download and upload - DataHub

floradanna avatar Aug 08 '23 13:08 floradanna