Forbid duplicate data with labels from being ingested

Open schittarath3-rti opened this issue 3 years ago • 6 comments

This throws a form error if duplicate data with old labels is ingested through the update data pipeline. It does not affect the data upload at project creation.

The error shows which rows in the uploaded data contain this problem.

schittarath3-rti avatar Jun 30 '22 16:06 schittarath3-rti

I think the part where it shows which rows are problem-causing is not user-friendly.

Currently, it shows the rows as CSV, but pandas has a to_html function for dataframes which could look better. It does not work well with Django form errors though, because the error shows the output as a string rather than as HTML.

schittarath3-rti avatar Jun 30 '22 16:06 schittarath3-rti

Current error message: image

schittarath3-rti avatar Jun 30 '22 16:06 schittarath3-rti

Also, what do you guys think about limiting the number of rows that are displayed? Currently, it shows all rows.

schittarath3-rti avatar Jun 30 '22 16:06 schittarath3-rti

I'd say list the first 5 or 10, then just say the number of remaining differences. The error message is a bit confusing. Are you saying the data is already there and labeled, or it's there at all?
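
Something along these lines, as a minimal sketch of the "first few, then a count" idea. The function name, the `(row_number, text)` tuple shape, and the message wording are all assumptions for illustration, not what the pipeline actually uses.

```python
def summarize_duplicates(rows, limit=5):
    """List the first `limit` duplicate rows, then count the rest.

    `rows` is assumed to be a list of (row_number, text) tuples.
    """
    shown = rows[:limit]
    remaining = len(rows) - len(shown)
    lines = [f"Row {number}: {text}" for number, text in shown]
    if remaining > 0:
        # Only mention a remainder when some rows were actually left out.
        lines.append(f"... and {remaining} more duplicate row(s).")
    return "\n".join(lines)


# Hypothetical example: 8 duplicate rows, only the first 5 are listed.
duplicate_rows = [(i, f"text {i}") for i in range(1, 9)]
message = summarize_duplicates(duplicate_rows, limit=5)
short_message = summarize_duplicates(duplicate_rows[:2], limit=5)
```

With a limit of 10 instead of 5 the call is the same, only `limit` changes.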

AstridKery avatar Jul 01 '22 14:07 AstridKery

Should mean that the data is there, either unlabelled or labelled.

schittarath3-rti avatar Jul 07 '22 15:07 schittarath3-rti

@AstridKery I changed it to "The following uploaded text + metadata combinations are already in the database and the uploaded labels will not be reflected in the database:". Would that make more sense?

schittarath3-rti avatar Jul 07 '22 15:07 schittarath3-rti