grist-core icon indicating copy to clipboard operation
grist-core copied to clipboard

Create unique index on a column

Open remileblond67076 opened this issue 2 years ago • 17 comments

We often need to define a unique constraint on columns, in order to prohibit the creation of duplicates in a table.

This would correspond to adding a unique index to a column in the SQLLite database.

For example :

CREATE TABLE table_name(
   ...,
   column_name type UNIQUE,
   ...
);

Or

CREATE UNIQUE INDEX ux_unique_name ON table_name(column_name );

Enabling this constraint could be set using a checkbox in the column configuration interface.

remileblond67076 avatar Oct 24 '23 08:10 remileblond67076

Grist-unique

remileblond67076 avatar Oct 24 '23 08:10 remileblond67076

@remileblond67076 I have started working on this issue.

Regarding the UI:

  • Our UX expert, Aurélie, suggested that this option could be enabled for any type of columns
    • Even for lists, which would be considered as unique sets (e.g. if a cell contains A, B, we cannot fill A, B nor B, A in another cell of this column);
    • We are hesitant to place this option either near to the column name / description or with the type configuration;
  • I agree that the checkbox would be meaningful, however, I only see buttons to activate options on the column configuration panel;
  • But my attempts using the button is a bit awkward in this context, so despite the previous, I would probably go with a checkbox;

2023-10-24_10-35

fflorent avatar Oct 24 '23 08:10 fflorent

cc'ing @kbizien

fflorent avatar Oct 24 '23 08:10 fflorent

Our UX expert, Aurélie, suggested that this option could be enabled for any type of columns

Would it be for any type of column, or any type of string column, e.g. text, choice, and choice list?

Trying to think through the practical applications of uniqueness on numeric, boolean and attachment columns. Date and datetime are also unclear to me -- maybe there's a case, but usually not. Reference and reference list also become hazier.

As for design, the table tab in the creator panel has an example of checkboxes.

image

Re: uniqueness configuration design, if uniqueness were supported for all column types, it might make more sense to put the configuration in the column behavior section. The top section is about label, id and description. Adding uniqueness there is a bit surprising to me. image

If uniqueness is not supported for all columns, what do you think of placing uniqueness configuration after formatting and styling? This is currently what follows, and the space isn't used very well. Kevin might have better ideas. image

anaisconce avatar Oct 24 '23 12:10 anaisconce

Databases support unique indices on a combination of columns. For example:

CREATE UNIQUE INDEX idx_student_course
ON student_courses (student_id, course_id);

(i.e. a course enrollments table can only have a single enrollment for a particular student for a particular course).

As with SQL syntax, it makes sense to have a per-column way to say "make this column unique", but perhaps we can design it early on to support combinations of columns, so that the single-column UI is just a shortcut to creating a unique index.

Also, just wanted to add an observation: unique indices have a second benefit (besides enforcing uniqueness) of making certain operations much more efficient. So this feature will come in useful in the future when making Grist work for larger data (#43).

dsagal avatar Oct 24 '23 15:10 dsagal

Small implementation note. Grist has an obscure "on-demand table" feature that uses indexes, and some of the code involved would need a small tweak to be aware of any other indexes added. Specifically: https://github.com/gristlabs/grist-core/blob/e51c1b6b9274dbc22414824dfd823494f446b7de/app/server/lib/DocStorage.ts#L1499-L1532

So if someone makes a proof of concept and sees Grist continually deleting the index they are carefully adding, that will be why. It is no big deal, should be a small change.

paulfitz avatar Oct 24 '23 15:10 paulfitz

Would it be for any type of column, or any type of string column, e.g. text, choice, and choice list?

For all of the types of column available, including the boolean (even if the use case is probably not interesting), integers / numbers, text, choice / choice list, ... I just wonder if that would work well with attachment columns (if that's not relevant, may it also cause performance issue? IDK).

Trying to think through the practical applications of uniqueness on numeric, boolean and attachment columns. Date and datetime are also unclear to me -- maybe there's a case, but usually not. Reference and reference list also become hazier.

For references: say you use Grist as a task management for a team. You have 2 tables: People and Tasks. The team members can pick tasks by assigning themselves through the Tasks table. Also you don't want them to pick several tasks at the same time (they must unassign themselves before taking another task).

For dates, I suppose you may want to fill a calendar with one item per date.

For the other types, I admit I don't have much relevant use cases.

As for design, the table tab in the creator panel has an example of checkboxes.

Yes, I was thinking of the column config panel separately, not including the table config panel... Though I am probably overthinking about this, you are probably right and we should go with checkboxes.

I'll try to come back with some mockups soon so we can stimulate the discussion.

As with SQL syntax, it makes sense to have a per-column way to say "make this column unique", but perhaps we can design it early on to support combinations of columns, so that the single-column UI is just a shortcut to creating a unique index.

Right, when you say "perhaps we can design it early on to support combinations of columns", you suggest that technically the backend supports this combination so we pave the way for future UI enhancements?

fflorent avatar Oct 24 '23 16:10 fflorent

From my point of view, it would be great to be able to take advantage of SQLLite's uniqueness constraints. I'm not sure there's much sense in putting a uniqueness constraint on a Boolean or a list of values / references. But who can do more can do less... ;-)

remileblond67076 avatar Oct 26 '23 15:10 remileblond67076

An additional we have a user whose need is to ensure uniqueness of entries with case insensitivity for emails ([email protected] should not be added when [email protected] already exists).

fflorent avatar Jan 02 '24 17:01 fflorent

This would be great!

emanuelegissi avatar Feb 09 '24 09:02 emanuelegissi

I've made this proposition keeping the checkbox, but make it disabled when type of column is not yet selected, everything detailed here https://www.figma.com/design/wcpetFt6aOKzTszcvPPWLQ/%5B05%2F24%5D-Grist-Design?node-id=294-15720&t=9hvbpy8Fg7eoAurj-1

Image

lusebille avatar Jun 27 '24 13:06 lusebille

For UI, I'd like to come back to this wish: "design it early on to support combinations of columns, so that the single-column UI is just a shortcut to creating a unique index."

I propose the following:

  1. Implement support for unique indexes similar to SQLite's support.

  2. Introduce "UNIQUE INDEXES" section in the creator panel's "Table" tab, under "Data" subtab.

  3. Use UI similar to Sort UI (in the "Sort & Filter" subtab):

    Screenshot 2024-07-09 at 1 08 15 AM

    a. What's missing is that, unlike for Sort, we need to be able to add/remove indexes, give them names, and add/remove columns within each one. b. The options menu used for advanced sort options can be used for advanced index options, which can be added in follow-ups (e.g. case sensitivity, or treatment of empty values).

  4. For the simple (and common) case of a single-column unique constraint, a checkbox in a column's creator panel as in @lusebille's design, is mostly sufficient, and should be equivalent to creating a unique index with an auto-generated name. a. It's a heavy operation (esp. for large tables), so a simple toggle might not be the best UX. It may be better to click a link that opens a popup, asks for confirmation, and shows a spinner. b. Single-column indexes should be reflected both in "Table" tab - "Data" subtab; and in the column creator panel; regardless of how it was created.

  5. With both UI versions, if an index is created and there are duplicates in the column (or combination of columns), there should be either a question to the user, or the operation should be disallowed.

dsagal avatar Jul 09 '24 05:07 dsagal

I've made changes on figma according the specifications above ( I will present you next UX meeting )

lusebille avatar Aug 20 '24 13:08 lusebille