timescaledb icon indicating copy to clipboard operation
timescaledb copied to clipboard

Support unique indexes without partitioning column

Open thaxy opened this issue 8 years ago • 4 comments

I am currently successfully using timescaledb with the following schema:

         Column         |            Type             | Collation | Nullable |    Default    
------------------------+-----------------------------+-----------+----------+---------------
 station_id             | integer                     |           | not null | 
 timestamp              | timestamp without time zone |           | not null | 
 s0                     | integer                     |           |          | 
 s1                     | integer                     |           |          | 

Indexes:
    "measurements_station_id_timestamp_idx" btree (station_id, "timestamp" DESC)

I need to add a new column with the name station_measurement_id (of the type bigint) to the table. Together with the station_id it would be a primary key for a single measurement.

Since my application needs to query for records by:

  • station_id and timestamp
  • station_id and station_measurement_id

I wanted to add an index to the table with: ALTER TABLE measurements ADD PRIMARY KEY (station_id, station_measurement_id); Which results in the following error: ERROR: cannot create a unique index without the column "timestamp" (used in partitioning)


I have looked at the docs and found the arguments partitioning_column and number_partitions but couldn't find more detail about them. is it somehow possible to realize my intention? I would be grateful for any hint. I read in in a similar issue that timescaledb doesn't maintain global uniqueness over the chunks. I just want to make sure that there is no other way of speeding up such a query before working on a different approach.

thaxy avatar Mar 28 '18 09:03 thaxy

So is the data really unique on (station_id, station_measurement_id)? i.e. There cannot be two entries with the same station_id, station_measurement_id but different timestamps?

I suspect that what's really unique is (timestamp, station_id, station_measurement_id). If that's the case you can do: ALTER TABLE measurements ADD PRIMARY KEY (station_id, station_measurement_id, timestamp desc);.

You can also add a regular (non-primary key) index on just (station_id, station_measurement_id).

cevian avatar Mar 28 '18 15:03 cevian

Sorry for the late reply. It shouldn't be possible to add more than one entry with the same station_id and station_measurement_id no matter if the timestamp matches or not.

To speed up the quries mentioned above I have added the following index:

  • "i1" btree (station_id, station_measurement_id, "timestamp" DESC)

I verified the execution plans used index scans with EXPLAIN ANALYZE. As you can see in my original post I tried to achieve the uniqueness with ALTER TABLE measurements ADD PRIMARY KEY (station_id, station_measurement_id); but ran into the error mentioned above. Is there a way to achieve uniqueness in this case with timescaledb?

thaxy avatar Apr 09 '18 08:04 thaxy

There is not currently a way to achieve uniqueness in this case, as we need the time column included in the index in order to enforce constraints between chunks. I'm going to mark it as an enhancement that we can add though because I do understand the importance of the use case.

davidkohn88 avatar May 14 '18 15:05 davidkohn88

CREATE TABLE IF NOT EXISTS "data" (
  "id" UUID NOT NULL,
  "sourceId" UUID NOT NULL REFERENCES "sources" ("id") ON DELETE CASCADE ON UPDATE CASCADE,
  "createdAt" TIMESTAMP WITH TIME ZONE NOT NULL,
  "updatedAt" TIMESTAMP WITH TIME ZONE NOT NULL,
  "geometry" GEOGRAPHY,
  "data" JSONB NOT NULL,
  "boundaryIds" TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[],
  PRIMARY KEY ("sourceId", ("data"->>'id'))
);

SELECT create_hypertable('data', 'createdAt', 'sourceId', chunk_target_size => '1GB');

Currently yields cannot create a unique index without the column "createdAt" (used in partitioning) but I think timescale should be able to support this case just fine (uniqueness is in each partition by data id, not globally).

If the secondary partition attribute is in the primary key, is it possible to lift the restriction?

yocontra avatar Nov 14 '18 17:11 yocontra