
GROUP BY error when setting compress_segmentby with an enum column

zhihanyue opened this issue 3 years ago · 3 comments

Relevant system information:

  • OS: Ubuntu 20.04
  • PostgreSQL version: 13.3
  • TimescaleDB version: 2.4.0
  • Installation method: apt install

Describe the bug

When setting compress_segmentby with an enum column enum_col, a query with GROUP BY enum_col raises an error.

To Reproduce

  1. Create hypertable and insert data
CREATE TYPE an_enum_type AS ENUM ('home', 'school');

CREATE TABLE test (
	time timestamp NOT NULL,
	enum_col an_enum_type NOT NULL,
	x double precision
);

SELECT create_hypertable(
    'test', 'time',
    chunk_time_interval => INTERVAL '1 year'
);
INSERT INTO test VALUES ('2001-01-01 00:00', 'home', 1), ('2001-01-01 01:00', 'school', 2), ('2001-01-01 02:00', 'home', 3);

SELECT enum_col FROM test GROUP BY enum_col;  -- Here, GROUP BY enum_col works
  2. Enable compression (segment by the enum_col)
ALTER TABLE test SET (
	timescaledb.compress,
	timescaledb.compress_segmentby = 'enum_col',
	timescaledb.compress_orderby = 'time'
);
SELECT add_compression_policy('test', INTERVAL '1 years');
  3. Wait for the compression job to run (or compress the chunks manually; see the sketch after these steps)
  4. GROUP BY the enum column
SELECT enum_col FROM test GROUP BY enum_col;
-- ERROR:  operator 0 is not a valid ordering operator
-- SQL state: XX000
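
For a quicker reproduction, the chunks can be compressed manually instead of waiting for the policy job. A minimal sketch, assuming the test table from above (compress_chunk, show_chunks and chunk_compression_stats are standard TimescaleDB functions, but the exact output columns may differ by version):

-- compress every not-yet-compressed chunk of the hypertable right away
SELECT compress_chunk(c, if_not_compressed => true) FROM show_chunks('test') AS c;

-- optionally confirm the chunks are compressed before re-running the GROUP BY
SELECT chunk_name, compression_status FROM chunk_compression_stats('test');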

Expected behavior

No error

Actual behavior

The query raises the error shown above.


zhihanyue · Aug 14 '21 17:08

We just encountered the same issue. This bug is quite annoying and limiting. Not sure why it is categorized as a feature request.

Also, the error message is confusing at best. If this is truly the expected behavior, a better error message and a warning in the documentation not to use enum columns in compress_segmentby would be in order.

mwalser · Dec 22 '21 08:12

I also stumbled across this and agree with mwalser that the error message is very irritating. Please mention it in the docs or, better, fix it 😄

This also happens when the enum column is involved in DISTINCT queries, for example: SELECT DISTINCT enum_col FROM test;
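
A workaround that may avoid the failing ordering-operator lookup (an assumption on my part, not verified against compressed chunks here) is to cast the enum to text in the grouping expression, at the cost of losing the segmentby optimization:

-- group/deduplicate on the text representation instead of the enum itself
SELECT DISTINCT enum_col::text FROM test;
SELECT enum_col::text FROM test GROUP BY enum_col::text;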

ydylla · Dec 22 '21 21:12

select create_hypertable('device_stat', 'ts', 'symbol', 2, chunk_time_interval => interval '4 hour');

-- compression policies
alter table device_stat set ( timescaledb.compress, timescaledb.compress_segmentby = 'symbol, vendor_id');
select add_compression_policy('device_stat', interval '4 hour');

I initially configured my db like this. ~I wondered whether this happens when the chunk time interval and recompression interval are close to each other.~ Either way, this seems more like a bug than a feature request.

EDIT: The workaround was to decompress the chunks and disable compression for the hypertable (rough sketch below).
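
Roughly, that workaround looks like the following sketch, assuming the device_stat setup above (verify the steps yourself; disabling compression requires all chunks to be decompressed first):

-- stop the policy so chunks are not recompressed in the background
SELECT remove_compression_policy('device_stat');

-- decompress every compressed chunk of the hypertable
SELECT decompress_chunk(c, if_compressed => true) FROM show_chunks('device_stat') AS c;

-- then disable compression on the hypertable entirely
ALTER TABLE device_stat SET (timescaledb.compress = false);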

Swoorup · Jul 04 '22 04:07