[Bug]: compress_chunk_time_interval does not work with BIGINT
What type of bug is this?
Configuration, Incorrect result
What subsystems and features are affected?
Compression
What happened?
If the timestamp column of a hypertable is a BIGINT, there is no way to use compress_chunk_time_interval (https://docs.timescale.com/api/latest/compression/alter_table_compression/). I would expect to set timescaledb.compress_chunk_time_interval to a BIGINT value, but it does not work.
TimescaleDB version affected
2.14.2
PostgreSQL version used
15.6
What operating system did you use?
What installation method did you use?
Not applicable
What platform did you run on?
Timescale Cloud
Relevant log output and stack trace
No response
How can we reproduce the bug?
create table if not exists public.device_metrics(
timestamp bigint,
value double precision
);
SELECT create_hypertable('public.device_metrics', 'timestamp', chunk_time_interval => 86400000000000); -- 1 day
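For reference, that constant is one day expressed in nanoseconds, since the column carries epoch nanoseconds:
SELECT 86400::bigint * 1000000000 AS one_day_ns; -- 86400 s * 10^9 = 86400000000000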
INSERT INTO public.device_metrics(timestamp, value)
SELECT
(EXTRACT(EPOCH FROM gs) * 1e9)::BIGINT, -- epoch in nanoseconds
random()*100
FROM generate_series(
'2022-01-01 00:00:00'::timestamp,
'2022-06-01 00:00:00'::timestamp,
'1 minute'::interval
) AS gs;
With 1-day chunks, this produces 152 chunks (151 full days of data plus the single row at 2022-06-01 00:00):
SELECT count(*) FROM timescaledb_information.chunks
WHERE hypertable_schema = 'public' and hypertable_name = 'device_metrics';
count
-------
152
(1 row)
Prepare the table to be compressed:
CREATE OR REPLACE FUNCTION unix_nano_now() RETURNS BIGINT LANGUAGE SQL IMMUTABLE AS $$ SELECT extract(epoch from now())::BIGINT * 1000000000 $$; -- nanoseconds, i.e. seconds * 10^9
SELECT set_integer_now_func('public.device_metrics', 'unix_nano_now', replace_if_exists => true);
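A quick sanity check that the function really returns the current epoch in nanoseconds (my own check, not required for the repro):
-- both sides use the same transaction timestamp, so this should return true
SELECT unix_nano_now() / 1000000000 = extract(epoch from now())::BIGINT AS is_nanoseconds;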
Enable compression:
1. Try to use BIGINT first:
ALTER TABLE public.device_metrics SET (timescaledb.compress,
timescaledb.compress_chunk_time_interval = 604800000000000
); -- 7 days
ERROR: invalid value for timescaledb.compress_chunk_time_interval '604800000000000'
HINT: timescaledb.compress_chunk_time_interval must be a valid interval
2. Try to use TEXT:
ALTER TABLE public.device_metrics SET (timescaledb.compress,
timescaledb.compress_chunk_time_interval = '7 days'
);
WARNING: compress chunk interval is not a multiple of chunk interval, you should use a factor of chunk interval to merge as much as possible
ALTER TABLE
SELECT
total_chunks,
number_compressed_chunks,
pg_size_pretty(before_compression_total_bytes) as before_compression,
pg_size_pretty(after_compression_total_bytes) as after_compression
FROM hypertable_compression_stats('public.device_metrics');
total_chunks | number_compressed_chunks | before_compression | after_compression
--------------+--------------------------+--------------------+-------------------
152 | 0 | |
(1 row)
Looks like it worked. Compression enabled. Now we'll compress the chunks and see what happens.
Compress all chunks manually:
DO $$
DECLARE
    chunk_record regclass; -- show_chunks() returns regclass, not text
BEGIN
    FOR chunk_record IN SELECT * FROM show_chunks('public.device_metrics')
    LOOP
        PERFORM public.compress_chunk(chunk_record, if_not_compressed => true);
    END LOOP;
END $$;
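Equivalently, a single statement should do the same (a shorter variant of the loop above):
-- compress every chunk of the hypertable, skipping already-compressed ones
SELECT public.compress_chunk(c, if_not_compressed => true)
FROM show_chunks('public.device_metrics') AS c;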
We still have 152 compressed chunks. With a 7-day compress interval over 1-day chunks it should be roughly 7 times fewer (about 22):
SELECT
total_chunks,
number_compressed_chunks,
pg_size_pretty(before_compression_total_bytes) as before_compression,
pg_size_pretty(after_compression_total_bytes) as after_compression
FROM hypertable_compression_stats('public.device_metrics');
total_chunks | number_compressed_chunks | before_compression | after_compression
--------------+--------------------------+--------------------+-------------------
152 | 152 | 25 MB | 12 MB
(1 row)
Since the docs (https://docs.timescale.com/api/latest/compression/alter_table_compression/) show no examples with BIGINT, I assume compress_chunk_time_interval does not work with this type.
Yeah, I guess it's not possible to configure this properly right now: compress_chunk_time_interval is always an Interval, and for integer columns it is converted to microseconds, but this table uses nanosecond units, which is where the mismatch comes from. To fix it, we could treat it the same way we treat chunk_time_interval, which is a bigint.
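To make the mismatch concrete: '7 days' stored as microseconds, but then read against a nanosecond column, shrinks to about ten minutes, far below the 1-day chunk interval, so no chunks ever merge:
-- '7 days' converted to microseconds, as the setting is stored internally:
SELECT (EXTRACT(EPOCH FROM INTERVAL '7 days') * 1e6)::bigint AS seven_days_us;
-- 604800000000
-- the same number interpreted as nanoseconds is only ~10 minutes:
SELECT 604800000000 / 1e9 / 60.0 AS minutes;
-- 10.08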
For the time being, I have a dumb workaround:
ALTER TABLE public.device_metrics SET (timescaledb.compress,
timescaledb.compress_chunk_time_interval = '7000 days'
);
Just increase the interval 1000x, so that when it's converted to microseconds, you get the desired value in nanoseconds.
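The arithmetic checks out: '7000 days' in microseconds is exactly 7 days in nanoseconds:
SELECT (EXTRACT(EPOCH FROM INTERVAL '7000 days') * 1e6)::bigint AS stored_us,
       7 * 86400 * 1000000000::bigint AS seven_days_ns;
-- both: 604800000000000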
This should at least be documented somewhere. We tried to get timescaledb + timestamp9 working with nanosecond data sources and spent tons of time banging our heads against the wall until we found this issue.
Like the people above me said, it turns out we had to multiply everything by 1000, otherwise the database crashes or gets extremely slow. This applies to create_hypertable, add_compression_policy, and compress_chunk_time_interval.
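If I understand the docs correctly, add_compression_policy takes compress_after in the time column's own units for integer columns (relying on the integer_now function), so for this nanosecond table it would look something like this (untested sketch):
SELECT add_compression_policy('public.device_metrics',
       compress_after => 604800000000000::BIGINT); -- 7 days in nanoseconds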