timescaledb icon indicating copy to clipboard operation
timescaledb copied to clipboard

[Bug]: could not find pathkey item to sort

Open bobozaur opened this issue 5 months ago • 1 comments

What type of bug is this?

Data corruption, Unexpected error

What subsystems and features are affected?

Compression, Policy

What happened?

I have two hypertables which we used to simply query like this:

SELECT * FROM my_table sa
INNER JOIN my_other_table sjr ON sa.id = sjr.other_id 
ORDER BY sjr.registered_on ASC, sjr.id ASC
LIMIT 50;

I encountered an issue when running this query that said could not find pathkey item to sort. The error message from the database framework I use also contained: file: Some(\"createplan.c\"), line: Some(6235), routine: Some(\"prepare_sort_from_pathkeys\").

Given the createplan.c and knowing this is a prepared statement (the actual query has arguments) I tried forcing the usage of a custom execution plan because the issue only happened with the generic execution plan after 5 runs of the prepared statement. While that seemed to fix the issue for a while it just was not something I could afford for this query because it gets run a lot and the only sensible thing to do is to force usage of the custom execution plan for this query alone in a transaction, but that implies multiple roundtrips to the database which hurts performance.

The error message also pointed me towards a sorting issue. Another successful way of dealing with this was setting enable_sort = off as a database parameter. But I was concerned with the performance issues that might arise from having this set at the database level as well, and while I could use transactions I'd run into the same concern as with forcing the usage of a custom execution plan.

I found this issue https://github.com/timescale/timescaledb/issues/2232#issuecomment-1855226609 which claimed that changing column datatypes from SMALLINT to INTEGER solved this issue. I did this and it temporarily solved it, but the issue would happen again in a couple of minutes.

I started looking into the latest database migrations searching for what might've changed regarding this hypertable and a couple of columns were added, a custom Postgres enum was renamed and some of its values renamed while some were added and compression on the hypertable was enabled. I disabled the compression policy on a whim and decompressed already compressed chunks and that solved the issue completely without other parameters being set on the database.

This is certainly regarding data, because before getting to these solutions I encountered this in a development database that I could dispose of. So I did that and got the database schema up to speed through the exact same database migrations, and I don't experience the issue there anymore. On other databases I can't just ditch the data, but hoped that the issue would be isolated to the development database. It unfortunately was not. Of note is that upgrading the database to the latest version did not help either.

I'd like to be able to enable compression on the hypertable again but we don't know what to do so that the queries still work. I tried re-creating the table and migrating all the data after making it a hypertable, even re-creating all the custom types in the database in the process but nothing works.

TimescaleDB version affected

2.16.1

PostgreSQL version used

16

What operating system did you use?

Linux X86_64

What installation method did you use?

Docker

What platform did you run on?

Amazon Web Services (AWS)

Relevant log output and stack trace

No response

How can we reproduce the bug?

I tried to create a minimal reproducible example but failed to do so. This seems to be tightly related to the data and data changes.

bobozaur avatar Sep 24 '24 08:09 bobozaur