timescaledb
[Bug]: Self-reference check to add_data_node does not work. Deadlock still happens.
What type of bug is this?
Crash, Locking issue
What subsystems and features are affected?
Access node, Multi-node
What happened?
If add_data_node is attempted with the same instance and database as the one the add_data_node is executed on, it will deadlock since a transaction is opened on the datanode which will block updates.
This is in reference to Add self-reference check to add_data_node #2144
Although the issue was identified and a fix was committed to the main branch two years ago, we still ran into the same problem after accidentally entering the access node hostname/address when running add_data_node().
We had to forcefully restart the access node's Docker container to get it working again.
TimescaleDB version affected
2.7.0
PostgreSQL version used
14.3
What operating system did you use?
Debian GNU/Linux 11 (bullseye), running Docker with image timescale/timescaledb:latest-pg14
What installation method did you use?
Docker
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?
Connect to any database with psql as the postgres user.
Try to add a data node by using the access node hostname and port:
SELECT add_data_node('any-name','localhost');
EDIT: localhost actually produces the expected error. The issue occurs when using an FQDN, as in this example:
postgres=# SELECT add_data_node('selfreferencetest','localhost');
NOTICE: database "postgres" already exists on data node, skipping
NOTICE: extension "timescaledb" already exists on data node, skipping
DETAIL: TimescaleDB extension version on localhost:5432 was 2.7.0.
ERROR: cannot add "selfreferencetest" as a data node
DETAIL: ERROR: node is already an access node
postgres=# SELECT add_data_node('selfreferencetest','<FQDN DNS hostname here>');
^CCancel request sent
^CCancel request sent
^CCancel request sent
Hi @Sidicer, thank you for reaching out. Attempting to reproduce this issue gave the expected errors and not a deadlock. Could you please provide the specific steps that you followed when you ran into this? Thanks!
@konskov Thank you for the quick reply. I mentioned "localhost" in the issue, which does not reproduce the problem. I will update the original post with this additional information.
When using an FQDN, it fails and deadlocks.
postgres=# SELECT add_data_node('selfreferencetest','localhost');
NOTICE: database "postgres" already exists on data node, skipping
NOTICE: extension "timescaledb" already exists on data node, skipping
DETAIL: TimescaleDB extension version on localhost:5432 was 2.7.0.
ERROR: cannot add "selfreferencetest" as a data node
DETAIL: ERROR: node is already an access node
postgres=# SELECT add_data_node('selfreferencetest','<FQDN DNS hostname here>');
^CCancel request sent
^CCancel request sent
^CCancel request sent
@Sidicer note that no NOTICE messages have been emitted in the FQDN case. That suggests that there are connectivity issues when using the FQDN. Can you please ping the FQDN from the access node and see if it works?
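The connectivity check suggested above can also be scripted from the access node. The following is a minimal Python sketch and not part of TimescaleDB: the helper names resolve_host and can_connect are hypothetical, and port 5432 is assumed from the log output earlier in this thread. It only shows which addresses a hostname resolves to and whether a TCP connection to the port succeeds. It does not detect a self-reference.

```python
import socket
from typing import List


def resolve_host(host: str, port: int = 5432) -> List[str]:
    """Return the sorted, de-duplicated IP addresses that `host` resolves to.

    Raises socket.gaierror if the name cannot be resolved.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python check_node.py <FQDN> [port]
    host = sys.argv[1]
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 5432
    print("resolves to:", resolve_host(host, port))
    print("TCP connect:", "ok" if can_connect(host, port) else "failed")
```

If the name resolves and the TCP connection succeeds, basic connectivity is probably not the cause of the hang. If resolution fails or the connection times out, that would be consistent with the missing NOTICE messages.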
Hello @Sidicer,
I tried to reproduce the problem on my system. Unfortunately, I was not able to reproduce the deadlock so far. I called add_data_node to add localhost, 127.0.0.1, and <FQDN> as data nodes. In all three cases, an error message was returned.
Might it be possible to check the DNS setup and the network connectivity by using ping, as mentioned by @nikkhils?
Test case - using localhost
test2=# SELECT add_data_node('any-name','localhost');
NOTICE: database "test2" already exists on data node, skipping
NOTICE: extension "timescaledb" already exists on data node, skipping
DETAIL: TimescaleDB extension version on localhost:5432 was 2.7.0.
ERROR: [any-name]: cannot add the current database as a data node to itself
DETAIL: Adding the current database as a data node to itself would create a cycle. Use a different instance or database for the data node.
HINT: Check that the 'port' parameter refers to a different instance or that the 'database' parameter refers to a different database.
Test case - using localhost IP
test2=# SELECT add_data_node('any-name','127.0.0.1');
NOTICE: database "test2" already exists on data node, skipping
NOTICE: extension "timescaledb" already exists on data node, skipping
DETAIL: TimescaleDB extension version on 127.0.0.1:5432 was 2.7.0.
ERROR: [any-name]: cannot add the current database as a data node to itself
DETAIL: Adding the current database as a data node to itself would create a cycle. Use a different instance or database for the data node.
HINT: Check that the 'port' parameter refers to a different instance or that the 'database' parameter refers to a different database.
Test case - using FQDN
test2=# SELECT add_data_node('any-name','debian11-work.home.local');
NOTICE: database "test2" already exists on data node, skipping
NOTICE: extension "timescaledb" already exists on data node, skipping
DETAIL: TimescaleDB extension version on debian11-work.home.local:5432 was 2.7.0.
ERROR: [any-name]: cannot add the current database as a data node to itself
DETAIL: Adding the current database as a data node to itself would create a cycle. Use a different instance or database for the data node.
HINT: Check that the 'port' parameter refers to a different instance or that the 'database' parameter refers to a different database.
This issue has been automatically marked as stale due to lack of activity. You can remove the stale label or comment. Otherwise, this issue will be closed in 30 days. Thank you!
Dear Author,
We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it. Thank you!