
[Bug]: A memory leak with UPDATE of older rows

Open akamensky opened this issue 2 years ago • 19 comments

What type of bug is this?

Other

What subsystems and features are affected?

Other

What happened?

I am running a process that updates old rows in the background (its performance is not very important). The memory of the postgres connection process (not any of the background workers) keeps growing over time, leading to a (presumed) OOM. I say presumed because I cannot find any entry in any log explaining why the container was stopped, but looking at the graph below I believe it is related:

Screen Shot 2021-11-29 at 1 06 02 PM

TimescaleDB version affected

2.5.0

PostgreSQL version used

14

What operating system did you use?

Fedora 35

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

None

How can we reproduce the bug?

Not sure, but my query to update is:

update outputs set col_3 = '%s' where (col_1, col_2) in ((..., ...), (..., ...));

col_1 and col_3 are text and col_2 is integer.

Multiple queries are bundled into a single transaction, and the transactions are executed in a rolling fashion (the next one starts only after the previous is done). Individual queries can be massive, however, on the order of 100s if not 1000s of tuples in the IN (...) clause.
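
For illustration only, a single batch in this rolling pattern might look like the sketch below (the tuple values and label strings are made up; real batches contain hundreds to thousands of tuples):

begin;
update outputs set col_3 = 'label-a' where (col_1, col_2) in (('dev-01', 17), ('dev-02', 42));
update outputs set col_3 = 'label-b' where (col_1, col_2) in (('dev-03', 7), ('dev-04', 99));
commit;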

akamensky avatar Nov 29 '21 05:11 akamensky

@akamensky Is the memory released after a transaction finishes?

erimatnor avatar Nov 29 '21 10:11 erimatnor

@erimatnor I guess my description was not clear enough. No, it does not appear to release the memory, at least not all of it; it keeps building up over time. It completed about 200k transactions before it crashed.

akamensky avatar Nov 29 '21 11:11 akamensky

@akamensky AFAIK, we're not doing anything special with UPDATEs, so it should be standard PostgreSQL functionality. But it requires further investigation. Is this something you've started to experience recently, for instance, after upgrading to PostgreSQL 14?

erimatnor avatar Nov 30 '21 08:11 erimatnor

Thanks for the follow-up @erimatnor. This is something I am running myself in my home lab (separate from the other tickets I raised), so it was postgres 14 and timescale 2.5.0 from the start (all in Docker).

I've been running this system for a while without issue just doing inserts. However, I realized I needed to populate another column based on processing of data already in the table. So I ran alter table tbl_name add column col_name text null and started a process that updates the data, which began causing these crashes (5-6 of them so far). It is not terribly bad in my case: since everything runs in transactions, the data should stay consistent, and the updating app will detect the crash and restart updates from the last committed change.

I've noticed a few points about database behavior when this issue is happening:

  • the memory usage shows up in the RSS of the postgres child process spawned for the connection (not in any of the background workers)
  • if I stop or kill the updating process (the client that connects to the DB and runs the update transactions), the postgres child process hangs around for a while (I observed ~10 minutes) and still consumes a ton of memory
  • if, while the updating process is running, I start a select query and then try to cancel it, the cancel hangs indefinitely (I waited a couple of hours and nothing happened). If I stop the updates it is still the same; I actually have to restart postgres to be able to run large selects (a way to inspect and cancel such a backend is sketched after this list)
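
A possible way to inspect and cancel such a stuck backend, using only standard PostgreSQL functions (<pid> is a placeholder for the stuck backend's process id):

-- list active backends and what they are running
select pid, state, wait_event_type, wait_event, query from pg_stat_activity where state <> 'idle';
-- ask the backend to cancel its current statement; if that has no effect,
-- pg_terminate_backend(<pid>) ends the backend process entirely
select pg_cancel_backend(<pid>);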

It could also be something about the changes in PostgreSQL 14, but I did not run the same logic on other versions, so I cannot really compare.

PS: If it matters, I am using the 14 Docker tag, which is Alpine-based (and hence musl rather than glibc); not sure if that is related. I could try switching to Debian-based images and see whether the behavior is the same there.

akamensky avatar Nov 30 '21 08:11 akamensky

I am able to reproduce the issue with the following set of commands,

Create extension timescaledb;

Create table test_mem_leak(col_1 text, col_2 int);

Select create_hypertable('test_mem_leak', 'col_2', chunk_time_interval =>10000) ;

Insert into test_mem_leak values (substr(md5(random()::text), 0, 25),generate_series(1, 10000));
Insert into test_mem_leak values (substr(md5(random()::text), 0, 25),generate_series(10000, 899999));
Insert into test_mem_leak values (substr(md5(random()::text), 0, 25),generate_series(899999, 8999999));

Alter table test_mem_leak add column col_3 text;

update test_mem_leak  set col_3 = 'substr(md5(random()::text), 0, 25)' where (col_2) in ((1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (317), (138), (139), (140),(21), (32), (53), (74), (15), (56), (127), (228), (329), (430), (931), (132), (133), (341), (351), (316), (317), (138), (139), (140), (33), (34), (35), (36), (37), (38), (39), (40),(21), (32), (53), (74), (15), (56), (127), (228), (329), (430), (931), (132), (133), (341), (351), (316), (317), (138), (139),  (14), (15), (16), (17), (18), (19), (20), (317), (138), (139), (140),(21), (32), (53), (74), (15), (56), (127), (2258), (3529), (4530), (9321), (1132), (1133), (1341), (3151), (3116), (1317), (1138), (1139), (1140), (133), (134), (135), (316), (3517), (5318), (5319), (4510),(5211), (3512), (51435),(175), (185), (195), (205), (3175), (1385), (1395), (1450),(215), (352), (53), (74), (15), (556), (1257), (2258), (3259), (4530), (9531), (5132), (5133), (5341), (318), (319), (320), (3317), (3138), (1339), (1340),(231), (332), (353), (374), (135), (356), (3127), (2328), (3329), (4330), (3931), (1332), (4133), (4341), (4351), (4316), (3174), (1384), (1394), (1404), (334), (344), (345), (346), (347), (348), (439), (404),(261), (632), (653), (764), (165), (656), (1627), (2628), (3269), (4360), (6931), (1632), (1633), (3641), (3651), (3616), (3617), (1638), (1639),  (614), (615), (166), (167), (168), (196), (206), (3176), (1368), (1369), (1460),(216), (362), (573), (774), (175), (756), (1727), (22758), (35729), (45730), (79321), (17132), (11373), (13471), (31751), (37116), (71317), (11378), (11379), (17140), (1733), (1734), (1375), (3176), (37517), (53718), (53719), (45107),(52171), (37512), (751435),(175), (7185), (1795), (2705), (73175), (13875), (13795), (14570),(2715), (3572), (753), (774), (715), (5756), (17257), (22758), (73259), (74530), (95731), (57132), (51733), (57341), (15), (16), (17), (18), (19), (20), (317), (138), (139), (140),(21), (32), (53), (74), (15), (56), (127), (2258), (3529), (4530), (9321), (1132), (1133), (1341), (3151), (3116), (1317), (1138), (1139), (1140), (133), (134), (135), (316), (3517), (5318), (5319), (4510),(5211), (3512), (51435),(175), (185), (195), (205), (3175), (1385), (1395), (1450),(215), (352), (53), (74), (15), (556), (1257), (2258), (3259), (4530), (9531), (5132), (5133), (5341), (318), (319), (320), (3317), (3138), (1339), (1340),(231), (332), (353), (374), (135), (356), (3127), (2328), (3329), (4330), (3931), (1332), (4133), (4341), (4351), (4316), (3174), (1384), (1394), (1404), (334), (344), (345), (346), (347), (348), (439), (404),(261), (632), (653), (764), (165), (656), (1627), (2628), (3269), (4360), (6931), (1632), (1633), (3641), (3651), (3616), (3617), (1638), (1639),  (614), (615), (166), (167), (168), (196), (206), (3176), (1368), (1369), (1460),(216), (362), (573), (774), (175), (756), (1727), (22758), (35729), (45730), (79321), (17132), (11373), (13471), (31751), (37116), (71317), (11378), (11379), (17140), (1733), (1734), (1375), (3176), (37517), (53718), (53719), (45107),(52171), (37512), (751435),(175), (7185), (1795), (2705), (73175), (13875), (13795), (14570),(2715), (3572), (753), (774), (715), (5756), (17257), (22758), (73259), (74530), (95731), (57132), (51733), (57341));

Just looking at the activity monitor, it shows quite a memory swell, and the process never releases the memory. However, I tried the same with a non-distributed table and that was not the case there.
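
For reference, one way to correlate the backend with its OS-level memory growth (a sketch; it assumes shell access to the database host):

-- in the session running the updates, note the backend PID:
select pg_backend_pid();
-- then, from a shell on the host, sample that process's resident set size periodically, e.g.:
--   watch -n 5 'grep VmRSS /proc/<pid>/status'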

My next steps are,

  1. Check the behaviour with a partitioned table in plain postgres to find out if the issue lies in postgres' handling of partitioned tables (a sketch of such a setup follows this list).
  2. Increase the chunk size and reduce the total number of chunks to find out if the number of chunks is the issue.
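
For step 1, a comparable plain-PostgreSQL setup might look like the sketch below (declarative range partitioning on col_2, with bounds chosen to roughly mirror the chunk_time_interval used above):

Create table test_mem_leak_pg (col_1 text, col_2 int, col_3 text) partition by range (col_2);
Create table test_mem_leak_pg_p0 partition of test_mem_leak_pg for values from (0) to (10000);
Create table test_mem_leak_pg_p1 partition of test_mem_leak_pg for values from (10000) to (20000);
-- ... and so on, one partition per 10000 values of col_2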

RafiaSabih avatar Dec 10 '21 12:12 RafiaSabih

Update:

  1. When I increased the chunk size so that there were only a few big chunks, memory consumption was NOT significant.
  2. When I created normal partitioned tables with PostgreSQL declarative partitioning, memory consumption was significant in that case as well.

From my understanding, it looks like the issue is with partitioned tables in PostgreSQL when the number of partitions is large.

RafiaSabih avatar Dec 15 '21 15:12 RafiaSabih

@RafiaSabih

when the number of partitions is large

In my case the number of partitions (chunks) is around 20, but each chunk is quite large (it represents 6 months' worth of data), with the largest chunk about 12 GB (total bytes including indexes).

This partitioning was chosen so that the indexes fit nicely into memory and querying stays fast.
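
For reference, the chunk count and per-chunk sizes described here can be checked with the TimescaleDB 2.x size functions (assuming the hypertable is named outputs, as in the query above):

select count(*) from show_chunks('outputs');
select * from chunks_detailed_size('outputs') order by total_bytes desc;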

akamensky avatar Dec 15 '21 15:12 akamensky

@akamensky are you sure that you are facing OOM due to this database behavior on your setup? This looks like standard OS behavior in terms of caching.

nikkhils avatar Dec 16 '21 11:12 nikkhils

@akamensky will it be possible to share your schema/data and the update queries to help us reproduce this locally?

nikkhils avatar Dec 16 '21 12:12 nikkhils

are you sure that you are facing OOM due to this database behavior on your setup? This looks like standard OS behavior in terms of caching.

As a Linux admin of almost 15 years, I can assure you it is not standard OS caching behavior. That's an OOM; the cause is the growth of the RSS of a single specific process under the conditions described above :)

will it be possible to share your schema/data and the update queries to help us reproduce this locally?

Sure, I can share the schema and hypertable configuration, though I am not sure I can share 2 TB of the dataset. I can see whether I can provide a script that generates data reproducing this behavior; that part should be easy.

akamensky avatar Dec 16 '21 14:12 akamensky

I tried reproducing this bug with a smaller number of chunks, but unfortunately I found no OOM or other issues. My table schema and chunk sizes are as follows,

     chunk_schema      |      chunk_name      | table_bytes | index_bytes | toast_bytes | total_bytes | node_name 
-----------------------+----------------------+-------------+-------------+-------------+-------------+-----------
 _timescaledb_internal | _hyper_4_92945_chunk |      245760 |       65536 |   778543104 |   778854400 | 
 _timescaledb_internal | _hyper_4_92946_chunk |      909312 |      155648 |  1868439552 |  1869504512 | 
 _timescaledb_internal | _hyper_4_92947_chunk |     1671168 |      237568 |  1868439552 |  1870348288 | 
 _timescaledb_internal | _hyper_4_92948_chunk |     2187264 |      335872 |  1863254016 |  1865777152 | 
 _timescaledb_internal | _hyper_4_92949_chunk |     2539520 |      417792 |  1868439552 |  1871396864 | 
 _timescaledb_internal | _hyper_4_92950_chunk |     2727936 |      417792 |  1868439552 |  1871585280 | 
 _timescaledb_internal | _hyper_4_92951_chunk |     2916352 |      385024 |  1863254016 |  1866555392 | 
 _timescaledb_internal | _hyper_4_92952_chunk |     2859008 |      335872 |  1868439552 |  1871634432 | 
 _timescaledb_internal | _hyper_4_92953_chunk |     2465792 |      303104 |  1868439552 |  1871208448 | 
 _timescaledb_internal | _hyper_4_92954_chunk |     1810432 |      253952 |  1868439552 |  1870503936 | 
 _timescaledb_internal | _hyper_4_92955_chunk |     1024000 |      172032 |  1868439552 |  1869635584 | 
 _timescaledb_internal | _hyper_4_92956_chunk |      466944 |       90112 |  1868439552 |  1868996608 | 
 _timescaledb_internal | _hyper_4_92957_chunk |       40960 |       16384 |  1411727360 |  1411784704 | 
(13 rows)
 Table "public.test_mem_leak"
 Column | Type | Collation | Nullable | Default | Storage  | Compression | Stats target | Description 
--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
 col1   | text |           |          |         | extended |             |              | 
 col2   | date |           | not null |         | plain    |             |              | 
 col3   | text |           |          |         | extended |             |              | 
Indexes:
    "test_mem_leak_col2_idx" btree (col2 DESC)
Triggers:
    ts_insert_blocker BEFORE INSERT ON test_mem_leak FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Child tables: _timescaledb_internal._hyper_4_92945_chunk,
              _timescaledb_internal._hyper_4_92946_chunk,
              _timescaledb_internal._hyper_4_92947_chunk,
              _timescaledb_internal._hyper_4_92948_chunk,
              _timescaledb_internal._hyper_4_92949_chunk,
              _timescaledb_internal._hyper_4_92950_chunk,
              _timescaledb_internal._hyper_4_92951_chunk,
              _timescaledb_internal._hyper_4_92952_chunk,
              _timescaledb_internal._hyper_4_92953_chunk,
              _timescaledb_internal._hyper_4_92954_chunk,
              _timescaledb_internal._hyper_4_92955_chunk,
              _timescaledb_internal._hyper_4_92956_chunk,
              _timescaledb_internal._hyper_4_92957_chunk
Access method: heap

I am attaching a script which I used to generate multiple queries and run them in different transactions.

mem_leak.txt

I think it would be good if you could provide more details on how to reproduce this issue; then we can investigate it further.

RafiaSabih avatar Dec 20 '21 10:12 RafiaSabih

I got rid of the UPDATE statements, so it is only running INSERT and SELECT now. RSS usage still increases over time, but at a much slower rate. It appears that only connections doing SELECT have their RSS grow like that; connections doing INSERT still have minimal RSS:

Screen Shot 2022-01-16 at 11 23 00 AM Screen Shot 2022-01-16 at 11 23 13 AM

At the same time, a host running standalone PostgreSQL 14 (no timescale extension) and executing UPDATEs at a much higher frequency shows no RSS increase whatsoever:

Screen Shot 2022-01-16 at 11 30 02 AM

I have not gotten around to writing code to reproduce this issue yet, but I should be able to this weekend.

akamensky avatar Jan 16 '22 03:01 akamensky

@RafiaSabih I went over the script -- it won't see any memory increase because it opens a new connection each time. Try using a persistent connection (like most real-life software does) for several days, doing hundreds or thousands of updates per second. As per the screenshot I posted above, the postgres process whose RSS grows is the child process forked for a new client connection.
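
To illustrate what "persistent connection" means here: one way to drive many separate update transactions through a single long-lived backend is to generate and run them all from one psql session, e.g. (a sketch against the test_mem_leak table from the earlier repro; the IN values are arbitrary):

-- every generated UPDATE commits as its own transaction (autocommit),
-- but all of them run on the same backend process
select format(
  'update test_mem_leak set col_3 = md5(random()::text) where col_2 in (%s, %s, %s)',
  (g * 7) % 8999999 + 1, (g * 13) % 8999999 + 1, (g * 31) % 8999999 + 1)
from generate_series(1, 100000) as g
\gexec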

To recap my observations (with current setup):

The software I am running does a few hundred updates per second to 1 non-hypertable, a large batched insert (about 10 MB worth of data at a time) to 2 hypertables every few minutes, and thousands of selects per second from one of those hypertables (so 3 tables total: 1 normal + 2 hypertables), each over long-running persistent connections.

Initially the RSS of the client connections updating the non-hypertable was growing very rapidly and causing an OOM within hours.

I moved this non-hypertable to another postgres instance (without the TS extension), which, as can be seen in the last screenshot of my previous comment, does not have growing RSS. However, I still see the client processes that do SELECT (and only SELECT, nothing else) keep their RSS growing over time, though more slowly than with UPDATE.

Client connections doing INSERT do not seem to have any RSS increase at all.

Another observation: if I kill the application process that holds one of those leaking connections, the postgres backend process may not exit; it shows its state as "idle" and still occupies the memory. Gracefully closing the connection, on the other hand, seems to always release the postgres process correctly (and thus all used memory), but I need more data before I can say that is always the case.

akamensky avatar Jan 16 '22 05:01 akamensky

@akamensky I tried to reproduce this at my end but without much success. I used pg_log_backend_memory_contexts($remote_data_node_connection_pid) before and after a large batched UPDATE transaction. The memory context grows during the transaction but resets back to the original value after the transaction completes. So PG memory context does not seem to be an issue here.
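
For anyone who wants to repeat this check, the call looks like this (PostgreSQL 14+; the output goes to the server log, and <backend_pid> is a placeholder for the PID of the connection under test):

-- dump the backend's memory contexts to the server log, before and after the transaction
select pg_log_backend_memory_contexts(<backend_pid>);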

The RSS memory of the backend doing the remote UPDATE does seem to go up as time progresses. But that seems to happen with regular PG tables as well (I made sure that DB does not have the TS extension installed).

It will be great if you can provide a reproducible set of steps from your end.

nikkhils avatar Mar 19 '22 14:03 nikkhils

I've encountered this issue with insert calls, and I've created a python example that demonstrates it leaking around 1MB/sec on my database box.

This script will set up a demo hypertable and then start adding rows to it 1000 at a time. The rate of leakage can be adjusted by having larger or smaller rows; it doesn't look like memory is freed after each executemany completes.

The hypertable is set to chunk every 4 hours, and it will attempt to compress more or less when a chunk finishes. Retention is at 40 days worth, and the data is highly compressible, so on a reasonably sized drive this should be able to run indefinitely in the absence of memory leaks.

What I see instead is that the postgres backend process handling these inserts eventually runs the system out of memory and dies, forcing the database into a brief recovery mode.

The database is the docker image timescale/timescaledb:latest-pg14

import time
import turbodbc
from database import connect

# conn = turbodbc.connect(turbodbc_options=turbodbc.make_options(autocommit=True),
#                             driver="PostgreSQL Unicode",
#                             server="somewhere",
#                             port="5432",
#                             uid="user",
#                             pwd="password",
#                             UseDeclareFetch=1,
#                             UseServerSidePrepare=1,
#                             Fetch=10000)
conn = connect.connect()  # connect() comes from a local helper module ("database"), not shown here

c = conn.cursor()

print('setting up demo schema')
c.execute("""create schema if not exists demo""")

print('setting up demo clock')
c.execute("""
create or replace function demo.unix_epoch_nanos() returns bigint language sql stable as $$ 
select (extract(epoch from now())::numeric * 1000 * 1000 * 1000)::bigint 
$$;
""")

print('setting up demo table')
c.execute("""create table if not exists demo.memory_leak (
  timestamp bigint not null,
  what bigint not null,
  location bigint not null,
  data varchar not null
);
""")

print('setting up demo hypertable')
c.execute("""
select create_hypertable('demo.memory_leak', 'timestamp', chunk_time_interval => 4*3600000000000, if_not_exists=>true);
select set_chunk_time_interval('demo.memory_leak', 4*3600000000000);
select set_integer_now_func('demo.memory_leak', 'demo.unix_epoch_nanos', replace_if_exists => true);

alter table demo.memory_leak set (timescaledb.compress, timescaledb.compress_orderby = 'timestamp', timescaledb.compress_segmentby = 'what, location');
select remove_compression_policy('demo.memory_leak', if_exists=>true);
select add_compression_policy('demo.memory_leak', 1*3600000000000, if_not_exists => true);

select remove_retention_policy('demo.memory_leak', if_exists => true);
select add_retention_policy('demo.memory_leak', 86400000000000 * 40, if_not_exists => true);
""")

payloads = [
    "NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19\
  NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19NWmiOs7oNb5ZCX19",
    "sY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUl\
  sY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUlsY4Q62uqmsNhpFUl",
    "k2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfAC\
  k2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfACk2QFTDcgt96PvfAC",
    "mnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypd\
  mnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypdmnvKmLFUxJR26ypd"
]

rows = []

for x in range(1000000000000000):
    timestamp = time.time_ns()
    what = x % len(payloads)
    location = x % 10
    data = payloads[what]

    rows.append([timestamp, what, location, data])

    if len(rows) >= 1000:
        print("pushing ", len(rows))
        c.executemany("insert into demo.memory_leak values (?, ?, ?, ?)", rows)
        rows = []

bgemmill avatar Jul 19 '22 15:07 bgemmill

There is a memory leak in our function telemetry. As a temporary workaround you can disable function telemetry like so:

ALTER DATABASE <db> SET timescaledb.telemetry_level to no_functions;
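
To confirm the change and revert it later, something like the following should work (note that ALTER DATABASE ... SET only takes effect for new connections to that database):

-- in a new session connected to <db>:
SHOW timescaledb.telemetry_level;
-- to revert the workaround later:
ALTER DATABASE <db> RESET timescaledb.telemetry_level;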

svenklemm avatar Jul 20 '22 11:07 svenklemm

The issue reported by @bgemmill is different from the one initially reported here.

svenklemm avatar Jul 29 '22 07:07 svenklemm

Hello @bgemmill,

Thank you very much for the excellent script to reproduce the issue. I have run this script against the latest version of TimescaleDB and PostgreSQL 14. Using this TimescaleDB version, I don't see any memory leak. The memory consumption of PostgreSQL is stable in my environment. I monitored the memory consumption of the process for around 50 minutes and I didn't see any changes. In addition, I was monitoring the memory contexts using pg_log_backend_memory_contexts(PID) and they also show a stable memory consumption.

# 2 minutes after start
1180458 jan       20   0  207584 158588 154660 D  33.6   2.8   0:36.39 postgres

# After around 15 minutes
1180458 jan       20   0  207584 158588 154660 D  32.9   2.8   8:51.62 postgres

# After around 50 minutes
1180458 jan       20   0  207584 158588 154660 D  20.1   2.8  15:11.55 postgres                             

Can you reproduce the memory leak with the latest version of TimescaleDB?

jnidzwetzki avatar Sep 13 '22 08:09 jnidzwetzki

Hi @jnidzwetzki, thanks for having a look at that.

I'm still using a workaround that I put in place before the fix was released, and haven't gone back yet to see if the fix works for me. I'll comment here again when I get to that.

bgemmill avatar Sep 19 '22 14:09 bgemmill

Dear Author,

This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days.

Thank you!

github-actions[bot] avatar Sep 19 '23 01:09 github-actions[bot]

Dear Author,

We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it.

Thank you!

github-actions[bot] avatar Oct 19 '23 01:10 github-actions[bot]