timescaledb-tune
work_mem value can fall below PostgreSQL's 64kB minimum
The code that calculates work_mem can produce a value that is below the minimum threshold of 64kB. This prevents TimescaleDB/PostgreSQL from starting. I have experienced this issue deploying to a k8s node with 32 cores and 120GB RAM. The output from timescaledb-tune was 41kB, which causes PostgreSQL to fail to start.
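PostgreSQL won't start with work_mem below its 64kB floor, so whatever formula produces the recommendation needs a lower bound. Here is a minimal sketch in Go of that kind of clamp; the names are illustrative and this is not the project's actual code:

```go
package main

import "fmt"

// minWorkMemKB is PostgreSQL's lower bound for work_mem; values below
// it prevent the server from starting.
const minWorkMemKB = 64

// clampWorkMem is a hypothetical helper: however the recommendation is
// computed, the result is raised to the minimum if it falls short.
func clampWorkMem(computedKB int) int {
	if computedKB < minWorkMemKB {
		return minWorkMemKB
	}
	return computedKB
}

func main() {
	fmt.Printf("work_mem = %dkB\n", clampWorkMem(41)) // prints: work_mem = 64kB
}
```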
Yikes, that's no good. Did you set any other flags I should know about?
Will definitely look to push a fix ASAP
I ask because when I try
timescaledb-tune --memory=120GB --cpus=32
I get this for memory settings
Recommendations based on 120.00 GB of available memory and 32 CPUs for PostgreSQL 11
shared_buffers = 30GB
effective_cache_size = 90GB
maintenance_work_mem = 2047MB
work_mem = 19660kB
So I wonder how it ended up with 41kB
EDIT: Fixed memory flag
No flags were added, and it gave me a value of 41kB, so it would not run
Happy to correct the issue with it returning invalid values, but I am a bit worried that it is misreading your settings, since it should not be giving 41kB for the given parameters (120GB RAM, 32 CPUs).
It would be useful if you could run the following commands from inside the container:
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
and
free -m | grep 'Mem' | awk '{print $2}'
Thanks for the bug report
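Those two readings matter because a container-aware tuner has to choose between the cgroup limit and the host's total memory. A rough sketch of that decision, assuming a cgroup v1 layout like the paths above; this is illustrative only, not the logic timescaledb-tune actually ships:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// cgroupLimitBytes reads the cgroup v1 memory limit, the same value as
// `cat /sys/fs/cgroup/memory/memory.limit_in_bytes`.
func cgroupLimitBytes() (uint64, error) {
	b, err := os.ReadFile("/sys/fs/cgroup/memory/memory.limit_in_bytes")
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
}

// hostTotalBytes reads MemTotal from /proc/meminfo, which is what
// `free -m` reports in the "total" column.
func hostTotalBytes() (uint64, error) {
	b, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		return 0, err
	}
	for _, line := range strings.Split(string(b), "\n") {
		if strings.HasPrefix(line, "MemTotal:") {
			fields := strings.Fields(line)
			if len(fields) < 2 {
				break
			}
			kb, err := strconv.ParseUint(fields[1], 10, 64)
			return kb * 1024, err
		}
	}
	return 0, fmt.Errorf("MemTotal not found")
}

func main() {
	limit, _ := cgroupLimitBytes()
	total, _ := hostTotalBytes()

	// An unconstrained cgroup reports an enormous limit, so taking the
	// smaller of the two values falls back to the host total in that case.
	effective := total
	if limit > 0 && limit < total {
		effective = limit
	}
	fmt.Printf("cgroup limit: %d, host total: %d, tuning against: %d bytes\n",
		limit, total, effective)
}
```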
Here are the results of the commands you sent. Although this machine has a lot of resources, GKE probably slices it differently, which produces those results.
bash-4.4$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
268435456
bash-4.4$ free -m | grep 'Mem' | awk '{print $2}'
120873
Closed #38 https://github.com/timescale/timescaledb-tune/issues/38 via #39 https://github.com/timescale/timescaledb-tune/pull/39.
This got closed by the merge but there seems to be another problem at play here.
Specifically, 268435456 is only ~268MB, so your settings are based off that rather than the 120GB actually available on the machine. Do you have any insight as to why the machine would be giving that as the limit?
bump @roger-rainey
The memory number was coming from the Kubernetes request memory, which is not the memory limit.
@roger-rainey That's intriguing. The cgroups memory.limit_in_bytes should refer to the maximum memory allocated to the container, not the minimum requested. I could see it being possible that the two would match up if the node the container is scheduled on only has request_memory available, but with such a specific number that seems kind of weird.
Would you mind posting your k8s configuration for this pod, and any other information you have about resource utilization on the node your pod is on?
If it seems like there's still more than request_memory available, I might have to do a deeper dive into how the cgroups memory limits get set when request settings are specified in k8s.
One other useful bit of info might be the output of
cat /sys/fs/cgroup/memory/memory.stat
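For anyone following along, my reading of why memory.stat is interesting here (an assumption, not confirmed in this thread): on cgroup v1 it includes a hierarchical_memory_limit line, which can reveal a cap inherited from a parent cgroup even when the pod's own limit looks untouched. An illustrative way to pull out just that line, not part of the tool itself:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	// memory.stat (cgroup v1) contains hierarchical_memory_limit, which
	// reflects limits inherited from parent cgroups.
	f, err := os.Open("/sys/fs/cgroup/memory/memory.stat")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "hierarchical_memory_limit") {
			// e.g. "hierarchical_memory_limit 268435456"
			fmt.Println(sc.Text())
		}
	}
}
```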
@roger-rainey Did you have any follow-up on this re: @LeeHampton's comment?