
Lower resource usage of ClickHouse in the default configuration

Open ruslandoga opened this issue 1 year ago • 11 comments

Collecting some ideas for now:

  • ttl_only_drop_parts = 1 on tables with TTL (do we even have them, logs maybe?)
  • https://github.com/plausible/analytics/discussions/4740#discussioncomment-11074630
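To make the first idea concrete, here is a minimal sketch of how one might find tables that have a TTL and then enable `ttl_only_drop_parts` on one of them. The helper names `ttl_tables_query` and `ttl_only_drop_parts_sql` are made up for illustration, and matching on `'%TTL%'` in `create_table_query` is a crude heuristic, not a definitive check.

```shell
#!/bin/sh
# Sketch only: these helpers just print ClickHouse SQL, they do not contact a server.
# Function names are hypothetical and not part of any Plausible or ClickHouse tooling.

# Print a query listing tables whose definition mentions a TTL clause.
# (Crude heuristic: any table whose CREATE statement contains the text "TTL".)
ttl_tables_query() {
  cat <<'SQL'
SELECT database, name
FROM system.tables
WHERE create_table_query LIKE '%TTL%'
ORDER BY database, name
SQL
}

# Print an ALTER statement enabling ttl_only_drop_parts for one table.
# Usage: ttl_only_drop_parts_sql <database> <table>
ttl_only_drop_parts_sql() {
  printf 'ALTER TABLE %s.%s MODIFY SETTING ttl_only_drop_parts = 1\n' "$1" "$2"
}
```

The output could then be piped into the database container, for example `ttl_tables_query | docker compose exec -T plausible_events_db clickhouse-client`, assuming the compose service is called `plausible_events_db` (inferred from the container names later in this thread).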

ruslandoga • Nov 18 '24 15:11

Do everything recommended in https://clickhouse.com/docs/en/operations/tips#using-less-than-16gb-of-ram?
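For reference, a rough sketch of what those recommendations could look like as config overrides, as I read that page (smaller mark cache, a single query thread, smaller blocks, no parallel parsing or formatting). The file names and the `./clickhouse/...` directories are assumptions, and the values should be checked against the current docs before use.

```shell
#!/bin/sh
# Sketch only: writes two ClickHouse override files for a low-memory host.
# Directory and file names are assumptions; adjust to your own layout.
CONF_DIR="${CONF_DIR:-./clickhouse/config.d}"   # server settings
USERS_DIR="${USERS_DIR:-./clickhouse/users.d}"  # per-profile query settings
mkdir -p "$CONF_DIR" "$USERS_DIR"

# Server-level setting: shrink the mark cache to roughly 500 MB.
cat > "$CONF_DIR/low-resources.xml" <<'XML'
<clickhouse>
    <mark_cache_size>524288000</mark_cache_size>
</clickhouse>
XML

# Query-level settings for the default profile.
cat > "$USERS_DIR/low-resources.xml" <<'XML'
<clickhouse>
    <profiles>
        <default>
            <max_threads>1</max_threads>
            <max_block_size>8192</max_block_size>
            <max_download_threads>1</max_download_threads>
            <input_format_parallel_parsing>0</input_format_parallel_parsing>
            <output_format_parallel_formatting>0</output_format_parallel_formatting>
        </default>
    </profiles>
</clickhouse>
XML
```

With the official ClickHouse image these files would typically be mounted into `/etc/clickhouse-server/config.d/` and `/etc/clickhouse-server/users.d/` respectively.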

salomvary • Nov 20 '24 14:11

It helped to restore the changes that were introduced in https://github.com/plausible/community-edition/pull/13 but later removed.

pcouy • Dec 04 '24 15:12

👋 @pcouy

What if we keep query_log but remove metric_log and asynchronous_metric_log?

query_log is just too useful :)
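Roughly, that idea could look like the sketch below: an override file that removes the two metric log tables and leaves `query_log` untouched. This is an illustration under assumptions (the file name and directory are made up), not the contents of any PR.

```shell
#!/bin/sh
# Sketch only: disable metric_log and asynchronous_metric_log, keep query_log.
# The target directory and file name are assumptions.
LOGS_DIR="${LOGS_DIR:-./clickhouse/config.d}"
mkdir -p "$LOGS_DIR"

cat > "$LOGS_DIR/logs.xml" <<'XML'
<clickhouse>
    <!-- remove="remove" drops the section from the merged server config,
         so these system log tables are no longer written. -->
    <metric_log remove="remove"/>
    <asynchronous_metric_log remove="remove"/>
    <!-- query_log is intentionally left alone so slow queries can still be debugged. -->
</clickhouse>
XML
```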

ruslandoga • Dec 04 '24 16:12

Something like this: https://github.com/plausible/community-edition/pull/196 (untested)

ruslandoga • Dec 04 '24 16:12

What do we use query_log for? Is it used by Plausible itself, or is it just a nice thing to have as a server admin?

I'm not available to test #196 right now.

pcouy • Dec 04 '24 16:12

It's not used in the app, but without it we can't help self-hosters debug slow queries, unfinished exports, etc.

ruslandoga • Dec 04 '24 16:12

If it is only used for troubleshooting, how about making it togglable with an environment variable or a single-line compose override?
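As a sketch of the compose override variant: one extra config file that removes `query_log`, plus a `compose.override.yml` that mounts it. The service name `plausible_events_db` is inferred from the container names in this thread, the file names are invented for the example, and how this interacts with any existing `query_log` section in the shipped config is untested.

```shell
#!/bin/sh
# Sketch only: opt-out toggle for query_log via a compose override.
# OUT_DIR, file names, and the service name are assumptions.
OUT_DIR="${OUT_DIR:-.}"
mkdir -p "$OUT_DIR/clickhouse"

# Config fragment that drops the query_log section from the merged config.
cat > "$OUT_DIR/clickhouse/no-query-log.xml" <<'XML'
<clickhouse>
    <query_log remove="remove"/>
</clickhouse>
XML

# Override file; Docker Compose merges compose.override.yml automatically
# when it sits next to the main compose file.
cat > "$OUT_DIR/compose.override.yml" <<'YAML'
services:
  plausible_events_db:
    volumes:
      - ./clickhouse/no-query-log.xml:/etc/clickhouse-server/config.d/no-query-log.xml:ro
YAML
```

Deleting the override (or just the volume line) and recreating the container would turn the log back on.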

pcouy • Dec 04 '24 16:12

Before deciding this, let's first test the performance impact of having it enabled by default :)

ruslandoga • Dec 04 '24 16:12

I'm currently testing #196 and it had the immediate effect of nearly doubling ClickHouse's RAM use (from 225MB to 450MB). CPU time when idling, on the other hand, seems to be at the same level as with #195.

Don't forget to upgrade ClickHouse to the latest release; it had a larger impact on reducing CPU use than disabling the logs did.

I'm still wondering why the plausible container itself went from idling at ~0.75% to ~1.25%. Is there any chance it's related to ClickHouse also increasing its CPU use at rest?

pcouy • Dec 05 '24 09:12

👋 @pcouy

Thank you for trying it out!

I think I'll merge #196 first and then consult the core team about upgrading the ClickHouse image. I think we can up it to 24.9.3.128 which the cloud version is also planning to switch to (https://github.com/plausible/analytics/pull/4861).

> I'm still wondering why the plausible container itself went from idling at ~0.75% to ~1.25%. Is there any chance it's related to clickhouse also increasing its CPU use at rest?

Might be related, yes.

ruslandoga • Dec 05 '24 11:12

After #196 and #197 my instance's resource usage went from

$ docker stats --no-stream
CONTAINER ID   NAME                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
7bb7e57304a2   plausible-ce-plausible-1             3.28%     336.5MiB / 3.731GiB   8.81%     3.96GB / 3.56GB   92.8MB / 111kB    26
ed6a1af6a246   plausible-ce-plausible_events_db-1   17.58%    937.9MiB / 3.731GiB   24.55%    21.3GB / 27.1GB   1.4GB / 910GB     711
86bfa98ba067   plausible-ce-plausible_db-1          0.52%     101.9MiB / 3.731GiB   2.67%     5.18GB / 3.14GB   28.8MB / 4.97GB   17

to

$ docker stats --no-stream
CONTAINER ID   NAME                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
54c1ff2c6a81   plausible-ce-plausible_events_db-1   12.65%    278MiB / 3.731GiB     7.28%     1.42MB / 1.8MB    950kB / 463kB     720
d43427c25a6c   plausible-ce-plausible-1             3.13%     327.5MiB / 3.731GiB   8.57%     2.29MB / 2.11MB   6.8MB / 0B        27
7be10eb46de3   plausible-ce-plausible_db-1          0.33%     60.23MiB / 3.731GiB   1.58%     1.64MB / 1.19MB   18.3MB / 1.73MB   17

I'll check back on it tomorrow.
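To put numbers on the two snapshots above, a small helper like this can compute the relative change; `pct_change` is a hypothetical name, and the inputs are the `plausible_events_db` memory and CPU figures copied from the tables. Keep in mind that each `docker stats --no-stream` call is a single sample, so the CPU figure in particular is noisy.

```shell
#!/bin/sh
# Sketch only: relative change between two readings, printed with one decimal.
# Usage: pct_change <before> <after>
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%+.1f%%\n", (after - before) / before * 100 }'
}

# ClickHouse memory, MiB: 937.9 before, 278 after
pct_change 937.9 278      # prints -70.4%

# ClickHouse CPU, percent: 17.58 before, 12.65 after
pct_change 17.58 12.65    # prints -28.0%
```

For narrower output when comparing runs, `docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'` limits the table to the columns that matter here.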


Update from Dec 6, 2024:

$ docker stats --no-stream
CONTAINER ID   NAME                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O         PIDS
54c1ff2c6a81   plausible-ce-plausible_events_db-1   18.12%    322.4MiB / 3.731GiB   8.44%     427MB / 543MB    21.7MB / 2.82MB   720
d43427c25a6c   plausible-ce-plausible-1             4.22%     328.2MiB / 3.731GiB   8.59%     607MB / 533MB    6.8MB / 0B        27
7be10eb46de3   plausible-ce-plausible_db-1          7.84%     76.66MiB / 3.731GiB   2.01%     105MB / 63.9MB   18.5MB / 97.5MB   17

And after https://github.com/plausible/community-edition/pull/198

$ docker stats --no-stream
CONTAINER ID   NAME                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O         PIDS
87cf7f1ce570   plausible-ce-plausible_events_db-1   6.40%     348.1MiB / 3.731GiB   9.11%     710kB / 913kB    95.4MB / 377kB    725
d43427c25a6c   plausible-ce-plausible-1             3.80%     324.3MiB / 3.731GiB   8.49%     1.18MB / 922kB   35MB / 0B         26
7be10eb46de3   plausible-ce-plausible_db-1          1.47%     58MiB / 3.731GiB      1.52%     106MB / 64.6MB   18.5MB / 98.4MB   17

Update from Dec 9, 2024:

ClickHouse CPU usage has increased:

$ docker stats --no-stream
CONTAINER ID   NAME                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
87cf7f1ce570   plausible-ce-plausible_events_db-1   107.50%   585.7MiB / 3.731GiB   15.33%    1.75GB / 2.25GB   131MB / 641MB     725
d43427c25a6c   plausible-ce-plausible-1             3.93%     328.5MiB / 3.731GiB   8.60%     2.52GB / 2.2GB    35.1MB / 0B       26
7be10eb46de3   plausible-ce-plausible_db-1          0.50%     78.81MiB / 3.731GiB   2.06%     532MB / 323MB     19.5MB / 509MB    17

I'll try reverting https://github.com/plausible/community-edition/pull/198.

ruslandoga • Dec 05 '24 12:12