supabase-grafana

CPU usage seems off

Open savikko opened this issue 1 year ago • 19 comments

Hi!

After starting the container, it connects successfully to the Supabase instance, but the CPU usage it reports does not seem to be accurate:

Image

It appears to spike to 100%, but that does not really happen when compared to the load: Image

Also, "basic cpu" sometimes goes negative: Image

savikko avatar Sep 25 '24 07:09 savikko

It's not at 100%, it's idle. The problem is that in the previous version the values were constant and no points were missing; as you can see, there are now some points at 0.

riderx avatar Nov 05 '24 15:11 riderx

@riderx I have the same problem ("basic cpu" sometimes goes negative). Can you tell me how to resolve it? Thank you so much.

begank avatar Jan 14 '25 12:01 begank

I don't have a way to resolve it yet, but maybe I can ask on X.

riderx avatar Jan 14 '25 12:01 riderx

Hi folks - just want to confirm some data points.

Is this running on Fly.io, or a dockerized setup? We have a theory that network latency might play a role in the discrepancies displayed. Does reducing the scrape interval in the scrape config from 60s to 45s alleviate this?

pcnc avatar Feb 11 '25 12:02 pcnc
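To make the latency theory above concrete, here is a minimal sketch (not taken from the dashboard or from the thread) of how a CPU percentage derived from an idle-seconds counter could dip below zero. It assumes that the scraper stamps each sample with the time the scrape started, while the host reads the counter only when the request arrives. All names and numbers are hypothetical.

```python
def busy_percent(idle_prev, idle_curr, elapsed_s, cores=1):
    """CPU busy % derived from an idle-seconds counter over elapsed_s seconds."""
    idle_fraction = (idle_curr - idle_prev) / (elapsed_s * cores)
    return 100.0 * (1.0 - idle_fraction)


# Hypothetical host: 1 core, 99% idle, so the true busy value is 1%.
IDLE_RATE = 0.99


def idle_counter(t):
    """Value of the idle-seconds counter at wall-clock time t."""
    return IDLE_RATE * t


# Assumed: samples are stamped with the scrape start time ...
stamps = [0.0, 60.0, 120.0]
# ... but the counter is read after a variable network delay (0.2 s, 3.0 s, 0.2 s).
read_times = [0.2, 63.0, 120.2]

values = [idle_counter(t) for t in read_times]

# What a rate over the stamped timestamps would report for each interval.
apparent = [
    busy_percent(values[i - 1], values[i], stamps[i] - stamps[i - 1])
    for i in (1, 2)
]

# What the same samples give when divided by the real time between reads.
actual = busy_percent(values[0], values[1], read_times[1] - read_times[0])

print(apparent, actual)
```

In this sketch the slow second scrape pushes the first interval's idle time above 60 s, so the apparent busy value turns negative, and the following interval overshoots in the other direction. Whether this is what happens in the real setup is exactly the open question in this thread.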

@pcnc I changed it and ran fly deploy, and it's no better. Updated at 13:04.

Image

Honestly, I'm not sure the config was updated properly; I just don't want to lose all past data by deleting everything. Image

riderx avatar Feb 12 '25 13:02 riderx

Image After a while I realized it did improve a bit; the chart now almost never stays at 0.

riderx avatar Feb 12 '25 13:02 riderx

I see, thank you! We've managed to reproduce the behaviour and are actively looking into this

pcnc avatar Feb 12 '25 14:02 pcnc

Note that this does not happen when self-hosting or running via Docker. I self-host Grafana and Prometheus. Using the default dashboard, the metrics look good when querying with a 60s refresh:

Image

encima avatar Feb 13 '25 12:02 encima

Also, it was not happening on Fly.io last year; it started after one update.

riderx avatar Feb 13 '25 12:02 riderx

This could also be scraping options configured to scrape more often. Docs on the Supabase site have been updated, as well as the README in this repo. Let us know if it is still causing issues!

encima avatar Feb 27 '25 10:02 encima

This still seems to be an issue for me with the 5-minute and 30-minute timers. Before, it was showing 100% usage, and when I came back to it 30 minutes later, it showed -77.2. There's no consistency here, and it looks very inaccurate when I compare it to the actual Supabase report.

Image Image Image

I am hosting this on my local network and scraping the metrics with Prometheus and the dashboard is on Grafana.

shreyashguptas avatar Mar 14 '25 17:03 shreyashguptas

Hi @shreyashguptas

This should be fixed with #45 and #43 as the latest versions have been deployed and the dashboard has been updated. When they are both on main, please try again and let us know!

encima avatar Apr 04 '25 10:04 encima

I pulled the latest version. Why have all the deployment options been removed from the README?

riderx avatar Apr 08 '25 10:04 riderx

Hey @riderx We have updated the README to give the options for self-hosting this and call out integrations. More detailed instructions are in our Metrics docs

We found some users running this in Fly (or other providers) in Production so we removed those instructions to prevent that behaviour.

This repo is for the distribution of the dashboard, showcasing the metrics endpoint and providing a quick, runnable example but it is not for production use.

If you have ideas on how this could be improved and still made clear, let us know!

encima avatar Apr 09 '25 17:04 encima

I'm not sure I get it. I was using Grafana Cloud, then you recommended Fly.io so I switched to it, and now Fly is not good anymore? I use it for prod, of course... What should I use?

riderx avatar Apr 10 '25 13:04 riderx

Hey, I just checked. I updated the Grafana dashboard with the latest JSON, and the CPU still shows 100% usage, while the Supabase screenshot shows that this is not the case.

Image

Image

shreyashguptas avatar Apr 13 '25 01:04 shreyashguptas

@pcnc Maybe setting scrape_interval to 15s would resolve this problem? Shorter intervals collect more data points and leave fewer gaps in between.

begank avatar May 28 '25 08:05 begank
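The sample-count side of this suggestion can be sketched as follows. Prometheus's rate() needs at least two samples inside its range window to return a value, so a single late scrape can leave a hole when the window holds only about two samples. The 120 s window, the evaluation time, and the 2 s delay below are assumptions for illustration, not values taken from the dashboard.

```python
def scrape_times(interval_s, total_s, late_by=None):
    """Scrape timestamps at a fixed interval.

    late_by maps a scrape index to a delay in seconds (hypothetical jitter).
    """
    late_by = late_by or {}
    count = int(total_s // interval_s) + 1
    return [i * interval_s + late_by.get(i, 0.0) for i in range(count)]


def samples_in_window(times, window_end, window_s):
    """Number of samples with window_end - window_s < t <= window_end."""
    start = window_end - window_s
    return sum(1 for t in times if start < t <= window_end)


WINDOW_S = 120.0   # assumed range window of the rate query
EVAL_AT = 180.5    # assumed query evaluation time

# In both cases the scrape due at t=180 s arrives 2 s late.
slow = scrape_times(60.0, 240.0, late_by={3: 2.0})
fast = scrape_times(15.0, 240.0, late_by={12: 2.0})

slow_count = samples_in_window(slow, EVAL_AT, WINDOW_S)
fast_count = samples_in_window(fast, EVAL_AT, WINDOW_S)

print(slow_count, fast_count)
```

With the 60 s interval the window holds a single sample, so no rate can be computed at that evaluation step; with 15 s it still holds seven. This covers only the window arithmetic: an earlier comment in this thread suggests that scraping more often may itself be a source of the problem, so a shorter interval is not necessarily the right fix.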

I personally gave up and returned to Grafana Cloud. It works, and there's no need to pay for it.

riderx avatar May 29 '25 01:05 riderx

Using the latest dashboard and scrape job, are you still seeing this?

encima avatar Aug 27 '25 11:08 encima