
We're frequently exceeding our RAM quota on production.

pozorvlak opened this issue 6 years ago · 23 comments

Do we know which page?

Br3nda avatar May 12 '18 03:05 Br3nda

Might be #1638

Br3nda avatar May 13 '18 05:05 Br3nda

They could definitely be related, but I think the causality might go the other way - exceeding our RAM quota means we're hitting swap, which will slow page-loads way down. Optimising queries may very well help, but adding caching is unlikely to reduce our RAM usage :-)

pozorvlak avatar May 14 '18 09:05 pozorvlak

I didn't think Heroku gave you swap - in my experience it kills the app when you exceed the allocated resources you've paid for.

Br3nda avatar May 15 '18 21:05 Br3nda

I think the blue dot and sudden drop in RAM usage at 8am in the graph above is us getting killed for exceeding our quota too flagrantly. But from https://devcenter.heroku.com/articles/ruby-memory-use#why-memory-errors-matter:

If you’re getting R14 - Memory quota exceeded errors, it means your application is using swap memory. Swap uses the disk to store memory instead of RAM. Disk speed is significantly slower than RAM, so page access time is greatly increased. This leads to a significant degradation in application performance. An application that is swapping will be much slower than one that is not. No one wants a slow application, so getting rid of R14 Memory quota exceeded errors on your application is very important.

So we'll be swapping at least for a while rather than being killed immediately.

pozorvlak avatar May 16 '18 15:05 pozorvlak

I notice the whole app runs on one dyno - would using a worker be possible? Or would that create a budget problem?

pmackay avatar Jan 06 '19 16:01 pmackay

I think the individual dyno would still exceed its RAM quota.

There is a New Relic account now - there may be clues on there about which pages/code are memory-hungry.

Br3nda avatar Jan 07 '19 00:01 Br3nda

@Br3nda That's right: on the Common Runtime, where the app runs, dynos can use swap when they exceed their memory quota. Hobby dynos run on multitenant machines, which can sometimes mean better performance than expected if the host is underutilized by competing Heroku apps.

@pmackay If the app has or begins to emit H12 timeout errors on Heroku, then it's likely time to think about adding a background process if budget allows. Using something like rack-timeout can also provide an interim reprieve. The Scout add-on is free and is specifically designed for identifying memory issues and N+1 queries in Rails apps. That combined with New Relic should provide a decent start.
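For anyone picking this up later, here's a rough sketch of what wiring those in could look like (nothing below is in the Growstuff repo yet, and the 15s timeout is just an example value):

```ruby
# Gemfile - hedged sketch, not the actual Growstuff Gemfile
gem "rack-timeout" # aborts requests that overrun, so one slow page can't tie up the dyno
gem "scout_apm"    # Scout agent for per-action memory and N+1 query profiling
```

rack-timeout hooks itself into the Rails middleware stack automatically; on Heroku its limits are usually set via config vars (e.g. `RACK_TIMEOUT_SERVICE_TIMEOUT=15`), and Scout just needs the `SCOUT_KEY` config var that its add-on provides.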

There are a few other approaches to tackle memory issues, but I thought I'd start with the above info!

apdarr avatar Jan 24 '19 15:01 apdarr

@apdarr Thanks!

pozorvlak avatar Mar 22 '19 12:03 pozorvlak

The biggest memory user so far is the Comfortable Mexican Sofa CMS (on staging). Let's check again in a week.

Br3nda avatar Mar 27 '19 10:03 Br3nda

Scout reports our biggest memory use on production is consistently in Comfy::Cms::ContentController#show

Br3nda avatar Apr 13 '19 20:04 Br3nda

Perhaps more interesting are the 2s to 5s response times for HarvestsController#index

Br3nda avatar Apr 13 '19 20:04 Br3nda

This is still happening. The crops hierarchy page is the current biggest memory user.

Br3nda avatar Jul 16 '19 05:07 Br3nda

The crops hierarchy page ought to be mostly cached - I wonder if the cache is expiring, or if we're somehow not hitting it? But yeah, I can see how it would be heavy - we call Crop.toplevel in the controller, then crop.varieties recursively in the view, so it's going to be doing a lot of querying. Is there some way we can pre-populate the varieties field in the initial query? Or fetch Crop.all in the controller and construct the varieties tree in memory? Or maybe we could show all the top-level crops collapsed by default and fetch them lazily using AJAX?
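To make the last two ideas concrete, something along these lines might work (a sketch only - it assumes crops reference their parent via a `parent_id` column and the `toplevel`/`varieties` associations mentioned above, which I haven't re-checked against the actual models):

```ruby
# Option 1: eager-load a couple of levels of varieties so the view doesn't
# fire one query per crop while it recurses.
@crops = Crop.toplevel.includes(varieties: :varieties)

# Option 2: fetch everything once and build the tree in memory.
crops_by_parent = Crop.all.group_by(&:parent_id)

def build_tree(crop, crops_by_parent)
  children = crops_by_parent.fetch(crop.id, [])
  { crop: crop, varieties: children.map { |c| build_tree(c, crops_by_parent) } }
end

tree = crops_by_parent.fetch(nil, []).map { |c| build_tree(c, crops_by_parent) }
```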

pozorvlak avatar Jul 16 '19 10:07 pozorvlak

This gem should give us some useful caching for descendants https://github.com/stefankroes/ancestry
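Roughly what that would look like (sketch only; it assumes we add the gem's `ancestry` string column to crops via a migration, which isn't done yet):

```ruby
# app/models/crop.rb
class Crop < ApplicationRecord
  has_ancestry # gives us roots, children, descendants etc. backed by one indexed column
end

# The hierarchy page could then build the whole tree from a single query:
tree = Crop.arrange # => nested hash of { crop => { child => { ... } } }
```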

Br3nda avatar Jul 16 '19 20:07 Br3nda

Fixed memcache at 9am. It's been looking good since. (screenshots attached)
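For reference, the usual shape of a memcached cache store on Heroku looks something like this (a sketch assuming the MemCachier add-on and the dalli gem; the exact config we deployed may differ):

```ruby
# config/environments/production.rb
config.cache_store = :mem_cache_store,
                     (ENV["MEMCACHIER_SERVERS"] || "").split(","),
                     { username: ENV["MEMCACHIER_USERNAME"],
                       password: ENV["MEMCACHIER_PASSWORD"],
                       failover: true,
                       socket_timeout: 1.5,
                       socket_failure_delay: 0.2 }
```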

Br3nda avatar Dec 07 '19 02:12 Br3nda

Awesome!

pozorvlak avatar Dec 07 '19 17:12 pozorvlak

I'd like to get it under 500MB before closing this.

Br3nda avatar Dec 07 '19 19:12 Br3nda

(screenshot attached)

Br3nda avatar Dec 09 '19 01:12 Br3nda

There's a memory leak somewhere.

Meanwhile the home page is now very, very fast. I'll go through the other major pages and get them nice as well.

Br3nda avatar Dec 09 '19 05:12 Br3nda

Still sitting at 700MB

(screenshot attached)

Br3nda avatar Dec 13 '19 21:12 Br3nda

I think we've got it under 500MB with the deployment of #2320. Will watch it for a few more days. (screenshot attached)

Br3nda avatar Jan 14 '20 00:01 Br3nda

Has crept back up a bit: (screenshot attached)

CloCkWeRX avatar Feb 04 '24 03:02 CloCkWeRX