
Table alloc tracking

Open tul opened this issue 4 years ago • 7 comments

Fixes #250

Changes proposed in this pull request:

  • LStates can now optionally limit how many LTables are allocated and how many LTable keys are set
  • This provides basic memory quotas for LStates, as LTable growth is the most common way for an LState's memory use to grow
  • Tracking is disabled by default
  • When enabled, the speed and memory impact is almost negligible (see the usage sketch below)
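
To make the intent concrete, here is a minimal usage sketch. The Options field names below (MaxTableAllocations, MaxTableKeys) are hypothetical placeholders for illustration only, not necessarily the names used in this branch:

```go
package main

import (
	"fmt"

	lua "github.com/yuin/gopher-lua"
)

func main() {
	// NOTE: MaxTableAllocations and MaxTableKeys are hypothetical
	// field names used for illustration; the real names in this
	// branch may differ.
	L := lua.NewState(lua.Options{
		MaxTableAllocations: 10000,  // cap on LTables created
		MaxTableKeys:        100000, // cap on LTable keys set
	})
	defer L.Close()

	// A script that allocates tables without bound should now fail
	// with a quota error instead of exhausting host memory.
	err := L.DoString(`
		local t = {}
		for i = 1, 1e9 do t[i] = {} end
	`)
	if err != nil {
		fmt.Println("script aborted:", err)
	}
}
```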

tul avatar Jan 07 '20 16:01 tul


Coverage increased (+1.2%) to 89.286% when pulling 0c66945379a5fb25bc8b3f0cf0076fb75f15ec6f on tul:table_alloc_tracking into ab39c6098bdba5b34ac9a025d375d4cf5bed4814 on yuin:master.

coveralls avatar Jan 07 '20 16:01 coveralls

@yuin what do you think? Happy to hear your thoughts.

tul avatar Jan 31 '20 21:01 tul

@tul This is fantastic work. Would love to see this included in gopher-lua. What do you think @yuin?

novabyte avatar Mar 03 '20 23:03 novabyte

Awesome work and a great contribution @tul! I'm very sorry about the slow response.

I'm very busy with my work until the end of March. I will look at this PR once things settle down at work.

yuin avatar Mar 12 '20 17:03 yuin

@yuin Apologies, I have only just seen your comments. I will review them and respond the next time I'm working in this area of the code.

I also have a further update to this PR that optionally forces a GC when a quota is exceeded. I found that Go's lazy GC would sometimes cause Lua scripts with a high churn of LTables to exceed their quota simply because the GC hadn't yet freed their unreferenced LTables.
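
Roughly, the idea looks like the following sketch (hypothetical names, not the actual patch): force a collection before declaring a script over quota, so finalisers on unreachable LTables get a chance to decrement the live count first.

```go
package quota

import (
	"runtime"
	"sync/atomic"
)

// liveTables would be maintained by table allocation hooks and by
// finalisers attached to LTables; the name is hypothetical.
var liveTables int64

// overQuota only reports true if the state is still over its table
// quota after a forced collection, so that garbage LTables from a
// high-churn script are not counted against it.
func overQuota(quota int64) bool {
	if atomic.LoadInt64(&liveTables) <= quota {
		return false
	}
	// Force a collection so finalisers on unreachable LTables can run.
	// Finalisers execute on a separate goroutine, so two cycles are
	// used here as a heuristic to let the first batch complete; a real
	// implementation would need something more robust.
	runtime.GC()
	runtime.GC()
	return atomic.LoadInt64(&liveTables) > quota
}
```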

tul avatar Sep 11 '20 08:09 tul

@tul Thanks for the patch. Any update on getting this over the line?

hbagdi avatar Jul 14 '21 22:07 hbagdi

To be honest, after having used this in production for a while, I'm considering alternative approaches.

The problems I’ve had with this approach:

  1. Scripts which churn a lot of memory quickly (e.g. creating a lot of garbage during initialisation) can be incorrectly killed for being over quota, when in reality they are only over quota because the GC has not yet fired.
  2. Putting in logic to fire the GC manually when quotas are exceeded (not in this patch, but I have it locally) hit performance very hard. It may be possible to optimise this by coalescing and scheduling GCs (see the sketch after this list), but I have doubts.
  3. When a script does exceed its quota, often due to a leak, this PR offers no insight into where the unreleased memory is being retained.
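
As a rough sketch of the coalescing idea from point 2 (hypothetical names, not code from this PR), forced collections could be rate-limited so that a hot quota check does not run a full GC on every failure:

```go
package quota

import (
	"runtime"
	"sync/atomic"
	"time"
)

// lastForcedGC holds the unix-nanosecond time of the most recent
// forced collection; the name is hypothetical.
var lastForcedGC int64

// maybeForceGC coalesces forced collections: the GC only runs if no
// forced collection happened within minInterval, so many near-simultaneous
// quota failures trigger a single collection rather than one each.
func maybeForceGC(minInterval time.Duration) {
	now := time.Now().UnixNano()
	last := atomic.LoadInt64(&lastForcedGC)
	if now-last < int64(minInterval) {
		return // a collection ran recently; skip this one
	}
	if atomic.CompareAndSwapInt64(&lastForcedGC, last, now) {
		runtime.GC()
	}
}
```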

So in short, I'm thinking about alternative approaches: possibly lightweight finaliser-based counting like this, combined with registry analysis to determine where tables are retained (essentially allowing a per-LState “retention graph” to be produced).
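
For reference, the finaliser-based counting described above can be sketched in plain Go as follows (hypothetical, simplified types rather than gopher-lua's real LTable allocation path):

```go
package quota

import (
	"runtime"
	"sync/atomic"
)

// liveTables counts tables allocated but not yet collected by Go's GC;
// the names here are hypothetical and simplified.
var liveTables int64

type table struct {
	data map[string]interface{}
}

// newTable adds one atomic increment and one SetFinalizer call per
// allocation; the finaliser decrements the counter when the GC
// eventually collects the table, giving a cheap "live tables" gauge.
func newTable() *table {
	t := &table{data: map[string]interface{}{}}
	atomic.AddInt64(&liveTables, 1)
	runtime.SetFinalizer(t, func(*table) {
		atomic.AddInt64(&liveTables, -1)
	})
	return t
}
```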

tul avatar Jul 18 '21 13:07 tul