OpenBVE icon indicating copy to clipboard operation
OpenBVE copied to clipboard

Bug - Routeviewer Eating memory after reloading route a dozen or so times

Open Dragon951 opened this issue 3 years ago • 12 comments

Description

Routeviewer Eating memory after reloading route a dozen or so times

Reproduction

IT Occurs in any route if you keep F5 Multiple times

Route

Any Route

Train

Any Train

Logs

No Crashes but if not monitored in Task Manager, routeviewer can eat all your system memory and after a while crash your pc

Related information

  • Operating system - windows 11 - 22h2
  • Method of control (Keyboard, joystick, RailDriver) - Keyboard image image

Dragon951 avatar Oct 28 '22 04:10 Dragon951

I've never been able to reproduce anything like this sort of memory leak :( s520 also looked, but never found anything either.

https://github.com/leezer3/OpenBVE/commit/9c067fd4e761792357870f65a7e6c099be97319a will solve a couple of very minor leaks, but nothing at all like you're seeing, more in the neighbourhood of a megabyte or two per reload (nothing at all interesting....) This suggests to me it's either not any route, or there's something system specific at work.

Have you got a link to a public route which reliably leaks memory on your system? Your graphics card would also be interesting, as it could well be driver related.

leezer3 avatar Oct 28 '22 10:10 leezer3

The Route Viewer is leaking memory like that for quite a while already (#784), the memory doubles on every reload. (As if the memory from the previous scene did not get freed) Seems like I can reproduce it consistently regardless of route, GPU and OS as well

Interestingly I can only get the 64-bit one to memory leak on Windows (at the time), so I just continue using the 32-bit version But right now on Linux (Kubuntu to be specific) running it under mono, both 32-bit and 64-bit leaks. image

AmberIsFrozen avatar Oct 28 '22 15:10 AmberIsFrozen

Only thing I can think of doubling would be the textures, everything else is comparative peanuts. Can't see why these would fail to unload either....

Nothing odd like content on network / USB drives was there? That feels filesystem related.

leezer3 avatar Oct 28 '22 16:10 leezer3

It's just loaded from another HDD. I initially suspected it's something to do with the x64 build change (Since I spent most of the time testing in-game rather than Route Viewer) Although not sure whether that's the case.

The Route Viewer closes fine here though, and each memory increase is consistent (Unlike ginga's one)

Maybe I'll do some testing later to trace the specific commit when I have time.

AmberIsFrozen avatar Oct 28 '22 17:10 AmberIsFrozen

That would be interesting.

What would also be interesting if you're running in a debugger which can do memory profiling (Visual Studio does this, not sure about MonoDevelop) is which objects are actually leaking. If you take a memory snapshot before and after the doubling, and then click on the objects diff number, this gives an ordered list of the number of additional objects.

leezer3 avatar Oct 28 '22 18:10 leezer3

Also, as noted in the other issue, if you're using the distribution provided Mono, please try the direct Mono repository version, this is somewhat newer, and there's at least one memory leak bug which got solved in newer versions.

leezer3 avatar Oct 28 '22 18:10 leezer3

That would be interesting.

What would also be interesting if you're running in a debugger which can do memory profiling (Visual Studio does this, not sure about MonoDevelop) is which objects are actually leaking. If you take a memory snapshot before and after the doubling, and then click on the objects diff number, this gives an ordered list of the number of additional objects.

image It's not revealing anything special either, it's something else... (I can still confirm that the memory leak does not occur with 32-bit Route Viewer, only x64 on Windows)

What's more interesting is that I cannot get anything below v1.8.2.0 to launch Route Viewer with 64-bit, it always launches with 32-bit. Although I think that's irrelevant to the memory leak

The leak seems to start happening on 1.8.2.1, so probably something between 1.8.2.0 and 1.8.2.1 *Happened after the Animated GIF got merged, works fine before that so presumably it would be the animated gif PR Although I can't get those to compile on my machine so

AmberIsFrozen avatar Oct 29 '22 05:10 AmberIsFrozen

I'd agree that shows nothing at all interesting, which strongly suggests we're hitting a .Net edge case somewhere.

https://github.com/leezer3/OpenBVE/commit/3f046db32aadea6c14d0a9b179459eda0231b77a

That might help a little, but I'd only expect this to matter at all if you're using large numbers of GIF textures.

leezer3 avatar Oct 29 '22 17:10 leezer3

Something I might have found: Please try this with a route containing no non-english characters in station names / messages (filenames should be ok) and see if it leaks.

leezer3 avatar Oct 29 '22 19:10 leezer3

Still leaks on western route, I've even tried that on the Animated Object Demo Route (Though only leaks at around 10-20MB each) Odd. (It's probably not related to the animated gif decoder itself?)

AmberIsFrozen avatar Oct 30 '22 06:10 AmberIsFrozen

I don't know what the message means, but the file name contains Japanese, but I changed the name of the station to English and read it 10 times. 6.5GB 6.56 6.59 6.59 7.35 8.39 9.36 10.3 11.2 12.1GB

ginga81 avatar Oct 30 '22 16:10 ginga81

Still not able to reproduce any significant leaks :( There's something funny happening with non-english charsets which seems to leak a small amount which I haven't got to the bottom of, but that obviously didn't help much.

Made some minor changes, which are I suspect unlikely to help (removed the original restricted pallette code, as this path is only used by PNGs now), and added a forced compaction of the large object heap on reload.

I'll try and see if I can find anything on Linux next week sometime, but as far as I can see at the minute, this isn't our bug,

leezer3 avatar Oct 30 '22 19:10 leezer3

The memory graph might also be interesting to look at (Also 64bit seems to use double the memory 32bit is using)

32bit: image

64bit (The leaking one): image

There's suppose to be 2 big dip following the several seconds after reloading, but the 64bit one only did 1 small dip and 1 big dip Not sure how much relevancy it has to the leak though

AmberIsFrozen avatar Nov 01 '22 12:11 AmberIsFrozen

I'm not seeing anywhere near double the memory usage, or a graph that looks anything like that :( Was that Linux or Windows?

64-bit will (usually) use more memory (as every address is stored as 8 bytes not 4; https://superuser.com/questions/306751/64bits-vs-32bits-processors-memory-consumption ), but the exact memory usage will depend a little on how the .Net garbage cleaner decides to invalidate objects after a reload.

As a secondary note, on Linux, both executables should execute in 64-bit regardless, and should be functionally identical. (the only change is a modification to the PE header) See this issue on the Mono repository: https://github.com/mono/mono/issues/21038

leezer3 avatar Nov 01 '22 19:11 leezer3

glitch

Hmm, that isn't good. (Can't explain how that's happening at the minute, but if you sit on the signal for the demo route, there's a duplicated second arm for a second as it rotates from the inital position after reload)

That smells related.

leezer3 avatar Nov 02 '22 18:11 leezer3

Can't reproduce that glitch again, which is a little worrying.....

However, I've got a way to make it leak now. Hold down F5, continually triggering a reload and it'll leak.

Observations:

  • This appears to be unrelated to the GIF decoder. If I just bail out on starting the texture load plugin, we still leak, so the leak is likely coming from somewhere in the API texture type. This is somewhat messy (the whole TextureOrigin concept requires multiple transformations of refs), but I can't find anything immediately.
  • 6b569400ff75ff7129f857d5dd57ac3d6c1691da appears to be the first place the leak really starts to show. This is likely because it switches to multi-dimensional arrays here- Not the actual cause?
  • Per above, temporarily reverting to a single frame based texture isn't really helping.
  • The leak is in unmanaged memory. Nothing wants to tell me what it actually is; several memory profilers are just saying unknown....
  • If I reload enough with the 1.4.2 Route Viewer, this seems to leak a little too.

leezer3 avatar Nov 03 '22 14:11 leezer3

I tested what the daily-build version happens. 20211117 not leak 20211120 leak So I guess I got up with the changes during this time.

ginga81 avatar Nov 03 '22 17:11 ginga81

yea, im also using the daily builds and i feel the memory bug is still happening / same

Dragon951 avatar Nov 03 '22 17:11 Dragon951

If I'm reading the commit logs right, the only change then was this: https://github.com/leezer3/OpenBVE/pull/698

Only thing I can see that leaking with is lots of trains / TFOs, which doesn't really fit with what KennyHui is seeing with the demo route leaking.

leezer3 avatar Nov 03 '22 18:11 leezer3

I didn't only downloaded 20211117 and 20211120 at the time, and even if I try to download the daily build now, I can only download 2022.1.1 or later. 2121.11.18 or 19, I can't check if it was changed, but was it changed in the last two days?

ginga81 avatar Nov 04 '22 15:11 ginga81

Yes I've checked several days ago that it is indeed unrelated to the GIF decoder itself, just haven't got around to reply yet. (All the testing I've done so far is on Windows) It seems to leak in virtually all routes which is very odd

Though interestingly it does not leak on my Windows 7 VirtualBox VM, and this is the only "machine" where it does not leak memory. (Even VBox on Win10 leaks) The memory graph is also completely different to my testing on other VM/Host OS image It also only uses ~100MB on my (well, unreleased routes), while on my other VM/host it uses up to 200MB on load. Maybe I'll setup another VM later with the same config as the Win7 one and see what caused it

AmberIsFrozen avatar Nov 04 '22 15:11 AmberIsFrozen

There should have been no builds between the 18th and the 20th https://github.com/leezer3/OpenBVE/commits/master?after=6ac6489e6bc145d090d439dafdef0c3f71d3def6+209&branch=master&qualified_name=refs%2Fheads%2Fmaster

Trouble with keeping builds long-term is that there isn't unlimited disk space. (20gb total on the VPS) I normally clear them after 6-7 months or so.

leezer3 avatar Nov 04 '22 15:11 leezer3

btw Is there a way to obtain the latest nightly build (Programmatically) other than to scrap the nightly build site? Ideally I would like to setup a script to auto-download the latest nightly build (Or maybe a batch script to setup a debugging environment that I can just run it on VM and get everything ready), but I can't seems to find a way yet

AmberIsFrozen avatar Nov 04 '22 15:11 AmberIsFrozen

Nope, not at the minute, never thought of that :( Could probably setup a job on the server to update an XML whenever it's pushed & then use much the same as the main menu update button, but that would be a future 'plan'.....

leezer3 avatar Nov 04 '22 15:11 leezer3

Nope, it just leaks on Win10, exact same configuration. Discovered something else though, the memory freed properly when the settings in the "Options" dialog is changed and need reloading.

Ended up with this: https://github.com/leezer3/OpenBVE/blob/6ac6489e6bc145d090d439dafdef0c3f71d3def6/source/RouteViewer/ProgramR.cs#L393

https://github.com/leezer3/OpenBVE/commit/1896d1e689ead4b3dc84155debaed15b7e2b472f (Apr 13) introduced this I think, and I opened an issue on Apr 28, the time frame seems to make sense

Replacing it with Renderer.Reset() just fixes it for me, I assume there won't be any side effect(?).

The memory graph is still different and uses more memory on my machine, maybe they changed the way garbage collection works in later version of .NET Framework? (IIRC I only setup 4.6.2 on my Win7, while Win10+ comes with 4.8), but either way I think the above code is the core issue, just some .NET version coincidentally handled the leak

AmberIsFrozen avatar Nov 04 '22 16:11 AmberIsFrozen

I'm on windows 11 and it's leaking

Dragon951 avatar Nov 04 '22 17:11 Dragon951

Replacing it with Renderer.Reset() just fixes it for me, I assume there won't be any side effect(?).

The memory graph is still different and uses more memory on my machine, maybe they changed the way garbage collection works in later version of .NET Framework? (IIRC I only setup 4.6.2 on my Win7, while Win10+ comes with 4.8), but either way I think the above code is the core issue, just some .NET version coincidentally handled the leak

Now there's an interesting observation. https://github.com/leezer3/OpenBVE/blob/master/source/LibRender2/BaseRenderer.cs#L476

The reset function first purges the object and texture caches, then calls the Initialize() function. That would suggest that it's the order the reset is done in that causes the glitch.

leezer3 avatar Nov 04 '22 17:11 leezer3

Yep, that helps quite a bit with the F5 reload leak I can actually reproduce.

Memory leak is reduced to about the same as 1.4.2 with that change. I think the remaining leak is probably the font thing I haven't got to the bottom of yet.

@ginga81 please see if the build from today helps.

leezer3 avatar Nov 04 '22 17:11 leezer3

Most recent build from today also adds a (hidden) option to check for nightly build updates.

For this, please manually edit your options.cfg file, and set dailybuildupdates (In the Interface section) to true. The check for updates function will then look at nightly builds as opposed to release builds.

Not sure where / how to put a GUI checkbox for this yet, we want something like an update channel concept (e.g. stable / nightly channels)

leezer3 avatar Nov 04 '22 18:11 leezer3

I checked the newest daily build. It is not happen leaks. I was reload 10 or more. But it seems not increase.

ginga81 avatar Nov 04 '22 19:11 ginga81