Reduce converter memory usage with on-demand read

Open heinezen opened this issue 5 years ago • 4 comments
Required skills: Python

Difficulty: Medium

The new converter (#1151) uses member objects to operate on data from the .dat files. While these provide the necessary functions for conversion, they are also significantly larger than the original dat file entries (20x - 400x the size). This results in a comparatively high memory usage, especially for DE2 which uses ~1.1 GB of memory for all dat file entries. Note that the converter does not create objects for all entries by default, but the size difference compared to the source is still significant.

A solution to this problem could be to read certain structures into member objects on demand during conversion, instead of converting all of them during the initial read.

This would require:

  • A special ValueMember class that references an offset in the .dat file and a GenieStructure reference for the dataformat.
  • Manual loading and unloading of the structure from file
  • Dynamic loading and unloading of the structure from file
  • Identifying places in the converter where manual loading makes sense
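
A minimal sketch of what such a member could look like, assuming the raw .dat content is available as a bytes buffer. The class name `LazyMember`, its methods, and the sample field names are hypothetical and not part of the existing converter; a `struct` format string stands in for the GenieStructure reference.

```python
import struct


class LazyMember:
    """Hypothetical on-demand member: stores only an offset and a format
    until the value is actually requested."""

    def __init__(self, data, offset, fmt, names):
        self._data = data                  # raw .dat content (bytes-like)
        self._offset = offset              # where the structure starts
        self._struct = struct.Struct(fmt)  # stand-in for the dataformat
        self._names = names                # field names, in format order
        self._value = None                 # parsed fields, None while unloaded

    @property
    def loaded(self):
        return self._value is not None

    def load(self):
        """Manual loading: parse the structure from the buffer."""
        if self._value is None:
            raw = self._struct.unpack_from(self._data, self._offset)
            self._value = dict(zip(self._names, raw))
        return self._value

    def unload(self):
        """Manual unloading: drop the parsed fields, keep offset and format."""
        self._value = None

    @property
    def value(self):
        """Dynamic loading: parse on first access."""
        return self.load()


# Two made-up 8-byte entries (int16, int16, float32); only the second is read.
data = struct.pack("<hhf", 1, 2, 0.5) + struct.pack("<hhf", 7, 75, 1.5)
member = LazyMember(data, 8, "<hhf", ("id", "hit_points", "speed"))
before = member.loaded
hit_points = member.value["hit_points"]
after = member.loaded
member.unload()
```

Until `value` or `load()` is called, the member holds only the offset and the format, which is where the memory saving would come from.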

heinezen avatar May 04 '20 22:05 heinezen

I am wondering if we could use a different approach to this problem. What if we reorganize the converter structure so that it does not convert all data at once, but converts it in groups, for instance Units?

If we load only the Units-related data, the conversion should use less memory, and we can then load the next data group for conversion.

The problem with this approach is that we would need to refactor the converter a lot, and we would need to add a value search method to GenieStructure for finding and loading data from the gamespec file.

I think I can start working on a proposal for this new structure, and then we can discuss the pros and cons of the solution.

marcinsobejko avatar Dec 27 '20 14:12 marcinsobejko

> I wondering if we can use different approach to this problem. What if we can reorganize converter structure to convert data not all at one but convert them by groups like Units for instance.

This is what is done already, so your idea is actually the logical extension of what has to be coded when implementing the issue :) When the conversion of a unit line starts here, we would (manually) load its structs and all related structs before we process them.

The reason I want to do this on the member level is that not all information we need is stored in the units, e.g. the food amount for a farm is stored as a civ resource. These would have to be dynamically loaded on request because it is just too much micro-management to load every member that could be requested. If we keep the .dat file in memory (which should not consume much memory), dynamic loading should be reasonably efficient.
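
As a rough illustration of dynamic loading on request, assuming the whole .dat content is held in memory: the `DatCache` class, its `get` method, and the sample values below are hypothetical. Parsed values are kept in a small least-recently-used cache so that repeated requests do not parse again and old entries are dropped automatically.

```python
import struct
from collections import OrderedDict


class DatCache:
    """Hypothetical loader: parses values from an in-memory buffer on
    request and keeps only the most recently used results."""

    def __init__(self, data, max_entries=2):
        self._data = memoryview(data)  # raw .dat content, kept in memory
        self._max = max_entries        # upper bound on cached entries
        self._cache = OrderedDict()    # (offset, fmt) -> parsed tuple
        self.parse_count = 0           # how often the buffer was parsed

    def get(self, offset, fmt):
        key = (offset, fmt)
        if key in self._cache:
            # Cache hit: mark as most recently used and return it.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: parse from the buffer (dynamic load).
        value = struct.unpack_from(fmt, self._data, offset)
        self.parse_count += 1
        self._cache[key] = value
        if len(self._cache) > self._max:
            # Dynamic unload: evict the least recently used entry.
            self._cache.popitem(last=False)
        return value


# Three made-up float32 resource values.
data = struct.pack("<3f", 200.0, 175.0, 50.0)
cache = DatCache(data, max_entries=2)
first = cache.get(4, "<f")   # parsed from the buffer
second = cache.get(4, "<f")  # served from the cache
parses_after_repeat = cache.parse_count
cache.get(0, "<f")
cache.get(8, "<f")           # evicts the entry for offset 4
cache.get(4, "<f")           # has to be parsed again
```

The cache size trades memory for repeated parsing; a real implementation would need to pick it based on how often the same members are requested.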

heinezen avatar Dec 27 '20 19:12 heinezen

Ok, thanks for your reply, now I see that. I must say that this source code is quite large and it will take me some time to understand what to do ;)

marcinsobejko avatar Dec 28 '20 17:12 marcinsobejko

@marcinsobejko You can read the converter blog posts for an introduction :)

  • https://blog.openage.dev/the-openage-converter-part-i-reading-data.html
  • https://blog.openage.dev/the-openage-converter-part-ii-preparations-for-conversion.html
  • https://blog.openage.dev/the-openage-converter-part-iii-convert.html
  • https://blog.openage.dev/the-openage-converter-part-iv-conclusions.html

heinezen avatar Dec 28 '20 18:12 heinezen