[FEATURE] onReadItem and onReadTile events

Open gesior opened this issue 6 years ago • 8 comments

The current version first reads the whole map into RAM (up to a few GB). Then it lets me do some advanced operations on the map, and then I can save my modified map.

There are many tasks for which I don't need the whole map: reading house tiles, finding the position of some item. It would be nice if I could run it with a few MB of RAM (removing a 'tile' from memory right after it has been loaded with its items).

  1. After loading item:

    (...)
    const mapReader = new MapReader();
    mapReader.on('readItem', function(tile, item) {
      if (item.id == 1387) {
        console.log('found teleport', tile.pos);
      }
    });
    mapReader.process('input.otbm');

  2. After loading tile with items and properties:

    mapReader.on('readTile', function(tile) {});
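A minimal sketch of how the requested events could behave, assuming a `MapReader` built on Node's `EventEmitter`. The class is hypothetical and is not part of OTBM2JSON. Instead of parsing `input.otbm`, `process` here takes any iterable of tiles, and the `fakeTiles` generator stands in for a streaming parser. The tile and item shapes are also assumptions.

```javascript
const { EventEmitter } = require('events');

// Hypothetical MapReader: a real implementation would parse tiles out of the
// OTBM binary. Here any iterable of tiles stands in for the parser.
class MapReader extends EventEmitter {
  process(tiles) {
    let count = 0;
    for (const tile of tiles) {
      for (const item of tile.items) {
        this.emit('readItem', tile, item);
      }
      this.emit('readTile', tile);
      count++;
      // `tile` is not stored anywhere, so it can be garbage collected
      // unless a listener keeps a reference to it.
    }
    return count;
  }
}

// Stand-in for a streaming parser: yields tiles lazily, one at a time.
// Positions and item ids are made-up sample data.
function* fakeTiles() {
  yield { pos: { x: 100, y: 200, z: 7 }, items: [{ id: 1387 }, { id: 2160 }] };
  yield { pos: { x: 101, y: 200, z: 7 }, items: [{ id: 3031 }] };
}

const teleports = [];
const mapReader = new MapReader();
mapReader.on('readItem', (tile, item) => {
  if (item.id === 1387) {
    teleports.push(tile.pos);
  }
});
const tilesRead = mapReader.process(fakeTiles());
console.log('tiles read:', tilesRead, 'teleports:', teleports);
```

Memory only stays low with this design if listeners copy out the small values they need (as with `tile.pos` above) rather than collecting whole tiles.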

gesior avatar Mar 12 '19 15:03 gesior

Hi, reading the map in chunks is quite trivial, I think, but transforming (including writing) is harder. I started on this branch: https://github.com/Inconcessus/OTBM2JSON/tree/add-stream-reader. See examples/stream for a demo.

Inconcessus avatar Mar 12 '19 18:03 Inconcessus

I made some more changes using a simple synchronous callback instead of using an event emitter. The problem I encountered is that the application uses a depth-first approach. We get the top level element last and have to write it to file first. So just piping it doesn't really work out.

This transformation approach takes every OTBM_TILE_AREA, OTBM_WAYPOINTS, OTBM_TOWNS and converts it back to binary representation before continuing. In the end everything is concatenated and written to file. That should save you a lot of memory.
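The idea described above can be sketched roughly as follows. This is not the branch's code: `serializeFeature` and `buildHeader` are hypothetical stand-ins that use JSON in a `Buffer` instead of real OTBM binary, and the feature objects are made-up sample data. The point is only the order of operations: each feature is serialized as soon as it is complete, and the top-level element, which is known last, is placed first when everything is concatenated.

```javascript
// Hypothetical stand-in for OTBM serialization: JSON bytes instead of binary.
function serializeFeature(feature) {
  return Buffer.from(JSON.stringify(feature));
}

// Hypothetical top-level element. It depends on data only known at the end.
function buildHeader(featureCount) {
  return Buffer.from(JSON.stringify({ type: 'OTBM_MAP_DATA', featureCount }));
}

function transformMap(features, transform) {
  const chunks = [];
  for (const feature of features) {
    // Keep only the compact serialized form; the object itself can be collected.
    chunks.push(serializeFeature(transform(feature)));
  }
  // The top-level element is produced last but written first.
  return Buffer.concat([buildHeader(chunks.length), ...chunks]);
}

// Made-up sample features, produced lazily.
function* fakeFeatures() {
  yield { type: 'OTBM_TILE_AREA', tiles: [{ x: 1, y: 2, items: [{ id: 1387 }] }] };
  yield { type: 'OTBM_TOWNS', towns: [{ name: 'Example Town' }] };
  yield { type: 'OTBM_WAYPOINTS', waypoints: [] };
}

const output = transformMap(fakeFeatures(), (feature) => feature);
console.log(output.toString());
```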

If you want to test it, just clone the git repository and check out the new branch.

git checkout add-stream-reader

Inconcessus avatar Mar 12 '19 20:03 Inconcessus

Look for examples/stream for a demonstration of the transformation API. You have to define a transformation callback function that is applied for every feature encountered. Remember to return the feature after modifying it!
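To illustrate the "return the feature" rule, here is a hedged sketch of a transformation callback. The `applyTransform` wrapper is hypothetical and is not the API from examples/stream. The `feature.tiles` and `tile.items` shapes, and the replacement id 1388, are assumptions chosen for illustration.

```javascript
// Hypothetical wrapper: applies a callback to every feature and fails loudly
// when the callback forgets to return the feature.
function applyTransform(features, callback) {
  return features.map((feature, index) => {
    const result = callback(feature);
    if (result === undefined) {
      throw new Error(`transform callback returned nothing for feature ${index}`);
    }
    return result;
  });
}

// Example callback: swap item id 1387 for 1388 (arbitrary) in tile areas.
function replaceTeleports(feature) {
  if (feature.type === 'OTBM_TILE_AREA') {
    for (const tile of feature.tiles) {
      for (const item of tile.items) {
        if (item.id === 1387) {
          item.id = 1388;
        }
      }
    }
  }
  return feature; // without this line the modified feature would be lost
}

// Made-up sample input.
const sample = [
  { type: 'OTBM_TILE_AREA', tiles: [{ items: [{ id: 1387 }, { id: 100 }] }] },
  { type: 'OTBM_TOWNS', towns: [] },
];
const transformed = applyTransform(sample, replaceTeleports);
console.log(JSON.stringify(transformed));
```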

Inconcessus avatar Mar 12 '19 20:03 Inconcessus

I tested this on a 25MB OTBM file and the memory usage went from 300MB to 50MB. Let me know what your findings are!

Inconcessus avatar Mar 12 '19 21:03 Inconcessus

> I tested this on a 25MB OTBM file and the memory usage went from 300MB to 50MB. Let me know what your findings are!

Just tested with 120MB OTBM on [email protected] [all cores], 32 GB RAM, Samsung NVME SSD (read/write 1.5-3.5GB/s).

'time' command with RAM and CPU usage measurement:

/usr/bin/time -v node big.js

Basic (for that version I had to increase the default ~2GB node RAM limit: node --max_old_space_size=8000 big.js):
CPU use: 139% (1.39 of 1 core)
Elapsed time: 48.23 sec
Peak RAM: 2987 MB

Stream:
CPU use: 113% (1.13 of 1 core)
Elapsed time: 41.61 sec
Peak RAM: 1628 MB

I added some 'global.gc()' calls as a test, but they only increased the execution time, with almost no change in peak RAM usage.
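For anyone repeating this measurement from inside the script, a small sketch using `process.memoryUsage()` is given below. The helper names `sampleMemory` and `forceGc` are made up for this example. Note that `global.gc` only exists when node is started with `--expose-gc`, and that `Buffer` memory is reported under `external`, not `heapUsed`.

```javascript
// Run with: node --expose-gc script.js (global.gc is undefined otherwise).
let peakRss = 0;

// Take one sample and remember the highest resident set size seen so far.
function sampleMemory() {
  const { rss, heapUsed, external } = process.memoryUsage();
  peakRss = Math.max(peakRss, rss);
  return { rss, heapUsed, external };
}

// Trigger a collection if the flag was passed; report whether it ran.
function forceGc() {
  if (typeof global.gc === 'function') {
    global.gc();
    return true;
  }
  return false;
}

const toMb = (bytes) => (bytes / 1024 / 1024).toFixed(1);
const before = sampleMemory();
const gcRan = forceGc();
const after = sampleMemory();
console.log(`gc ran: ${gcRan}`);
console.log(`heapUsed: ${toMb(before.heapUsed)} MB -> ${toMb(after.heapUsed)} MB`);
console.log(`external: ${toMb(after.external)} MB, peak rss: ${toMb(peakRss)} MB`);
```

Calling `sampleMemory()` every few hundred tile areas would show whether growth happens in the JavaScript heap or in buffer memory.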

JSkalskiSBG avatar Mar 13 '19 11:03 JSkalskiSBG

So there's a big improvement but not as much as I would expect. The streaming function handles one OTBM_TILE_AREA at a time (and when it is completed, it is converted back to OTBM). Within a tile area there may be many items, and perhaps (protection) zones that take up a lot of space, since in the JSON representation they have very long keys:

// Read individual tile flags using bitwise AND &
return {
  "protection": flags & HEADERS.TILESTATE_PROTECTIONZONE,
  "noPVP": flags & HEADERS.TILESTATE_NOPVP,
  "noLogout": flags & HEADERS.TILESTATE_NOLOGOUT,
  "PVPZone": flags & HEADERS.TILESTATE_PVPZONE,
  "refresh": flags & HEADERS.TILESTATE_REFRESH
}

Can you add this to the transformation routine:

console.log(JSON.stringify(feature).length);

Then we can get an estimate of the size of a single tile area in JSON representation.
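As a rough illustration of why the flag object is bulky, the sketch below serializes the zones of a single tile. The numeric flag values in `HEADERS` are assumptions for this example (the library defines its own), and `parseFlags` and `jsonSize` are hypothetical helper names.

```javascript
// Assumed flag values for illustration only; the library has its own HEADERS.
const HEADERS = {
  TILESTATE_PROTECTIONZONE: 0x01,
  TILESTATE_NOPVP: 0x04,
  TILESTATE_NOLOGOUT: 0x08,
  TILESTATE_PVPZONE: 0x10,
  TILESTATE_REFRESH: 0x20,
};

// Same shape as the snippet above: each key holds the masked bits (a number).
function parseFlags(flags) {
  return {
    protection: flags & HEADERS.TILESTATE_PROTECTIONZONE,
    noPVP: flags & HEADERS.TILESTATE_NOPVP,
    noLogout: flags & HEADERS.TILESTATE_NOLOGOUT,
    PVPZone: flags & HEADERS.TILESTATE_PVPZONE,
    refresh: flags & HEADERS.TILESTATE_REFRESH,
  };
}

// Size in bytes of the JSON representation of any feature or sub-object.
function jsonSize(value) {
  return Buffer.byteLength(JSON.stringify(value));
}

const zones = parseFlags(0x01);
const zonesJson = JSON.stringify(zones);
console.log(zonesJson);
console.log('bytes for one tile\'s zones:', jsonSize(zones));
```

The flags themselves fit in a single small integer, so every tile carrying this object pays for the key names again in the JSON form.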

Inconcessus avatar Mar 13 '19 12:03 Inconcessus

If memory remains a huge problem we can always write the completed features to disk and compile them afterwards using fs.createReadStream. As of now they are kept fully in memory for the sake of simplicity.
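A sketch of that disk-based fallback, under the assumption that every completed feature is already serialized to a `Buffer`. The function names and the `part-N.bin` file naming are made up for this example. Parts are written as they complete and later streamed in order into the output with `fs.createReadStream`, so only one chunk is in memory at a time.

```javascript
const fs = require('fs');
const path = require('path');
const { once } = require('events');

// Write one completed, already serialized feature to its own part file.
function writePart(dir, index, buffer) {
  const partPath = path.join(dir, `part-${index}.bin`);
  fs.writeFileSync(partPath, buffer);
  return partPath;
}

// Stream the part files, in the given order, into the final output file.
async function compileParts(partPaths, outPath) {
  const out = fs.createWriteStream(outPath);
  for (const partPath of partPaths) {
    for await (const chunk of fs.createReadStream(partPath)) {
      // Respect backpressure so chunks do not pile up in memory.
      if (!out.write(chunk)) {
        await once(out, 'drain');
      }
    }
  }
  const finished = once(out, 'finish');
  out.end();
  await finished;
}
```

Since the top-level element has to come first in the file, its part would simply be listed first in `partPaths` even though it is written last.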

Inconcessus avatar Mar 13 '19 13:03 Inconcessus

There must be some memory leak in the stream algorithm. It's not just one jump to 1.6GB: it starts at 130MB and grows slowly to 1.6GB. That's why I tried 'global.gc()' every 500 TILE_AREAS, to make sure that the GC runs. Is there any other array that stores information about TILE_AREAS or TILES?

I ran the Inspector and it showed that the app spent most of its time copying FastBuffer, and that a FastBuffer 'buffer' is the thing that grows to 1.6GB in RAM.
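One pattern that would produce a profile like this is growing a single buffer with `Buffer.concat` inside a loop. Whether the branch actually does so is not confirmed in this thread, so the sketch below is only an illustration of that pattern next to a cheaper alternative. Both functions are hypothetical, and `bytesCopied` is a simple counter added to make the difference visible.

```javascript
// Pattern A: re-concatenate on every chunk. Each step copies the whole
// result built so far, so total copying grows quadratically.
function appendEach(chunks) {
  let result = Buffer.alloc(0);
  let bytesCopied = 0;
  for (const chunk of chunks) {
    result = Buffer.concat([result, chunk]);
    bytesCopied += result.length;
  }
  return { result, bytesCopied };
}

// Pattern B: keep the chunks in an array and concatenate once at the end.
function concatOnce(chunks) {
  const result = Buffer.concat(chunks);
  return { result, bytesCopied: result.length };
}

// 200 chunks of 1 KiB each, filled with a dummy byte.
const chunks = Array.from({ length: 200 }, () => Buffer.alloc(1024, 1));
const a = appendEach(chunks);
const b = concatOnce(chunks);
console.log('same output:', a.result.equals(b.result));
console.log('bytes copied, append each:', a.bytesCopied);
console.log('bytes copied, concat once:', b.bytesCopied);
```

Neither pattern reduces the final size, because the whole output still ends up in one buffer. Only writing completed features to disk would avoid that.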

JSkalskiSBG avatar Mar 13 '19 16:03 JSkalskiSBG