MaxMind-DB-Reader-java

Database reader does not work for mmdb files larger than 2 GB

Open nonetallt opened this issue 1 year ago • 4 comments

Trying to create a new reader (with or without a cache) for a 2.5 GB database:

new Reader(file, new CHMCache());

Results in the following exception:

java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
	at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1183)
	at com.maxmind.db.BufferHolder.<init>(BufferHolder.java:31)
	at com.maxmind.db.Reader.<init>(Reader.java:119)
	at com.maxmind.db.Reader.<init>(Reader.java:69)
	at app.service.geoip.LocalFilesystemGeoipDataProvider.loadDatabaseReader(LocalFilesystemGeoipDataProvider.java:72)

A quick search for the error message shows that a ByteBuffer is indexed with an int, so a single buffer (and therefore a single memory mapping) cannot exceed Integer.MAX_VALUE bytes, which is roughly 2 GB.

Is there any workaround or fix for this or is the reader simply unusable with a larger database?

nonetallt avatar Feb 01 '24 17:02 nonetallt

Unfortunately, I don't think there is a quick fix to this. If I recall correctly, it is also a limitation of Java's memory mapping. We would probably need to store an array of ByteBuffers and have multiple memory maps as well.
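To make the "multiple memory maps" idea concrete, here is a minimal sketch of mapping one file as several read-only regions and addressing them with a long offset. ChunkedMapping and its chunkSize parameter are hypothetical names for illustration, not part of the library, and the sketch only covers single-byte reads.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: maps a file as several regions, each within the int limit. */
final class ChunkedMapping {
    private final ByteBuffer[] chunks;
    private final long chunkSize;
    private final long size;

    ChunkedMapping(Path file, long chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkSize must be in 1..Integer.MAX_VALUE");
        }
        this.chunkSize = chunkSize;
        // A mapping stays valid after its channel is closed, so the channel can be released here.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            this.size = channel.size();
            int count = (int) ((size + chunkSize - 1) / chunkSize);
            this.chunks = new ByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long offset = i * chunkSize;
                long length = Math.min(chunkSize, size - offset);
                chunks[i] = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            }
        }
    }

    long size() {
        return size;
    }

    /** Reads one byte at an absolute file position. */
    byte get(long position) {
        if (position < 0 || position >= size) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        return chunks[(int) (position / chunkSize)].get((int) (position % chunkSize));
    }
}
```

Each FileChannel.map call stays below Integer.MAX_VALUE, so the exception from the original report is avoided; the cost is an extra division and array lookup on every read.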

oschwald avatar Feb 01 '24 17:02 oschwald

> Unfortunately, I don't think there is a quick fix to this. If I recall correctly, it is also a limitation of Java's memory mapping. We would probably need to store an array of ByteBuffers and have multiple memory maps as well.

Any chance you would consider changing the internals to account for this limitation? Understandably, this might not be something that can be fixed with a snap of the fingers, but how large a refactor are we talking about here?

nonetallt avatar Feb 01 '24 18:02 nonetallt

Given that all the MaxMind databases are well under this limit, it seems unlikely that we would implement this ourselves in the near future. The change will be at least somewhat invasive and I suspect it will harm the performance for smaller databases. We would consider merging a PR to address the problem if the impact on existing users was minimal.

In terms of how large of a refactor this would be, I suspect you would need to modify BufferHolder significantly and you would need to replace the use of ByteBuffer throughout the code with a similar abstraction that can handle more than 2 GB.
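As a rough illustration of what such an abstraction might look like, the sketch below wraps an array of ByteBuffers behind a long index and supports bulk reads that straddle a chunk boundary. BigBuffer is a hypothetical name, it is not how the library is implemented, and a real replacement would need the rest of the ByteBuffer operations the decoder relies on.

```java
import java.nio.ByteBuffer;

/** Hypothetical long-indexed view over several ByteBuffers. */
final class BigBuffer {
    private final ByteBuffer[] chunks;
    private final int chunkSize;
    private final long capacity;

    /** Every chunk except the last is assumed to hold exactly chunkSize bytes. */
    BigBuffer(ByteBuffer[] chunks, int chunkSize) {
        this.chunks = chunks;
        this.chunkSize = chunkSize;
        long total = 0;
        for (ByteBuffer chunk : chunks) {
            total += chunk.capacity();
        }
        this.capacity = total;
    }

    long capacity() {
        return capacity;
    }

    /** Absolute single-byte read. */
    byte get(long index) {
        return chunks[(int) (index / chunkSize)].get((int) (index % chunkSize));
    }

    /** Fills dst starting at index, continuing into the next chunk when needed. */
    void get(long index, byte[] dst) {
        if (index < 0 || index + dst.length > capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int copied = 0;
        while (copied < dst.length) {
            long pos = index + copied;
            // duplicate() gives an independent position, so the shared chunk is not mutated.
            ByteBuffer chunk = chunks[(int) (pos / chunkSize)].duplicate();
            chunk.position((int) (pos % chunkSize));
            int n = Math.min(dst.length - copied, chunk.remaining());
            chunk.get(dst, copied, n);
            copied += n;
        }
    }
}
```

The boundary-crossing loop is the part with no counterpart in a plain ByteBuffer, and it is one reason the change would touch more than BufferHolder.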

oschwald avatar Feb 01 '24 18:02 oschwald

Alright, thank you for the prompt answer. I don't think I currently have the resources to fix it myself, given my general lack of knowledge of the library's internals.

If you are concerned about performance, it would probably make sense to either have two different memory handling implementations or a whole separate reader class.
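A sketch of the "two implementations" idea, under the assumption that the choice can be made once at load time: small inputs keep a single ByteBuffer and only oversized inputs pay for chunking. ByteSource, SingleBufferSource, ChunkedSource, and ByteSources are all hypothetical names, and the threshold is a parameter here so the behaviour can be tried on small in-memory data; a real reader would map file regions and use a threshold near Integer.MAX_VALUE.

```java
import java.nio.ByteBuffer;

/** Hypothetical read-only byte source addressed by a long index. */
interface ByteSource {
    long size();
    byte get(long index);
}

/** Fast path: one buffer, no chunk arithmetic. */
final class SingleBufferSource implements ByteSource {
    private final ByteBuffer buffer;

    SingleBufferSource(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    public long size() {
        return buffer.capacity();
    }

    public byte get(long index) {
        return buffer.get(Math.toIntExact(index));
    }
}

/** Slow path: several equally sized chunks (the last one may be shorter). */
final class ChunkedSource implements ByteSource {
    private final ByteBuffer[] chunks;
    private final int chunkSize;
    private final long size;

    ChunkedSource(ByteBuffer[] chunks, int chunkSize) {
        this.chunks = chunks;
        this.chunkSize = chunkSize;
        long total = 0;
        for (ByteBuffer chunk : chunks) {
            total += chunk.capacity();
        }
        this.size = total;
    }

    public long size() {
        return size;
    }

    public byte get(long index) {
        return chunks[(int) (index / chunkSize)].get((int) (index % chunkSize));
    }
}

final class ByteSources {
    private ByteSources() {}

    /** Picks the single-buffer path when the data fits within maxSingle bytes. */
    static ByteSource of(byte[] data, int maxSingle) {
        if (maxSingle <= 0) {
            throw new IllegalArgumentException("maxSingle must be positive");
        }
        if (data.length <= maxSingle) {
            return new SingleBufferSource(ByteBuffer.wrap(data));
        }
        int count = (int) (((long) data.length + maxSingle - 1) / maxSingle);
        ByteBuffer[] chunks = new ByteBuffer[count];
        for (int i = 0; i < count; i++) {
            int offset = i * maxSingle;
            int length = Math.min(maxSingle, data.length - offset);
            // slice() rebases the view so index 0 of the chunk is data[offset].
            chunks[i] = ByteBuffer.wrap(data, offset, length).slice();
        }
        return new ChunkedSource(chunks, maxSingle);
    }
}
```

Whether an interface call on every read is cheap enough for the small-database case would need to be measured; that is the performance concern raised above.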

nonetallt avatar Feb 01 '24 18:02 nonetallt