
MDB_MAP_FULL: Environment mapsize limit reached

Open IngwiePhoenix opened this issue 2 years ago • 5 comments

Hello!

Well, this had to happen eventually I guess. First, this is the error I see:

[Writer          ] ERR| Error writing 5 events: mdb_put: MDB_MAP_FULL: Environment mapsize limit reached

According to this issue ( https://github.com/zevv/duc/issues/163 ) this would have to do with mdb_env_set_mapsize(...), of which I couldn't find a direct call in the source code (granted, I never worked with the lmdb API).

This is the size of my DB:

root@birb:/srv/strfry/db# du -ah
4.1G    ./data.mdb
4.0K    ./lock.mdb
4.1G    .
root@birb:/srv/strfry/db# ls -l
total 4194320
drwxrwxr-x 2 strfry strfry       4096 Jul 28 23:19 .
drwxr-xr-x 3 strfry strfry       4096 Jul 29 08:34 ..
-rw-rw-r-- 1 strfry strfry 4294963200 Aug 20 19:19 data.mdb
-rw-rw-r-- 1 strfry strfry      16576 Aug 24 11:21 lock.mdb

Any idea what I can do here? This 4.1G reminds me a lot of the file size limit in FAT32. I am running on an arm64 machine and compiled strfry from source:

root@birb:/opt/strfry# ./strfry --version
strfry 0.9.3-4-gab03a57
root@birb:/opt/strfry# file strfry
strfry: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=ecae663aca9770ef9b7f60b058e01177f81eaea2, for GNU/Linux 3.7.0, with debug_info, not stripped
root@birb:/opt/strfry# uname -a
Linux birb 5.15.0-78-generic #85-Ubuntu SMP Fri Jul 7 15:29:30 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
root@birb:/opt/strfry# cat /proc/cpuinfo
processor       : 0
BogoMIPS        : 50.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x3
CPU part        : 0xd0c
CPU revision    : 1

processor       : 1
BogoMIPS        : 50.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x3
CPU part        : 0xd0c
CPU revision    : 1

processor       : 2
BogoMIPS        : 50.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x3
CPU part        : 0xd0c
CPU revision    : 1

processor       : 3
BogoMIPS        : 50.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x3
CPU part        : 0xd0c
CPU revision    : 1

Thanks and kind regards,

Ingwie

IngwiePhoenix avatar Aug 24 '23 11:08 IngwiePhoenix

Dev should change to libmdbx, instead of this one.

Is this database loaded into memory at some point or is it saved on disk only?

theAkito avatar Aug 25 '23 01:08 theAkito

It's on disk only - I can tell because otherwise my memory would be blowing up by now - or the kernel would be going after all kinds of processes or just kill strfry for OOM. Since I don't see any such behaviour, I believe this is fine.

However, this did get me thinking. Here's the systemd unit I use:

[Unit]
Description=strfry relay service

[Service]
ExecStart=/opt/strfry/strfry relay
WorkingDirectory=/srv/strfry
ReadWritePaths=/srv/strfry
User=strfry
Group=strfry
Restart=on-failure
RestartSec=5
ProtectHome=yes
NoNewPrivileges=yes
ProtectSystem=full
LimitCORE=1000000000

[Install]
WantedBy=multi-user.target

I forgot why it has LimitCORE in there. Maybe there is a limit I have to adjust for this to work?

These seem to be my defaults.

root@birb:~# ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 30385
max locked memory           (kbytes, -l) 990456
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 30385
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

EDIT: The limits as actually applied to the running process right now:

root@birb:~# cat /proc/471927/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        1000000000           1000000000           bytes
Max resident set          unlimited            unlimited            bytes
Max processes             30385                30385                processes
Max open files            1024                 524288               files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       30385                30385                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
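
For a large memory map, the limit that matters in the table above is "Max address space" (`ulimit -v`), which is unlimited here. "Max locked memory" only applies to memory pinned with mlock. A minimal Python sketch of the same check from inside a process, assuming a Unix system; the helper name `fits_in_address_space` and the example value are illustrative only and not part of strfry:

```python
import resource

def fits_in_address_space(mapsize):
    """Return True if a mapping of `mapsize` bytes fits under the soft RLIMIT_AS."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_AS)
    # RLIM_INFINITY means no address-space limit is set ("unlimited").
    return soft == resource.RLIM_INFINITY or soft >= mapsize

if __name__ == "__main__":
    example_mapsize = 4 * 1024 ** 3  # hypothetical value for illustration
    print(fits_in_address_space(example_mapsize))
```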

IngwiePhoenix avatar Aug 25 '23 01:08 IngwiePhoenix

I found the culprit. I feel a little dumb for not realizing it sooner.

Snippet:

dbParams {
    # Maximum number of threads/processes that can simultaneously have LMDB transactions open (restart required)
    maxreaders = 256

    # Size of mmap() to use when loading LMDB (default is 10TB, does *not* correspond to disk-space used) (restart required)
    mapsize = 4294967296
}

After reading more about the error, it turns out that it's common for the mapsize to be set manually - which is why it's an option in the config.
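
The numbers line up with the directory listing above. A quick arithmetic sketch in Python (the 4096-byte page size and the helper name `gib_to_bytes` are assumptions for illustration):

```python
GIB = 1024 ** 3

def gib_to_bytes(n):
    """Convert a size in GiB to bytes, the unit the mapsize option expects."""
    return n * GIB

configured_mapsize = 4294967296  # from the dbParams snippet
data_mdb_size = 4294963200       # from `ls -l` on data.mdb

# The configured mapsize is exactly 4 GiB.
assert configured_mapsize == gib_to_bytes(4)

# data.mdb stopped growing 4096 bytes short of the map size,
# which would be one page if the page size is 4096 bytes (assumption).
remaining = configured_mapsize - data_mdb_size
assert remaining == 4096

# A larger value for the config can be computed the same way.
six_gib = gib_to_bytes(6)
assert six_gib == 6442450944
```

So the file did not hit a filesystem limit. It filled the configured map.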

Well, I upped my value from 4GB to 6GB, restarted and...

Aug 25 03:25:59 birb strfry[638379]: 2023-08-25 03:25:59.832 (  13.316s) [Websocket       ]INFO| [27] Connect from 172.71.142.159 compression=N sliding=N
Aug 25 03:26:00 birb strfry[638379]: 2023-08-25 03:26:00.041 (  13.525s) [Writer          ]INFO| Inserted event. id=22d9bdc6738d927445543aba8e7b32bb13894468d6a8c89966beda3ef7aca2c5 levId=1862459
Aug 25 03:26:00 birb strfry[638379]: 2023-08-25 03:26:00.041 (  13.525s) [Writer          ]INFO| Inserted event. id=8e4811aef672ff42b7c423a14920446f6bcfb8c97cbc0abad4f8e893a7b1372d levId=1862460
Aug 25 03:26:00 birb strfry[638379]: 2023-08-25 03:26:00.041 (  13.525s) [Writer          ]INFO| Inserted event. id=a27a1a91630d864790248171b4428710e60821dcfe5cb85bcd894a3c3a34dc71 levId=1862461
Aug 25 03:26:00 birb strfry[638379]: 2023-08-25 03:26:00.041 (  13.525s) [Writer          ]INFO| Inserted event. id=a7c77335c6f06e2f6692bc0db33343a06f708b9553c455af8f0133da31fe33fa levId=1862462
Aug 25 03:26:00 birb strfry[638379]: 2023-08-25 03:26:00.041 (  13.525s) [Writer          ]INFO| Inserted event. id=88a5034de4fa6e3c68dd9ab6c8e56dbf72ff33578b86a0c5e80bf7d14b86a165 levId=1862463
Aug 25 03:26:00 birb strfry[638379]: 2023-08-25 03:26:00.215 (  13.699s) [Websocket       ]INFO| [27] Disconnect from 172.71.142.159 (0/-) UP: 405b (0.0% compressed) DN: 2.93K (0.0% compressed)
Aug 25 03:26:16 birb strfry[638379]: 2023-08-25 03:26:16.246 (  29.729s) [Websocket       ]INFO| ...

I will leave this issue open, but the fact of the matter is that I need to find a way to slim this down. 4GB of basically just text is an awful lot o.o

IngwiePhoenix avatar Aug 25 '23 03:08 IngwiePhoenix

What will you do, when it hits 6GB even quicker? 😉

theAkito avatar Aug 25 '23 11:08 theAkito

Hi! The default value for this is 10 TB. Note that this does not correspond to the size used on disk, at least not on Linux, so on a 64-bit system there is no reason not to set it to a very high value. It is only the size of the virtual memory mapping that the strfry process reserves.
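
The difference between the size of a mapping and the space actually used can be illustrated with a small Python sketch. This is only an analogy using a sparse temporary file, not how LMDB or strfry manage their data file, and it assumes a Unix filesystem with sparse-file support; the function name `map_sparse_file` is made up for the example:

```python
import mmap
import os
import tempfile

def map_sparse_file(map_size):
    """Map a mostly-empty file; return (logical_size, bytes_allocated_on_disk)."""
    with tempfile.NamedTemporaryFile() as f:
        f.truncate(map_size)  # set the logical size without writing any data
        with mmap.mmap(f.fileno(), map_size) as m:
            m[:5] = b"hello"  # touch only the first page of the mapping
            m.flush()
        st = os.fstat(f.fileno())
        # st_blocks counts 512-byte units actually allocated on disk.
        return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    logical, allocated = map_sparse_file(64 * 1024 * 1024)
    print("mapped:", logical, "bytes; allocated on disk:", allocated, "bytes")
```

On a filesystem with sparse-file support, the allocated figure should stay far below the mapped size, because only the touched page needs backing storage.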

hoytech avatar Aug 28 '23 08:08 hoytech