v128.0.6613.40-1 memory usage is too large

Open zhangbo8418 opened this issue 1 year ago • 13 comments

v128.0.6613.40-1 memory usage is too large

zhangbo8418 avatar Aug 22 '24 08:08 zhangbo8418

More detailed information is required.

silver716 avatar Aug 30 '24 06:08 silver716

More detailed information is required.

I installed a transparent proxy on the router. After a few days, the memory exploded. It felt like there was a memory leak. . . .

zhangbo8418 avatar Aug 30 '24 13:08 zhangbo8418

More detailed information is required.

It may be caused by this optimization https://github.com/klzgrad/naiveproxy/commit/d6391623e54c65d415b31b03b26712e959a115f5. I am running naiveproxy on a router with very little memory, and when multiple devices pass through the router, the memory usage of naiveproxy increases dramatically. It would be great if this could be customized through the configuration file.

zhangbo8418 avatar Sep 04 '24 00:09 zhangbo8418

PS you should use curl later than 8.1.0.

curl 7.52.1 (mipsel-unknown-linux-gnu) libcurl/7.52.1 OpenSSL/1.0.2u zlib/1.2.8 libidn2/2.0.5 libpsl/0.17.0 (+libidn2/0.16) libssh2/1.8.0 nghttp2/1.18.1 librtmp/2.3 Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL

zhangbo8418 avatar Sep 04 '24 07:09 zhangbo8418

openwrt packages: https://downloads.openwrt.org/releases/23.05.4/packages

such as arm_cortex-a9:

https://downloads.openwrt.org/releases/23.05.4/packages/arm_cortex-a9/packages/curl_8.7.1-r1_arm_cortex-a9.ipk https://downloads.openwrt.org/releases/23.05.4/packages/arm_cortex-a9/packages/libcurl4_8.7.1-r1_arm_cortex-a9.ipk https://downloads.openwrt.org/releases/23.05.4/packages/arm_cortex-a9/packages/libcurl-gnutls4_8.7.1-1_arm_cortex-a9.ipk

But my router is ubnt er-x, the CPU is mipsel, and the naive version I use is a statically compiled version. 20240907213808

zhangbo8418 avatar Sep 07 '24 13:09 zhangbo8418

@zhangbo8418 You should either be able to compile naiveproxy without the patch you specified, or compile the latest curl targeting your device (a static build is recommended). If you are using OpenWrt, the latest curl packages are available.

The memory is never freed; it just keeps increasing. 20240907214947

zhangbo8418 avatar Sep 07 '24 13:09 zhangbo8418

https://github.com/klzgrad/naiveproxy/releases/tag/v128.0.6613.40-2 I added --http2-recv-window so you can use --http2-recv-window=6291456 to revert to previous window sizes. But I doubt memory use increase is caused by recv window size.

You should post a comparison of RES usage for v127, v128-1, v128-2 with --http2-recv-window=6291456.
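
One way to gather the RES comparison requested here is to sample the process's resident set size at a fixed interval. This is a minimal sketch, assuming a Linux /proc filesystem and an awk on the router (BusyBox normally ships one). The function names `rss_kb` and `sample_rss` and the log path are made up for illustration.

```shell
# Print the VmRSS value (in kB) from a /proc/<pid>/status-style file.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Print one "<unix time> <RSS in kB>" sample for the given PID.
sample_rss() {
    printf '%s %s\n' "$(date +%s)" "$(rss_kb "/proc/$1/status")"
}

# Example (not run here): log one sample per minute for a running naive.
#   while sleep 60; do sample_rss "$(pidof naive)"; done >> /tmp/naive-rss.log
```

Running the same loop against each version, with and without --http2-recv-window=6291456, would give logs that can be compared directly.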

klzgrad avatar Sep 08 '24 03:09 klzgrad

v127, v128-1, v128-2

Thanks, I'll try it right away. In fact, the memory usage of v127, v128-1, and v128-2 will all eventually explode, because memory is not released once clients stop passing traffic through naive, so it only continues to increase. The growth just became much faster after v127-3: here it exploded within 24 hours and crashed the router. The growth of v127-2 was slower, taking about 4-5 days (of course, it also depends on the number of clients passing through naive).

zhangbo8418 avatar Sep 08 '24 03:09 zhangbo8418

This problem is common for direct terminal devices, but since the router is powered on 7*24, it becomes a more serious problem if the memory is not released.

zhangbo8418 avatar Sep 08 '24 04:09 zhangbo8418

https://github.com/klzgrad/naiveproxy/releases/tag/v128.0.6613.40-2 I added --http2-recv-window so you can use --http2-recv-window=6291456 to revert to previous window sizes. But I doubt memory use increase is caused by recv window size.

You should post a comparison of RES usage for v127, v128-1, v128-2 with --http2-recv-window=6291456.

v128-2 behaves just like this: the memory increases slowly, but it is still never released.

https://github.com/user-attachments/assets/abf5452a-afd9-45ad-b098-0d40a82d8b8e

zhangbo8418 avatar Sep 09 '24 04:09 zhangbo8418

Also, Orbit (https://github.com/google/orbit) is one of the memory profilers I use to profile native C++ programs. Here is the documentation: https://github.com/google/orbit/blob/main/documentation/DOCUMENTATION.md. You can start it and connect to it remotely to see how your program allocates memory.

It would also save you a lot of trouble if you could reproduce this on a local Linux system with glibc rather than on your ubnt box.

Actually, this issue has minimal impact on direct terminal devices, but since routers operate 24/7, it becomes a severe problem if the memory cannot be released.

zhangbo8418 avatar Sep 16 '24 14:09 zhangbo8418

I found the reason and the solution. I was using the openwrt-mipsel_24kc-static version, and I needed to modify the GitHub workflow file build.yml. Find the following section:

  • arch: mipsel_24kc-static openwrt: "target=ramips subtarget=rt305x" target_cpu: mipsel extra: 'mips_arch_variant="r2" mips_float_abi="soft" build_static=true no_madvise_syscall=true'

and change no_madvise_syscall=true to no_madvise_syscall=false, then build it. I guess the author may have set no_madvise_syscall=true to maintain maximum compatibility, which results in RAM not being automatically adjusted, causing memory usage to continuously increase without being released. If your kernel version is later than Linux 2.2, in most cases it supports the madvise syscall. In this case, it's recommended to set no_madvise_syscall=false, which allows the system to automatically adjust naive's memory usage. If your system does not support the madvise_syscall function, use the following command: while true; do /root/naive --listen=redir://0.0.0.0:1080 --proxy=https://xxxxxx:[email protected]; done, which will automatically restart naive after it exits due to an error.
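
The edit described in this comment can be scripted. This is a minimal sketch, assuming the workflow file sits at .github/workflows/build.yml in your fork; that path and the function name `enable_madvise` are assumptions, not taken from this thread. Note that it flips the flag on every line that contains it, not only on the mipsel_24kc-static entry.

```shell
# Write a copy of a workflow file to stdout with no_madvise_syscall
# switched from true to false wherever it appears.
enable_madvise() {
    sed 's/no_madvise_syscall=true/no_madvise_syscall=false/g' "$1"
}

# Example (not run here; the path is an assumption):
#   enable_madvise .github/workflows/build.yml > build.yml.new
#   mv build.yml.new .github/workflows/build.yml
```

Writing to stdout instead of editing in place avoids relying on `sed -i`, whose behaviour differs between sed implementations.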

hisuwj avatar Oct 17 '24 10:10 hisuwj

I found the reason and the solution. I was using the openwrt-mipsel_24kc-static version, and I needed to modify the GitHub workflow file build.yml. Find the following section:

  • arch: mipsel_24kc-static openwrt: "target=ramips subtarget=rt305x" target_cpu: mipsel extra: 'mips_arch_variant="r2" mips_float_abi="soft" build_static=true no_madvise_syscall=true'

and change no_madvise_syscall=true to no_madvise_syscall=false, then build it. I guess the author may have set no_madvise_syscall=true to maintain maximum compatibility, which results in RAM not being automatically adjusted, causing memory usage to continuously increase without being released. If your kernel version is later than Linux 2.2, in most cases it supports the madvise syscall. In this case, it's recommended to set no_madvise_syscall=false, which allows the system to automatically adjust naive's memory usage. If your system does not support the madvise_syscall function, use the following command: while true; do /root/naive --listen=redir://0.0.0.0:1080 --proxy=https://xxxxxx:[email protected]; done, which will automatically restart naive after it exits due to an error.

Although my EdgeRouter kernel is 4.14.54, it cannot be used after I compiled it with no_madvise_syscall=true, alas. . . No solution? 20241018120204

zhangbo8418 avatar Oct 18 '24 04:10 zhangbo8418

I found the reason and the solution. I was using the openwrt-mipsel_24kc-static version, and I needed to modify the GitHub workflow file build.yml. Find the following section:

  • arch: mipsel_24kc-static openwrt: "target=ramips subtarget=rt305x" target_cpu: mipsel extra: 'mips_arch_variant="r2" mips_float_abi="soft" build_static=true no_madvise_syscall=true'

and change no_madvise_syscall=true to no_madvise_syscall=false, then build it. I guess the author may have set no_madvise_syscall=true to maintain maximum compatibility, which results in RAM not being automatically adjusted, causing memory usage to continuously increase without being released. If your kernel version is later than Linux 2.2, in most cases it supports the madvise syscall. In this case, it's recommended to set no_madvise_syscall=false, which allows the system to automatically adjust naive's memory usage. If your system does not support the madvise_syscall function, use the following command: while true; do /root/naive --listen=redir://0.0.0.0:1080 --proxy=https://xxxxxx:[email protected]; done, which will automatically restart naive after it exits due to an error.

Although my EdgeRouter kernel is 4.14.54, it cannot be used after I compiled it with no_madvise_syscall=true, alas. . . No solution? 20241018120204

The CPU I am using is the same as yours; it is the MT7621. The workflow file build.yml on GitHub should be modified to:

  • arch: mipsel_24kc-static openwrt: "target=ramips subtarget=rt305x" target_cpu: mipsel extra: 'mips_arch_variant="r2" mips_float_abi="soft" build_static=true no_madvise_syscall=false' 222222222

hisuwj avatar Oct 18 '24 07:10 hisuwj

no_madvise_syscall=false

The compiled binary cannot be executed after changing it to no_madvise_syscall=false. I have tried it.

zhangbo8418 avatar Oct 18 '24 09:10 zhangbo8418

If you build with no_madvise_syscall=false, it will call madvise. And if your kernel does not have madvise, it will crash.

What is this EdgeRouter kernel 4.14.54? I have no idea whether it has madvise.

You need to provide details on why your kernel does not support madvise; otherwise there is no way to find the right solution.

klzgrad avatar Oct 18 '24 10:10 klzgrad

no_madvise_syscall=false

The compiled binary cannot be executed after changing it to no_madvise_syscall=false. I have tried it.

That means your system kernel does not support the madvise_syscall, so you can only use the original static version released by the author (no_madvise_syscall=true).

As a temporary solution, you can manually limit the maximum memory usage of naive using the ulimit command to prevent it from consuming too much memory and affecting other programs. You can also use while true; do; done to make naive automatically restart after crashing when it reaches the memory limit.

The specific method is as follows. This is my configuration, and you can adjust it based on your machine's actual memory.

vi naive.sh

ulimit -v 120000 # Adjust according to your system's actual memory
ulimit -m 90000 # Adjust according to your system's actual memory
while true; do
  /root/np/naive /root/np/config.json
done

chmod 755 ./naive.sh
./naive.sh &
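
A bare `while true` loop restarts naive immediately, which can spin hot if the binary fails right at startup. Here is a minimal sketch of the same idea with a pause between restarts. The function name `supervise` and its arguments are hypothetical; the ulimit value and paths in the usage comment are the ones from this comment.

```shell
# Run a command repeatedly with a pause between restarts.
#   $1 = maximum number of runs (0 = no limit)
#   $2 = seconds to wait before restarting
#   remaining arguments = the command to run
# Prints the number of runs when it stops. Uses plain (global) variables
# to stay within POSIX sh.
supervise() {
    max_runs=$1
    delay=$2
    shift 2
    runs=0
    while :; do
        "$@"
        runs=$((runs + 1))
        if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
            break
        fi
        sleep "$delay"
    done
    echo "$runs"
}

# Example (not run here): restart forever, 5 s apart, with the virtual
# memory limit applied only to the naive process.
#   supervise 0 5 sh -c 'ulimit -v 120000; exec /root/np/naive /root/np/config.json'
```

Applying the limit inside `sh -c` keeps it scoped to the child process, so the supervising shell itself is not restricted.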

hisuwj avatar Oct 18 '24 10:10 hisuwj

no_madvise_syscall=false

The compiled binary cannot be executed after changing it to no_madvise_syscall=false. I have tried it.

That means your system kernel does not support the madvise_syscall, so you can only use the original static version released by the author (no_madvise_syscall=true).

As a temporary solution, you can manually limit the maximum memory usage of naive using the ulimit command to prevent it from consuming too much memory and affecting other programs. You can also use while true; do; done to make naive automatically restart after crashing when it reaches the memory limit.

The specific method is as follows. This is my configuration, and you can adjust it based on your machine's actual memory.

vi naive.sh

ulimit -v 120000 # Adjust according to your system's actual memory
ulimit -m 90000 # Adjust according to your system's actual memory
while true; do
  /root/np/naive /root/np/config.json
done

chmod 755 ./naive.sh
./naive.sh &

ulimit limits virtual memory and has no effect on physical memory. This is obviously a memory leak bug: the memory is not released when a client disconnects.

zhangbo8418 avatar Oct 18 '24 11:10 zhangbo8418

If you build with no_madvise_syscall=false, it will call madvise. And if your kernel does not have madvise, it will crash.

What is this EdgeRouter kernel 4.14.54? I have no idea whether it has madvise.

You need to provide details on why your kernel does not support madvise; otherwise there is no way to find the right solution.

core-naive-29778-5.zip

zhangbo8418 avatar Oct 18 '24 16:10 zhangbo8418

config ADVISE_SYSCALLS
        bool "Enable madvise/fadvise syscalls" if EXPERT
        default y
        help
          This option enables the madvise and fadvise syscalls, used by
          applications to advise the kernel about their future memory or file
          usage, improving performance. If building an embedded system where no
          applications use these syscalls, you can disable this option to save
          space.

It's possible the kernel builder saw this description and thought disabling this can save space.

OpenWrt only enabled this after https://github.com/openwrt/openwrt/commit/56342ee2bcbf9bf8918a01045471c7bb7faa1596 4.9 in 2017. https://patchwork.ozlabs.org/project/lede/patch/[email protected]/

It seems PartitionAlloc requires madvise to manage memory, and without madvise it leaks memory. This means PartitionAlloc is fundamentally incompatible with kernels built with CONFIG_ADVISE_SYSCALLS disabled.

You can check if your kernel is built with CONFIG_ADVISE_SYSCALLS in /proc/config.gz or /boot/config-*.
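
The check can be wrapped in a small helper. This is a minimal sketch, assuming the config file uses the usual Kconfig output format (`CONFIG_X=y` when enabled, `# CONFIG_X is not set` when disabled) and that zcat is available for the gzipped copy; the function name `advise_syscalls_state` is hypothetical.

```shell
# Report whether a kernel config file enables CONFIG_ADVISE_SYSCALLS.
# Prints "enabled", "disabled", or "unknown" (file unreadable or symbol absent).
advise_syscalls_state() {
    cfg=$1
    [ -r "$cfg" ] || { echo unknown; return 0; }
    case "$cfg" in
        *.gz) content=$(zcat "$cfg") ;;
        *)    content=$(cat "$cfg") ;;
    esac
    if printf '%s\n' "$content" | grep -q '^CONFIG_ADVISE_SYSCALLS=y$'; then
        echo enabled
    elif printf '%s\n' "$content" | grep -q '^# CONFIG_ADVISE_SYSCALLS is not set$'; then
        echo disabled
    else
        echo unknown
    fi
}

# Example (not run here):
#   for f in /proc/config.gz "/boot/config-$(uname -r)"; do
#       echo "$f: $(advise_syscalls_state "$f")"
#   done
```

An "unknown" result only means the file was missing or did not mention the option; it says nothing about whether madvise works.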

klzgrad avatar Oct 19 '24 00:10 klzgrad

/proc/config.gz or /boot/config-*.

There is no /proc/config.gz or /boot/config-*; it is probably not packaged.

zhangbo8418 avatar Oct 19 '24 01:10 zhangbo8418

| Build alternative | Cons |
| --- | --- |
| use_partition_alloc=false use_allocator_shim=false | Avoidable memory fragmentation, loss of performance and memory safety on kernels with CONFIG_ADVISE_SYSCALLS |
| use_partition_alloc=true no_madvise_syscall=false | Crash on kernels without CONFIG_ADVISE_SYSCALLS |
| use_partition_alloc=true no_madvise_syscall=true | Memory leaks on kernels without CONFIG_ADVISE_SYSCALLS |
| Multiple build variants | Too many builds, confusing |

Now I choose use_partition_alloc=false use_allocator_shim=false to make the static builds compatible with kernels without CONFIG_ADVISE_SYSCALLS, at the cost of its memory inefficiency on kernels with CONFIG_ADVISE_SYSCALLS, without introducing too many build variants.
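
For someone building their own binary rather than using the released static builds, the trade-offs listed in this comment reduce to a simple lookup. This is only a restatement as a sketch: the function name `static_build_flags` and its `with`/`without` argument are hypothetical, and the flag strings are copied from this comment.

```shell
# Print the build flags suggested by the trade-offs above.
#   $1 = "with"    kernel built with CONFIG_ADVISE_SYSCALLS
#        "without" kernel built without it (also used for anything else,
#                  since this choice neither crashes nor leaks on either kernel)
static_build_flags() {
    case "$1" in
        with) echo "use_partition_alloc=true no_madvise_syscall=false" ;;
        *)    echo "use_partition_alloc=false use_allocator_shim=false" ;;
    esac
}

# Example (not run here):
#   static_build_flags without
```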

See if https://github.com/klzgrad/naiveproxy/releases/tag/v130.0.6723.40-3 fixes memory leaks reported here.

klzgrad avatar Oct 19 '24 04:10 klzgrad

| Build alternative | Cons |
| --- | --- |
| use_partition_alloc=false use_allocator_shim=false | Avoidable memory fragmentation, loss of performance and memory safety on kernels with CONFIG_ADVISE_SYSCALLS |
| use_partition_alloc=true no_madvise_syscall=false | Crash on kernels without CONFIG_ADVISE_SYSCALLS |
| use_partition_alloc=true no_madvise_syscall=true | Memory leaks on kernels without CONFIG_ADVISE_SYSCALLS |
| Multiple build variants | Too many builds, confusing |

Now I choose use_partition_alloc=false use_allocator_shim=false to make the static builds compatible with kernels without CONFIG_ADVISE_SYSCALLS, at the cost of its memory inefficiency on kernels with CONFIG_ADVISE_SYSCALLS, without introducing too many build variants.

See if https://github.com/klzgrad/naiveproxy/releases/tag/v130.0.6723.40-3 fixes memory leaks reported here.

That's ok, at least the memory will be freed.

zhangbo8418 avatar Oct 19 '24 04:10 zhangbo8418

That is not ok, because people with normal kernels get worse performance from the static builds.

klzgrad avatar Oct 19 '24 08:10 klzgrad

That is not ok, because people with normal kernels get worse performance from the static builds.

My system kernel supports madvise. If I remove use_partition_alloc=false and use_allocator_shim=false from the GitHub workflow file, it shouldn't lead to memory inefficiency, right? madvise should still be callable, correct?

hisuwj avatar Oct 19 '24 08:10 hisuwj

If I remove

The point of this discussion is that the user should be able to download easily usable binaries without having to build from source.

The static builds exist because some users have legacy non-OpenWrt embedded distros that do not work with the musl dependency requirement.

What is your distro, and why do you need a static build? @hisuwj

klzgrad avatar Oct 19 '24 08:10 klzgrad

If I remove

The point of this discussion is that the user should be able to download easily usable binaries without having to build from source.

The static builds exist because some users have legacy non-OpenWrt embedded distros that do not work with the musl dependency requirement.

What is your distro, and why do you need a static build? @hisuwj

My system version is openwrt-21.02.7. Support for musl time64 only started with openwrt-22.03.0. Therefore, openwrt-21.02.7 and earlier versions do not support musl time64, so I can only choose the static version.

hisuwj avatar Oct 19 '24 08:10 hisuwj

If I remove

The point of this discussion is that the user should be able to download easily usable binaries without having to build from source. The static builds exist because some users have legacy non-OpenWrt embedded distros that do not work with the musl dependency requirement. What is your distro, and why do you need a static build? @hisuwj

My system version is openwrt-21.02.7. Support for musl time64 only started with openwrt-22.03.0. Therefore, openwrt-21.02.7 and earlier versions do not support musl time64, so I can only choose the static version.

The musl version in openwrt-21.02.7 is 1.1.24, and support for time64 began only after musl 1.2.0.
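
The version cut-off mentioned here can be checked mechanically. This is a minimal sketch, assuming a version string of the form MAJOR.MINOR.PATCH; the function name `musl_has_time64` is hypothetical, and how the version is obtained on a given system is left out.

```shell
# Succeed if the given musl version is 1.2.0 or later, i.e. new enough
# for time64 according to the cut-off stated above.
# Expects a MAJOR.MINOR.PATCH string such as 1.1.24.
musl_has_time64() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text between the first and second dot
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 2 ]; }
}

# Example (not run here):
#   if musl_has_time64 1.1.24; then echo "time64"; else echo "no time64"; fi
```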

hisuwj avatar Oct 19 '24 09:10 hisuwj

That is not ok, because people with normal kernels get worse performance from the static builds.

The memory leak issue returns in v130.0.6723.40-4.

https://github.com/user-attachments/assets/ab6a0b44-1d26-4b30-8bfa-01f3ae532ef7

zhangbo8418 avatar Oct 20 '24 01:10 zhangbo8418

OK, one of you has reported inaccurate information.

returns in v130.0.6723.40-4

What does "returns" mean? Did the leak not reproduce in a previous version?

v130.0.6723.40-4 does not use madvise and still leaks. If true, this claim

set no_madvise_syscall=true to maintain maximum compatibility, which results in RAM not being automatically adjusted, causing memory usage to continuously increase without being released

is false.

klzgrad avatar Oct 20 '24 01:10 klzgrad