Memory leak

Open davidmanzanares opened this issue 7 months ago • 0 comments

Hi,

I think I've found a memory leak.

This example reproduces it:

import requests
import nh3
html = requests.get("https://search.brave.com/").text

for _ in range(30_000):
    nh3.clean(html)

If you run that alongside a tool like htop, you should see the memory of the process grow continuously, with no apparent bound.
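For a reading that doesn't depend on watching htop, the growth can also be sampled from inside the process. The following is a minimal sketch using only the standard library `resource` module (Unix only). The helper names `peak_rss_mib` and `track_peak_rss` are made up for illustration and are not part of nh3.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kibibytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(func, iterations, report_every):
    """Call func() repeatedly and collect (iteration, peak MiB) samples."""
    samples = []
    for i in range(1, iterations + 1):
        func()
        if i % report_every == 0:
            samples.append((i, peak_rss_mib()))
    return samples
```

With the reproduction above, something like `track_peak_rss(lambda: nh3.clean(html), 30_000, 1_000)` should return samples whose second value keeps climbing if the leak is present. Because this tracks the peak rather than the current RSS, it can only show growth, never a release of memory.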

I've tried to find the root cause, but I'm not really sure of my findings, and they seem pretty weird.

Bisecting nh3 with the above example gave me this:

# b5074b186b813313b258a7c97871bb2d9fc0eaa7 is the first bad commit
# commit b5074b186b813313b258a7c97871bb2d9fc0eaa7
# Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Date:   Mon Apr 22 16:12:37 2024 +0800

#     Bump pyo3 from 0.21.1 to 0.21.2 (#43)
    
#     Bumps [pyo3](https://github.com/pyo3/pyo3) from 0.21.1 to 0.21.2.
#     - [Release notes](https://github.com/pyo3/pyo3/releases)
#     - [Changelog](https://github.com/PyO3/pyo3/blob/main/CHANGELOG.md)
#     - [Commits](https://github.com/pyo3/pyo3/compare/v0.21.1...v0.21.2)
    
#     ---
#     updated-dependencies:
#     - dependency-name: pyo3
#       dependency-type: direct:production
#       update-type: version-update:semver-patch
#     ...
    
#     Signed-off-by: dependabot[bot] <support@github.com>
#     Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

#  Cargo.lock | 20 ++++++++++----------
#  Cargo.toml |  2 +-

Profiling with memray also pointed to pyo3:

python3 -m memray run --native -f -o output2.bin nh.py
python3 -m memray flamegraph -f output2.bin

[image: memray flamegraph]

Thank you!

davidmanzanares • Jul 15 '24 07:07