scikit-learn-intelex
Memory leak DBSCAN
(I am using a similar bug report, https://github.com/intel/scikit-learn-intelex/issues/843, as a reference/template; props to DavidCohen2.)
Describe the bug
I found a memory leak in the intelex implementation of sklearn.cluster.DBSCAN. The regular implementation in scikit-learn does not show this behavior.
To Reproduce
```python
use_intelex = True
if use_intelex:
    from sklearnex import patch_sklearn
    patch_sklearn()

import os, psutil
process = psutil.Process(os.getpid())

from sklearn.cluster import DBSCAN
import numpy as np

X = np.array([[1.4191e+01, 1.4206e+01, 1.4754e+01, 1.5188e+01, 1.3509e+01, 1.4500e+01,
               1.5691e+01, 1.5997e+01, 1.6000e+01, 1.6009e+01, 1.6003e+01, 1.6003e+01,
               1.6203e+01, 1.5997e+01, 1.5709e+01, 1.5694e+01, 1.5200e+01, 1.4688e+01,
               1.5491e+01, 1.6200e+01, 1.6206e+01, 1.4494e+01, 1.4494e+01, 1.4700e+01,
               1.4254e+01, 1.4191e+01, 1.4009e+01, 1.3554e+01, 1.2000e+01, 1.2554e+01,
               1.1509e+01, 1.1003e+01, 1.1488e+01, 1.1254e+01, 1.0709e+01, 1.0994e+01,
               9.9970e+00, 8.6940e+00, 6.9910e+00, 6.5090e+00, 5.9940e+00, 3.6940e+00,
               2.4880e+00, 0.0000e+00, 9.0000e-03],
              [1.4206e+01, 1.4754e+01, 1.5188e+01, 1.3509e+01, 1.4500e+01, 1.5691e+01,
               1.5997e+01, 1.6000e+01, 1.6009e+01, 1.6003e+01, 1.6003e+01, 1.6203e+01,
               1.5997e+01, 1.5709e+01, 1.5694e+01, 1.5200e+01, 1.4688e+01, 1.5491e+01,
               1.6200e+01, 1.6206e+01, 1.4494e+01, 1.4494e+01, 1.4700e+01, 1.4254e+01,
               1.4191e+01, 1.4009e+01, 1.3554e+01, 1.2000e+01, 1.2554e+01, 1.1509e+01,
               1.1003e+01, 1.1488e+01, 1.1254e+01, 1.0709e+01, 1.0994e+01, 9.9970e+00,
               8.6940e+00, 6.9910e+00, 6.5090e+00, 5.9940e+00, 3.6940e+00, 2.4880e+00,
               0.0000e+00, 9.0000e-03, 1.7090e+00]])

for i in range(1000000):
    results = DBSCAN().fit(X, y=None)
    if (i % 100000) == 0:
        print("Iteration Number {}: {:.2f} GB memory used".format(
            i, psutil.virtual_memory()[3] / 1024 ** 3))
```
Expected behavior
Memory usage should stay constant over iterations.
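Note that the reproducer creates a psutil.Process handle but reports the system-wide "used" figure from psutil.virtual_memory(). To isolate the growth from other activity on the machine, the measurement could instead track the resident set size of this process; a minimal sketch (the random X is a stand-in for the array above, same loop structure):

```python
# Sketch: track this process's RSS instead of system-wide used memory.
import os
import numpy as np
import psutil
from sklearn.cluster import DBSCAN  # patched by sklearnex when enabled

X = np.random.rand(2, 45)  # stand-in for the 2x45 array in the report
process = psutil.Process(os.getpid())

for i in range(1000000):
    DBSCAN().fit(X)
    if i % 100000 == 0:
        rss_gb = process.memory_info().rss / 1024 ** 3
        print(f"Iteration Number {i}: {rss_gb:.2f} GB RSS")
```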
Output/Screenshots
With use_intelex = True:

```
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
Iteration Number 0: 4.53 GB memory used
Iteration Number 100000: 4.60 GB memory used
Iteration Number 200000: 4.63 GB memory used
Iteration Number 300000: 4.72 GB memory used
Iteration Number 400000: 4.78 GB memory used
Iteration Number 500000: 4.94 GB memory used
Iteration Number 600000: 5.16 GB memory used
Iteration Number 700000: 5.20 GB memory used
Iteration Number 800000: 5.32 GB memory used
Iteration Number 900000: 5.56 GB memory used
```
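Since tracemalloc only accounts for allocations made through Python's memory allocator, a flat tracemalloc total next to a growing RSS would point at native allocations (for example in the oneDAL backend) rather than leaked Python objects. A minimal sketch of that comparison, again with stand-in data:

```python
# Sketch: compare Python-tracked allocations (tracemalloc) with process RSS.
# Growing RSS with a flat tracemalloc total suggests a leak in native code.
import os
import tracemalloc
import numpy as np
import psutil
from sklearn.cluster import DBSCAN

X = np.random.rand(2, 45)  # stand-in for the array from the report
process = psutil.Process(os.getpid())
tracemalloc.start()

for i in range(1000000):
    DBSCAN().fit(X)
    if i % 100000 == 0:
        traced, _peak = tracemalloc.get_traced_memory()
        rss_gb = process.memory_info().rss / 1024 ** 3
        print(f"Iteration Number {i}: tracemalloc {traced / 1e6:.1f} MB, "
              f"RSS {rss_gb:.2f} GB")
```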
Environment:
- OS:
```
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy
```
- Compiler:
```
$ python3 --version
Python 3.10.12
```
- Version:
```
$ pip3 install intelex -U
Requirement already satisfied: intelex in /home/user/intelex/lib/python3.10/site-packages (0.0.30)
```
- CPU:
```
$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU E3-1535M v5 @ 2.90GHz
    CPU family:          6
    Model:               94
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            3
    CPU max MHz:         3800.0000
    CPU min MHz:         800.0000
    BogoMIPS:            5799.77
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp vnmi md_clear flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
Vulnerabilities:
  Gather data sampling:  Vulnerable: No microcode
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Retbleed:              Mitigation; IBRS
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Mitigation; Microcode
  Tsx async abort:       Mitigation; TSX disabled
```
@jpf18 thank you for the report! Could you please clarify the daal4py/scikit-learn-intelex version in your env?
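For reference, one way to print the exact versions from inside the environment, assuming the packages were installed with pip (the `pip3 install intelex` output above appears to refer to a distribution named `intelex`, not to scikit-learn-intelex itself):

```python
# Sketch: report installed versions via pip metadata.
from importlib.metadata import PackageNotFoundError, version

for dist in ("scikit-learn-intelex", "daal4py", "scikit-learn"):
    try:
        print(dist, version(dist))
    except PackageNotFoundError:
        print(dist, "not installed")
```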
```
$ pip3 show daal4py
Name: daal4py
Version: 2024.2.0
Summary: daal4py is a Convenient Python API to the Intel® oneAPI Data Analytics Library (oneDAL)
Home-page: https://github.com/IntelPython/daal4py
Author: Intel Corporation
Author-email: [email protected]
License: Apache v2.0
Location: /home/user/intelex/lib/python3.10/site-packages
Requires: daal, numpy
Required-by: scikit-learn-intelex
```
After updating the package to the latest:
```
$ pip3 show daal4py
Name: daal4py
Version: 2024.3.0
Summary: daal4py is a Convenient Python API to the Intel® oneAPI Data Analytics Library (oneDAL)
Home-page: https://github.com/IntelPython/daal4py
Author: Intel Corporation
Author-email: [email protected]
License: Apache v2.0
Location: /home/user/intelex/lib/python3.10/site-packages
Requires: daal, numpy
Required-by: scikit-learn-intelex
```
I get the following when running the above script:
```
$ python3 dbscanMemleak.py
Iteration Number 0: 3.23 GB memory used
Iteration Number 100000: 3.23 GB memory used
Iteration Number 200000: 3.23 GB memory used
Iteration Number 300000: 3.25 GB memory used
Iteration Number 400000: 3.39 GB memory used
Iteration Number 500000: 3.39 GB memory used
Iteration Number 600000: 3.39 GB memory used
Iteration Number 700000: 3.39 GB memory used
Iteration Number 800000: 3.41 GB memory used
Iteration Number 900000: 3.42 GB memory used
```
Memory use still grows over iterations, but at a significantly lower rate, which is actually sufficient for my workload.
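The remaining slow growth is not necessarily a leak; on Linux, glibc's allocator can hold on to freed heap pages inside the process. One way to rule that out before each measurement (a sketch, assuming Linux/glibc, not part of the original script) is to force a garbage collection and a malloc_trim:

```python
# Sketch (Linux/glibc only): settle memory before measuring, to separate
# allocator caching from a genuine leak. gc.collect() frees unreachable
# Python objects; malloc_trim(0) asks glibc to return freed heap pages to
# the OS. If RSS still creeps up afterwards, a real leak is more likely.
import ctypes
import gc

libc = ctypes.CDLL("libc.so.6")

def settle_memory():
    gc.collect()
    libc.malloc_trim(0)

# Call settle_memory() right before each memory measurement in the loop.
```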
Thank you for pointing that out. The memory leak should be addressed by the PR https://github.com/oneapi-src/oneDAL/pull/2811. Please let us know if you face any other issues; we will close this issue once the PR is merged.