Memory leak DBSCAN

Open jpf18 opened this issue 10 months ago • 2 comments

(I am using a similar bug: https://github.com/intel/scikit-learn-intelex/issues/843 as a reference/template, props to DavidCohen2)

Describe the bug

I found a memory leak in the intelex implementation of sklearn.cluster.DBSCAN. The regular implementation in scikit-learn does not show this behavior.

To Reproduce

use_intelex = True

if use_intelex:
    from sklearnex import patch_sklearn
    patch_sklearn()

import os, psutil
process = psutil.Process(os.getpid())

from sklearn.cluster import DBSCAN
import numpy as np

X = np.array([[1.4191e+01, 1.4206e+01, 1.4754e+01, 1.5188e+01, 1.3509e+01, 1.4500e+01 \
, 1.5691e+01, 1.5997e+01, 1.6000e+01, 1.6009e+01, 1.6003e+01, 1.6003e+01 \
, 1.6203e+01, 1.5997e+01, 1.5709e+01, 1.5694e+01, 1.5200e+01, 1.4688e+01 \
, 1.5491e+01, 1.6200e+01, 1.6206e+01, 1.4494e+01, 1.4494e+01, 1.4700e+01 \
, 1.4254e+01, 1.4191e+01, 1.4009e+01, 1.3554e+01, 1.2000e+01, 1.2554e+01 \
, 1.1509e+01, 1.1003e+01, 1.1488e+01, 1.1254e+01, 1.0709e+01, 1.0994e+01 \
, 9.9970e+00, 8.6940e+00, 6.9910e+00, 6.5090e+00, 5.9940e+00, 3.6940e+00 \
, 2.4880e+00, 0.0000e+00, 9.0000e-03], \
[1.4206e+01, 1.4754e+01, 1.5188e+01, 1.3509e+01, 1.4500e+01, 1.5691e+01 \
, 1.5997e+01, 1.6000e+01, 1.6009e+01, 1.6003e+01, 1.6003e+01, 1.6203e+01 \
, 1.5997e+01, 1.5709e+01, 1.5694e+01, 1.5200e+01, 1.4688e+01, 1.5491e+01 \
, 1.6200e+01, 1.6206e+01, 1.4494e+01, 1.4494e+01, 1.4700e+01, 1.4254e+01 \
, 1.4191e+01, 1.4009e+01, 1.3554e+01, 1.2000e+01, 1.2554e+01, 1.1509e+01 \
, 1.1003e+01, 1.1488e+01, 1.1254e+01, 1.0709e+01, 1.0994e+01, 9.9970e+00 \
, 8.6940e+00, 6.9910e+00, 6.5090e+00, 5.9940e+00, 3.6940e+00, 2.4880e+00 \
, 0.0000e+00, 9.0000e-03, 1.7090e+00] \
])

for i in range(1000000):
    results = DBSCAN().fit(X, y=None)
    if (i % 100000) == 0:
        # virtual_memory().used (index 3) is system-wide used memory, not this process alone
        print("Iteration Number {}: {:.2f} GB memory used".format(i, psutil.virtual_memory().used / 1024 ** 3))

Expected behavior

Memory usage should stay constant over iterations.
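
A note on the measurement: the script above prints psutil.virtual_memory()[3], which is system-wide used memory, so other processes on the machine can move the numbers. A minimal sketch of sampling the resident set size (RSS) of the current process instead could look like the following. The helper name track_rss and its parameters are hypothetical, and the workload here is a trivial stand-in so the snippet runs without sklearnex installed.

```python
import os

import psutil


def track_rss(fn, iterations, report_every):
    """Call fn() repeatedly and sample this process's RSS (in GiB).

    Returns a list of (iteration, rss_gib) tuples, one per report_every calls.
    """
    process = psutil.Process(os.getpid())
    samples = []
    for i in range(iterations):
        fn()
        if i % report_every == 0:
            # memory_info().rss is the resident set size of this process only
            samples.append((i, process.memory_info().rss / 1024 ** 3))
    return samples


# Stand-in workload; for the report above it would be
# lambda: DBSCAN().fit(X) with iterations=1000000 and report_every=100000.
samples = track_rss(lambda: sum(range(100)), iterations=1000, report_every=100)
for i, rss in samples:
    print("Iteration Number {}: {:.3f} GiB RSS".format(i, rss))
```

If the per-process RSS also climbs steadily with the patched DBSCAN but stays flat with stock scikit-learn, that would point at the extension rather than at unrelated activity on the machine.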

Output/Screenshots

With use_intelex = True:

Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
Iteration Number 0: 4.53 GB memory used
Iteration Number 100000: 4.60 GB memory used
Iteration Number 200000: 4.63 GB memory used
Iteration Number 300000: 4.72 GB memory used
Iteration Number 400000: 4.78 GB memory used
Iteration Number 500000: 4.94 GB memory used
Iteration Number 600000: 5.16 GB memory used
Iteration Number 700000: 5.20 GB memory used
Iteration Number 800000: 5.32 GB memory used
Iteration Number 900000: 5.56 GB memory used

Environment:

  • OS:
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy
  • Compiler:
$ python3 --version
Python 3.10.12
  • Version:
$ pip3 install intelex -U
Requirement already satisfied: intelex in /home/user/intelex/lib/python3.10/site-packages (0.0.30)
  • CPU:
$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU E3-1535M v5 @ 2.90GHz
    CPU family:          6
    Model:               94
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            3
    CPU max MHz:         3800.0000
    CPU min MHz:         800.0000
    BogoMIPS:            5799.77
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
                         a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss 
                         ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
                          arch_perfmon pebs bts rep_good nopl xtopology nonstop_
                         tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cp
                         l vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid ss
                         e4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes 
                         xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_f
                         ault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_sh
                         adow flexpriority ept vpid ept_ad fsgsbase tsc_adjust b
                         mi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clf
                         lushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm 
                         ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp 
                         vnmi md_clear flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
Vulnerabilities:         
  Gather data sampling:  Vulnerable: No microcode
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushe
                         s, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Retbleed:              Mitigation; IBRS
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer
                          sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, STIBP conditional, 
                         RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Mitigation; Microcode
  Tsx async abort:       Mitigation; TSX disabled

jpf18, Apr 23 '24 21:04

@jpf18 thank you for the report! Could you please clarify the daal4py/scikit-learn-intelex version in your env?

samir-nasibli, Apr 24 '24 11:04
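
For anyone answering the same version question, one way to collect the relevant package versions in a single step is the standard-library importlib.metadata module. This is only a sketch: the helper name installed_versions is hypothetical, and the package list is an assumption about what is useful to report.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(
    ["daal4py", "scikit-learn-intelex", "scikit-learn", "numpy"]
)
for name, ver in report.items():
    print("{}: {}".format(name, ver or "not installed"))
```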

$ pip3 show daal4py
Name: daal4py
Version: 2024.2.0
Summary: daal4py is a Convenient Python API to the Intel® oneAPI Data Analytics Library (oneDAL)
Home-page: https://github.com/IntelPython/daal4py
Author: Intel Corporation
Author-email: [email protected]
License: Apache v2.0
Location: /home/user/intelex/lib/python3.10/site-packages
Requires: daal, numpy
Required-by: scikit-learn-intelex

After updating the package to the latest:

$ pip3 show daal4py
Name: daal4py
Version: 2024.3.0
Summary: daal4py is a Convenient Python API to the Intel® oneAPI Data Analytics Library (oneDAL)
Home-page: https://github.com/IntelPython/daal4py
Author: Intel Corporation
Author-email: [email protected]
License: Apache v2.0
Location: /home/user/intelex/lib/python3.10/site-packages
Requires: daal, numpy
Required-by: scikit-learn-intelex

I get the following when running the above script:

$ python3 dbscanMemleak.py 
Iteration Number 0: 3.23 GB memory used
Iteration Number 100000: 3.23 GB memory used
Iteration Number 200000: 3.23 GB memory used
Iteration Number 300000: 3.25 GB memory used
Iteration Number 400000: 3.39 GB memory used
Iteration Number 500000: 3.39 GB memory used
Iteration Number 600000: 3.39 GB memory used
Iteration Number 700000: 3.39 GB memory used
Iteration Number 800000: 3.41 GB memory used
Iteration Number 900000: 3.42 GB memory used

Memory use still grows over iterations, but at a significantly lower rate (low enough for my workload, actually).

jpf18, Apr 24 '24 15:04
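
To put the two runs side by side, the first and last readings of each can be turned into an average growth per iteration. The sketch below uses only the numbers printed in this thread and assumes the first run was made with daal4py 2024.2.0. Because the readings are system-wide and rounded to 0.01 GB, the result is a rough estimate, not a precise leak size. The function name is hypothetical.

```python
GIB = 1024 ** 3


def growth_per_iteration(first_gib, last_gib, iterations):
    """Average memory growth in bytes per iteration between two readings."""
    return (last_gib - first_gib) * GIB / iterations


# Readings at iteration 0 and iteration 900000, copied from the outputs above.
before = growth_per_iteration(4.53, 5.56, 900000)  # assumed: daal4py 2024.2.0
after = growth_per_iteration(3.23, 3.42, 900000)   # daal4py 2024.3.0

print("first run:  ~{:.0f} bytes/iteration".format(before))
print("second run: ~{:.0f} bytes/iteration".format(after))
print("ratio:      ~{:.1f}x".format(before / after))
```

With these readings the growth comes out at roughly 1.2 KB per fit call in the first run and roughly 0.2 KB in the second, which matches the observation that the leak is smaller after the update but not gone.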

Thank you for pointing that out. The memory leak should be addressed by the PR https://github.com/oneapi-src/oneDAL/pull/2811. Please let us know if you face any other issues. We will close the issue once the PR is merged.

md-shafiul-alam, Jun 07 '24 13:06