
Potential memory leak in using PyPardisoSolver

Open aatmdelissen opened this issue 1 year ago • 1 comments

Each PyPardisoSolver instance loads the MKL library, and its memory must be released manually by calling free_memory(everything=True). If this is not done, the memory leaks and usage grows with every new solver instance (see example below). I suspect the growth happens because the MKL library is loaded each time __init__() is called.

A possible solution might be a loader function that loads the MKL library and stores it in a global variable inside pypardiso.
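
The loader idea could look roughly like the sketch below. It is only an illustration: whether repeated loading is really what makes memory grow is not established in this thread, and the names get_library and _LIBRARY_CACHE are hypothetical, not part of pypardiso. The library name "mkl_rt" in the closing comment is likewise an assumption.

```python
import ctypes

# Hypothetical module-level cache: one handle per library name, shared by all callers.
_LIBRARY_CACHE = {}


def get_library(name, loader=ctypes.CDLL):
    """Return a cached handle for `name`, loading it only on the first request.

    `loader` defaults to ctypes.CDLL; it is a parameter so that the caching
    logic can be exercised without the real library being installed.
    """
    if name not in _LIBRARY_CACHE:
        _LIBRARY_CACHE[name] = loader(name)
    return _LIBRARY_CACHE[name]


# Assumed usage inside a solver's __init__ (the library name is an assumption):
#     self.libmkl = get_library("mkl_rt")
```

With a cache like this, every solver instance would share one handle instead of requesting the library again.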

Example:

import psutil
import pypardiso
import numpy as np
import scipy.sparse as spsp

def test_memory_leak_pypardiso():
    np.random.seed(0)
    K = spsp.rand(1000, 1000, 0.04)
    K = K + K.T  # Make symmetric
    A = spsp.triu(K, format='coo')  # Only use upper part
    # Explicitly set zero diagonal entries, as this is better for Intel Pardiso
    zero_diag_entries, = np.where(A.diagonal() == 0)
    if len(zero_diag_entries) > 0:
        A.row = np.append(A.row, zero_diag_entries)
        A.col = np.append(A.col, zero_diag_entries)
        A.data = np.append(A.data, np.zeros_like(zero_diag_entries))
    A = A.tocsr()

    b = np.random.rand(A.shape[0])

    process = psutil.Process()
    mem, memprev = process.memory_info().rss, 0.0
    for i in range(20):
        solver = pypardiso.PyPardisoSolver(mtype=-2)
        solver.factorize(A)
        solver.solve(A, b)
        # solver.free_memory(everything=True)  # uncomment to release memory

        mem, memprev = process.memory_info().rss, mem
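        print(f"mem = {mem}, delta = {mem - memprev}")  # in bytes


if __name__ == "__main__":
    # Run the reproduction when the file is executed as a script
    test_memory_leak_pypardiso()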

Output:

mem = 144121856, delta = 8441856
mem = 153714688, delta = 9592832
mem = 161259520, delta = 7544832
mem = 170823680, delta = 9564160
mem = 179695616, delta = 8871936
mem = 188497920, delta = 8802304
mem = 197685248, delta = 9187328
mem = 206245888, delta = 8560640
mem = 215117824, delta = 8871936
mem = 224055296, delta = 8937472
mem = 233066496, delta = 9011200
mem = 241790976, delta = 8724480
mem = 250490880, delta = 8699904
mem = 258985984, delta = 8495104
mem = 267599872, delta = 8613888
mem = 276353024, delta = 8753152
mem = 285454336, delta = 9101312
mem = 293945344, delta = 8491008
mem = 302751744, delta = 8806400
mem = 311685120, delta = 8933376
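
For scale, the printed values can be summarized with a few lines of arithmetic. Nothing here is new data; the numbers are copied from the output above, and the variable names are only illustrative.

```python
# Resident set size (bytes) printed after each of the 20 iterations.
rss = [
    144121856, 153714688, 161259520, 170823680, 179695616,
    188497920, 197685248, 206245888, 215117824, 224055296,
    233066496, 241790976, 250490880, 258985984, 267599872,
    276353024, 285454336, 293945344, 302751744, 311685120,
]
first_delta = 8441856  # delta reported for the first iteration

baseline = rss[0] - first_delta    # RSS before the first solver was created
total_growth = rss[-1] - baseline  # bytes gained over all iterations
mean_growth = total_growth / len(rss)

print(f"total growth: {total_growth / 2**20:.1f} MiB")
print(f"mean growth per instance: {mean_growth / 2**20:.2f} MiB")
```

This works out to roughly 8.4 MiB retained per solver instance for this 1000 x 1000 test matrix.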

aatmdelissen, Dec 12 '24 11:12

[surprised] Tomas NAVARRETE reacted to your message:


From: Arnoud Delissen
Sent: Thursday, December 12, 2024 11:46:13 AM
To: haasad/PyPardiso
Cc: Subscribed
Subject: [haasad/PyPardiso] Potential memory leak in using PyPardisoSolver (Issue #76)

tngTUDOR, Dec 12 '24 12:12

A few notes:

  1. I can reproduce that PyPardisoSolver instances don't free up the memory when no longer used. My guess is that this could be automated by adding a custom __del__ method to PyPardisoSolver that cleans up the memory when the instance gets garbage collected.

  2. In practical terms, I rarely need to create a new PyPardisoSolver instance for each Pardiso call. Reusing the same instance, using the default pypardiso.ps instance, or calling the exposed pypardiso.spsolve wrapper all avoid the memory leak. (Well, I guess the last call's memory could still be freed.)

  3. Otherwise, it's simple enough to just invoke free_memory() manually.
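
Points 1 and 3 can be sketched as follows. This is a minimal illustration, not pypardiso code: AutoFreeMixin, released, DummySolver and AutoFreeDummy are hypothetical names, and the only solver method assumed is free_memory(everything=True) as quoted in the issue. The stand-in class lets the sketch run without MKL installed.

```python
import contextlib


class AutoFreeMixin:
    """Hypothetical mixin: release solver memory when the instance is destroyed."""

    def __del__(self):
        # Exceptions raised in __del__ cannot propagate; swallow them to
        # avoid noisy warnings, e.g. during interpreter shutdown.
        try:
            self.free_memory(everything=True)
        except Exception:
            pass


@contextlib.contextmanager
def released(solver):
    """Hypothetical helper: call free_memory when the with-block exits."""
    try:
        yield solver
    finally:
        solver.free_memory(everything=True)


class DummySolver:
    """Stand-in for a real solver; it only counts free_memory calls."""

    freed = 0  # class-level counter shared by all instances

    def free_memory(self, everything=False):
        DummySolver.freed += 1


class AutoFreeDummy(AutoFreeMixin, DummySolver):
    """Stand-in solver that cleans up after itself via the mixin."""


# Point 3: explicit cleanup, made hard to forget with a context manager.
with released(DummySolver()) as solver:
    pass  # factorize / solve calls would go here

# Point 1: cleanup tied to object lifetime.
tmp = AutoFreeDummy()
del tmp  # CPython runs __del__ once the last reference is gone

print(f"free_memory calls: {DummySolver.freed}")
```

With the real class, the mixin would presumably be combined as class AutoFreeSolver(AutoFreeMixin, pypardiso.PyPardisoSolver); that combination is untested here. The context manager is the more predictable of the two, since the timing of __del__ depends on the garbage collector.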

urob, May 26 '25 00:05