
Is this a memory leak?

RitChan opened this issue 2 years ago • 2 comments

Describe the bug

A data-oriented class holding Python objects may cause a memory leak.

To Reproduce

import gc
import os
import psutil
import taichi as ti


@ti.data_oriented
class X:
    def __init__(self):
        self.py_l = [0] * 5242880  # a list containing 5M integers (5 * 2^20)

    @ti.kernel
    def run(self):
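        # Intentionally trivial kernel body: merely launching the kernel
        # on a fresh instance is enough to reproduce the memory growth.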
        for i in range(1):
            pass


def get_process_memory():
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss


def main():
    ti.init(ti.cpu)
    for i in range(20):
        X().run()
        gc.collect()
        print(f"Iteration {i}, memory usage: {get_process_memory() / 1e6} MB")


if __name__ == '__main__':
    main()

Log/Screenshots

[Taichi] version 1.1.2, llvm 10.0.0, commit f25cf4a2, win, python 3.7.9
[Taichi] Starting on arch=x64
Iteration 0, memory usage: 272.740352 MB
Iteration 1, memory usage: 314.687488 MB
Iteration 2, memory usage: 356.634624 MB
Iteration 3, memory usage: 398.58176 MB
Iteration 4, memory usage: 440.528896 MB
Iteration 5, memory usage: 482.476032 MB
Iteration 6, memory usage: 524.423168 MB
Iteration 7, memory usage: 566.370304 MB
Iteration 8, memory usage: 608.31744 MB
Iteration 9, memory usage: 650.264576 MB
Iteration 10, memory usage: 692.211712 MB
Iteration 11, memory usage: 734.158848 MB
Iteration 12, memory usage: 776.105984 MB
Iteration 13, memory usage: 818.05312 MB
Iteration 14, memory usage: 860.000256 MB
Iteration 15, memory usage: 901.947392 MB
Iteration 16, memory usage: 943.894528 MB
Iteration 17, memory usage: 985.841664 MB
Iteration 18, memory usage: 1027.7888 MB
Iteration 19, memory usage: 1069.735936 MB

Additional comments

A data-oriented class holding only NumPy objects is fine.

import gc
import os

import numpy as np
import psutil
import taichi as ti


@ti.data_oriented
class X:
    def __init__(self):
        self.np_l = np.zeros(shape=5242880 * 20, dtype="f4")  # 100M float32 values (~400 MB)
        # self.py_l = [0] * 5242880

    @ti.kernel
    def run(self):
        for i in range(1):
            pass


def get_process_memory():
    process = psutil.Process(os.getpid())
    mem_info = process.memory_info()
    return mem_info.rss


def main():
    ti.init(ti.cpu)
    for i in range(20):
        X().run()
        gc.collect()
        print(f"Iteration {i}, memory usage: {get_process_memory() / 1e6} MB")


if __name__ == '__main__':
    main()

Output:

[Taichi] version 1.1.2, llvm 10.0.0, commit f25cf4a2, win, python 3.7.9
[Taichi] Starting on arch=x64
Iteration 0, memory usage: 230.866944 MB
Iteration 1, memory usage: 230.875136 MB
Iteration 2, memory usage: 230.883328 MB
Iteration 3, memory usage: 230.887424 MB
Iteration 4, memory usage: 230.89152 MB
Iteration 5, memory usage: 230.895616 MB
Iteration 6, memory usage: 230.907904 MB
Iteration 7, memory usage: 230.912 MB
Iteration 8, memory usage: 230.916096 MB
Iteration 9, memory usage: 230.920192 MB
Iteration 10, memory usage: 230.924288 MB
Iteration 11, memory usage: 230.93248 MB
Iteration 12, memory usage: 230.936576 MB
Iteration 13, memory usage: 230.940672 MB
Iteration 14, memory usage: 230.944768 MB
Iteration 15, memory usage: 230.948864 MB
Iteration 16, memory usage: 230.957056 MB
Iteration 17, memory usage: 230.961152 MB
Iteration 18, memory usage: 230.965248 MB
Iteration 19, memory usage: 230.969344 MB

RitChan commented Sep 22 '22 05:09

What happens if you add a ti.sync() after each run?
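
For reference, the suggested experiment only changes the loop in main() of the repro script above; ti.sync() is Taichi's API for blocking until pending kernel work has finished:

def main():
    ti.init(ti.cpu)
    for i in range(20):
        X().run()
        ti.sync()  # wait for the launched kernel before measuring memory
        gc.collect()
        print(f"Iteration {i}, memory usage: {get_process_memory() / 1e6} MB")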

bobcao3 commented Sep 22 '22 07:09

Reproduced the same result on my machine; ti.sync() does not help.

It's likely to be a memory leak.

turbo0628 commented Sep 22 '22 08:09

This memory leak is also observed if we switch the list to a NumPy array, provided we use np.random.random([5242880]) instead of np.zeros(...).

The problem with np.zeros(...) is that the OS backs the untouched zero-filled pages with a shared zero page and commits them only on first write, so the resident memory for np.zeros(5242880) is fairly small and the retained arrays never show up in RSS.
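
For reference, a variant of the NumPy repro above that exercises this path; the only change is the attribute initialization (a sketch based on this comment):

@ti.data_oriented
class X:
    def __init__(self):
        # Random data forces every page to be committed up front, so the
        # retained instances become visible in the RSS measurements.
        self.np_l = np.random.random([5242880])

    @ti.kernel
    def run(self):
        for i in range(1):
            pass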

jim19930609 commented Sep 30 '22 05:09

1. Diagnosis

The root cause of this memory leak lies in the following dependency chain:

program -> compiled_kernels[...] = run_k
run_k -> mapper -> mapping = X()

In simple words: program holds the kernel, and the kernel holds the instance of X, so X shares the same lifetime as program and survives until the end of the Python process.
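
If this diagnosis holds, the retained reference can be checked directly with a weakref. A minimal sketch; the printed expectation is inferred from the chain above rather than taken from a verified log, and it assumes the decorated class still supports weak references:

import gc
import weakref

import taichi as ti

ti.init(ti.cpu)


@ti.data_oriented
class X:
    @ti.kernel
    def run(self):
        for i in range(1):
            pass


x = X()
ref = weakref.ref(x)
x.run()       # launching the kernel caches a reference to x inside program
del x
gc.collect()
print(ref())  # expected on affected versions: the instance, not None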

2. Possible fixes:

(1) Remove the cache (self.mapping) for parsed arguments. [screenshot: 2022-09-30 13-20-33]
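
A softer variant of this fix would keep the cache but make it hold weak references, so it no longer extends instance lifetimes. The ArgMapper class below is a hypothetical stand-in used for illustration, not Taichi's actual mapper code:

import weakref


class ArgMapper:
    # Hypothetical stand-in for the internal argument mapper.
    def __init__(self):
        # Entries in a WeakValueDictionary vanish once the cached instance
        # is garbage collected, so the cache no longer pins X() objects.
        self.mapping = weakref.WeakValueDictionary()

    def remember(self, key, instance):
        self.mapping[key] = instance  # does not keep the instance alive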

(2) Add scope support for cached kernels, where program should remove the created kernels from compiled_functions once a certain scope is exited
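
Until either fix lands, a user-side workaround that follows from the diagnosis is to create the data-oriented instance once and reuse it across iterations, so the kernel cache pins at most one object. A sketch against the repro script above, not an official recommendation:

def main():
    ti.init(ti.cpu)
    x = X()  # single long-lived instance instead of one per iteration
    for i in range(20):
        x.run()
        gc.collect()
        print(f"Iteration {i}, memory usage: {get_process_memory() / 1e6} MB")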

jim19930609 commented Sep 30 '22 05:09