GPU memory leak while using retinaface
After calling detector.detect(), the total GPU memory consumption will sometimes increase. It eventually causes an out-of-memory error.
This occurs on CPU too. I'm iterating through a list of images, calling model.detect() on each (as defined in the insightface.ai tutorial). The memory increase seems to be caused (at least mostly) by high-resolution images: after running model.detect(high_res_img), memory usage increases and does not drop when subsequently running model.detect(low_res_img). Eventually a memory allocation error occurs. Afterwards, if the loop is restarted from the beginning, the error occurs earlier and earlier in the loop because memory is not being cleared. Running del model drops the memory back to normal. Error message:
File "C:\Users\k64\AppData\Local\Programs\Python\Python38\lib\site-packages\insightface\model_zoo\face_detection.py", line 303, in detect scores = net_out[idx].asnumpy() File "C:\Users\k64\AppData\Local\Programs\Python\Python38\lib\site-packages\mxnet\ndarray\ndarray.py", line 1993, in asnumpy check_call(_LIB.MXNDArraySyncCopyToCPU( File "C:\Users\k64\AppData\Local\Programs\Python\Python38\lib\site-packages\mxnet\base.py", line 253, in check_call raise MXNetError(py_str(_LIB.MXGetLastError())) mxnet.base.MXNetError: [23:21:18] c:\jenkins\workspace\mxnet-tag\mxnet\src\storage./cpu_device_storage.h:72: Failed to allocate CPU Memory
Can you please list your environment (OS/CUDA/MXNet versions)?
Windows 10
no CUDA (using CPU)
Python 3.8.1
mxnet_mkl-1.5.0-py2.py3-none-win_amd64
numpy 1.18.1
I don't know much about Windows; I can only suggest trying it on Linux. Also, fix the input size.
It took me a while to get Linux set up. Now I'm on Ubuntu 18.04 and I see the same thing happening. I don't get an error message; instead my system just freezes when I run out of memory. I tested running through a set of images (a) normally, (b) preparing the model each iteration, and (c) defining and preparing the model each iteration. At the end of both (a) and (b), memory usage was 25% higher than the peak memory usage under method (c). For some reason, the model fails to release memory that it uses. Re-detecting the same image repeatedly does not seem to cause the problem, but on a list of images, memory increases on the larger ones and doesn't fully reset afterwards.
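A minimal sketch of method (c), which kept peak memory lowest (same assumed API as in the sketch above):

import cv2
import insightface

for path in image_paths:  # hypothetical list of image files
    # Re-creating and re-preparing the model every iteration reloads the
    # weights each time, but peak memory stayed ~25% below methods (a)/(b).
    model = insightface.model_zoo.get_model('retinaface_r50_v1')
    model.prepare(ctx_id=-1, nms=0.4)
    bboxes, landmarks = model.detect(cv2.imread(path))
    del model  # releases whatever the detector accumulated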
@nttstar Do you think the resnet34/resnet18/resnet10 versions of RetinaFace would be good enough to use in live video streams?
@k128 Try resizing/padding your input images to the same size.
@k128 The MXNet inference engine may maintain a static pool of networks, creating a new handler for each distinct input resolution it encounters.
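If that is the cause, behavior like the following hypothetical sketch would be expected (reusing the model object from the loop above; shapes are illustrative, not measured):

import numpy as np

# Each previously unseen input shape binds a fresh executor that stays
# cached, so memory grows once per distinct resolution and then plateaus.
for size in (640, 1280, 2000):
    model.detect(np.zeros((size, size, 3), dtype=np.uint8))  # new allocation per shape
model.detect(np.zeros((640, 640, 3), dtype=np.uint8))  # reuses the cached executor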
@nttstar @k128, fixing the image size could solve the problem.
@nttstar What's the recommended method for resizing?
@nttstar Thanks, resizing and padding solved the problem for me, or at least made it much more manageable. I resize the largest dimension to the closest size from a fixed set, then pad the other dimension to the closest greater size from the same set.
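A sketch of that strategy, with an illustrative size set of my own choosing:

import cv2
import numpy as np

SIZES = np.array([320, 480, 640, 800, 1024, 1280, 1600, 2048])

def resize_and_pad(img, pad_color=(255, 255, 255)):
    h, w = img.shape[:2]
    # Snap the larger dimension to the closest allowed size.
    target = SIZES[np.abs(SIZES - max(h, w)).argmin()]
    scale = target / max(h, w)
    img = cv2.resize(img, (round(w * scale), round(h * scale)))
    h, w = img.shape[:2]
    # Pad each dimension up to the closest allowed size not smaller than it.
    new_h = SIZES[SIZES >= h][0]
    new_w = SIZES[SIZES >= w][0]
    return cv2.copyMakeBorder(img, 0, new_h - h, 0, new_w - w,
                              cv2.BORDER_CONSTANT, value=pad_color)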
@lorenzob How do you resize the picture? I've padded the image to (1024, 1024), but the memory is still growing.
Hi, is there a workaround that avoids resizing? Some details will be lost if I resize. Something like deleting these newly created handlers?
@woreom Just pad without resizing:

import cv2
import numpy as np

WHITE = (255, 255, 255)  # padding color (BGR); not defined in the original snippet

# Replace the "..." with the intermediate sizes you want to use (see below).
pad_sizes = np.asarray([150, 250, 300, 350, 500, 600, 720, ..., 4096, 5000, 6000, 7000, 8000, 9000, 10000])

def pad_image(img):
    h, w, *_ = img.shape
    if max(h, w) > pad_sizes[-1]:
        raise Exception(f"Image too large: {img.shape}")
    # Smallest allowed size that is >= the current width/height.
    mask_w = pad_sizes >= w
    new_w = pad_sizes[mask_w][0]
    w_diff = new_w - w
    mask_h = pad_sizes >= h
    new_h = pad_sizes[mask_h][0]
    h_diff = new_h - h
    if h_diff + w_diff == 0:
        return img  # already exactly one of the allowed sizes
    # Pad bottom/right with a constant color up to the target size.
    padded = cv2.copyMakeBorder(img, 0, h_diff, 0, w_diff, cv2.BORDER_CONSTANT, value=WHITE)
    #cv2.imshow("input", img)
    #cv2.imshow("padded", padded)
    #cv2.waitKey(0)
    return padded
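For example, applied before detection (model setup as earlier in the thread; the file name is hypothetical):

img = cv2.imread("photo.jpg")
bboxes, landmarks = model.detect(pad_image(img))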
Thank you for the response, but the code above results in this error:
TypeError: '>=' not supported between instances of 'ellipsis' and 'int'
I'm opening the image with OpenCV.
Replace the three dots in the pad_sizes array with the values you want to use: 800, 1000, etc. The list was too long, so I removed some values and put ... there.
@nttstar I still don't get it: why does resizing or padding resolve the memory increase/leak issue?
I have a huge GPU memory leak with the method below on Kaggle:
faces = RetinaFace.detect_faces(image_path)
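If that is the retina-face pip package, my understanding is that detect_faces also accepts a numpy array in place of a path (worth verifying against that package's docs), so the padding workaround from above can be applied first:

from retinaface import RetinaFace
import cv2

img = cv2.imread(image_path)                     # image_path as in the snippet above
faces = RetinaFace.detect_faces(pad_image(img))  # pad_image from earlier in the thread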