Predicting a large amount of data
When predicting a large amount of data, the files are processed in several batches, and some images fail to be saved when the results are written out after each batch.
In my case, when predicting approximately 24,000 image tiles, the results for 8 of them are not saved.
Prediction also takes too long: about 50 minutes per 1,000 (500x500) images.
Experimental environment:
- model: kumar, consep
- data: (500, 500, 3) .png
- batch size: 32
- GPU 0: Tesla T4 16 GB
- GPU 1: Tesla T4 16 GB
Hi @essential2189,
So you are processing 24,000 image tiles and 8/24,000 are not showing results? It is hard to give an exact diagnosis of why this is happening. As a workaround, you could rerun the code on the images that failed to process (a sketch of this follows below).
To make things faster, you can use the PanNuke checkpoint with fast mode. You can also turn off the overlay function, which slows things down a bit, and only generate overlays when necessary.
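
A minimal sketch of that rerun workaround, assuming results are saved as '<prefix>_<index>.mat' and input tiles as '<index>.png' (both assumptions; adjust the paths and naming to your setup):

import os
import shutil

image_dir = 'path/to/tiles/'       # input tiles (adjust)
mat_dir = 'path/to/output/mat/'    # hover_net .mat results (adjust)
retry_dir = 'path/to/retry/'       # hypothetical folder for the rerun

os.makedirs(retry_dir, exist_ok=True)

# indices that already have a result, e.g. "foo_123.mat" -> "123"
done = {f.split('.')[0].split('_')[1] for f in os.listdir(mat_dir)}

# copy every tile without a result into the retry folder
for f in sorted(os.listdir(image_dir)):
    if f.split('.')[0] not in done:    # assumes tiles are named "<index>.png"
        shutil.copy(os.path.join(image_dir, f), retry_dir)

You can then point run_infer.py at the retry folder and merge its output with the original results.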
Yes, 8/24,000 are not showing results.
This is my code to check for the missing results:
import os

data = '20220110-09:14_1104-1'
img_data = '1104-1'
mat_path = '../../output/kumar/' + data + '/mat/'          # hover_net results
image_path = '../../datasets/image500/' + img_data + '/'   # my image datasets

# collect the numeric index of every result file, e.g. "xxx_123.mat" -> 123
name_list = []
for file in sorted(os.listdir(mat_path)):
    name = file.split('.')[0]
    name = name.split('_')[1]
    name_list.append(int(name))

# print the index of every input image that has no corresponding result
for i in range(len(os.listdir(image_path))):
    if i not in name_list:
        print(i)

print(len(os.listdir(image_path)), len(os.listdir(mat_path)))
and the printed output is:
647
2377
4051
8888
13545
16374
18922
21352
24595 24587
As you can see, 8 results are missing.
I suspect the problem occurs when uploading data to or downloading it from GPU memory. What do you think?
I think the problem is caused by improper memory management in the program. In 'infer/tile.py', 'InferManager.process_file_list()' first pops a file path ('file_path = file_path_list.pop(0)') and then subtracts the expected memory usage ('available_ram -= expected_usage'); if 'available_ram < 0' at that point, the loop breaks and the already-popped image is abandoned. Changing the way the pop is done should solve this problem.
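
A minimal sketch of the suspected flow and one possible fix, based only on the pop-then-check behaviour described above; select_batch(), estimate_usage, and the toy values are illustrative, not the actual code in 'infer/tile.py':

def select_batch(file_path_list, available_ram, estimate_usage):
    """Pop as many paths as fit into available_ram."""
    batch = []
    while len(file_path_list) > 0:
        file_path = file_path_list.pop(0)   # path is removed before the RAM check
        available_ram -= estimate_usage(file_path)
        if available_ram < 0:
            # BUG in the original flow: breaking here without the next line
            # silently drops `file_path`, so that image is never saved.
            file_path_list.insert(0, file_path)  # FIX: put it back for the next batch
            break
        batch.append(file_path)
    return batch

# tiny usage example with a fake size estimator
paths = ['a.png', 'b.png', 'c.png']
print(select_batch(paths, available_ram=2, estimate_usage=lambda p: 1))
# -> ['a.png', 'b.png']; paths is now ['c.png'], so nothing is lost

Inserting the popped path back before breaking means the next batch starts with the image that did not fit, instead of silently dropping it.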