open-metric-learning

Out-of-memory crash when computing metrics on a self-created dataset

Open snow-wind-001 opened this issue 1 year ago • 2 comments

We copied the format of the cars196 dataset and built our own dataset, with 172 classes and 450,000 items in total. Training runs smoothly, but when metrics are computed after the validation step, memory usage keeps growing until the process crashes. Please help me solve this problem.

snow-wind-001, May 12 '24 15:05

@chang48 @dapladoc @churnikov @alexmelekhin

snow-wind-001, May 12 '24 15:05

Hi, @snow-wind-001

It's a known problem. The current implementation requires storing a 450k x 450k matrix of floats in memory. Right now I'm working on a memory-optimized solution. It will be released soon with OML 3.0 (in a week, two at most).
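To get a feel for the scale, here is a rough back-of-the-envelope sketch. It assumes 4-byte (float32) entries, which the comment above does not specify, and `distance_matrix_gib` is a hypothetical helper written for this illustration, not part of OML.

```python
def distance_matrix_gib(n_queries: int, n_galleries: int, bytes_per_float: int = 4) -> float:
    """Approximate size in GiB of a dense n_queries x n_galleries float matrix.

    bytes_per_float=4 assumes float32; use 8 for float64.
    """
    return n_queries * n_galleries * bytes_per_float / 2**30


# Full validation set: every item is both a query and a gallery.
full_gib = distance_matrix_gib(450_000, 450_000)

# Reduced validation set of 45k items, still overlapping.
reduced_gib = distance_matrix_gib(45_000, 45_000)

print(f"450k x 450k: {full_gib:.0f} GiB")
print(f"45k x 45k:   {reduced_gib:.1f} GiB")
```

Under that assumption the full matrix needs roughly 750 GiB, while the 45k x 45k one is on the order of a few GiB.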

You can wait, or you can work around it now by decreasing the size of your validation set, for example, keeping 45k items. You may also want to split your validation set into non-overlapping queries and galleries (for CARS they overlap because is_query == is_gallery == True). For instance, you could use 20k items as queries and 25k as galleries: the distance matrix then has 20k * 25k entries, far fewer than 45k * 45k. You may use DeepFashion as a reference because it uses the same query/gallery split logic. It may also be convenient to use this function to check the format of the resulting dataframe. More info on datasets here.
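A minimal pandas sketch of such a split is shown below. It assumes the validation rows are already in a dataframe with `label` and `path` columns next to `is_query` and `is_gallery`. `split_query_gallery` is a hypothetical helper, not an OML function, and the toy dataframe only stands in for real data.

```python
import numpy as np
import pandas as pd


def split_query_gallery(df_val: pd.DataFrame, n_queries: int, n_galleries: int, seed: int = 0) -> pd.DataFrame:
    """Subsample df_val and mark rows as either query or gallery, never both."""
    total = n_queries + n_galleries
    if total > len(df_val):
        raise ValueError("n_queries + n_galleries exceeds the number of validation rows")

    # Pick `total` random rows without replacement.
    rng = np.random.default_rng(seed)
    picked = rng.permutation(len(df_val))[:total]
    out = df_val.iloc[picked].copy().reset_index(drop=True)

    # The first n_queries sampled rows become queries, the rest galleries.
    is_query = np.zeros(total, dtype=bool)
    is_query[:n_queries] = True
    out["is_query"] = is_query
    out["is_gallery"] = ~is_query
    return out


# Toy stand-in for a validation dataframe (100 items, 5 classes).
df_val = pd.DataFrame({
    "label": [i % 5 for i in range(100)],
    "path": [f"img_{i}.jpg" for i in range(100)],
    "is_query": True,
    "is_gallery": True,
})

val_small = split_query_gallery(df_val, n_queries=20, n_galleries=25)

# Labels that appear among queries but have no gallery item; ideally empty.
missing = set(val_small.loc[val_small["is_query"], "label"]) - set(val_small.loc[val_small["is_gallery"], "label"])
```

For the real dataset the call would use `n_queries=20_000, n_galleries=25_000`. A purely random split does not guarantee that every query label has a matching gallery item, so it is worth inspecting `missing` before running validation.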

AlekseySh, May 12 '24 17:05