autofaiss
Misunderstanding of the estimated computing time
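I am not sure whether I am misunderstanding something or whether this is a bug: when I build my index with autofaiss, the log reports an estimate of "Train: 16.7 minutes", but the whole pipeline actually finishes in about 11 seconds (Finished "Launching the whole pipeline" in 11.1440 secs). Is the estimate wrong? Here is the full log: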
Using 16 omp threads (processes), consider increasing --nb_cores if you have more
Launching the whole pipeline 01/28/2022, 08:15:47
There are 4269 embeddings of dim 1024
Compute estimated construction time of the index 01/28/2022, 08:15:47
-> Train: 16.7 minutes
-> Add: 0.0 seconds
Total: 16.7 minutes
>>> Finished "Compute estimated construction time of the index" in 0.0000 secs
Checking that your have enough memory available to create the index 01/28/2022, 08:15:47
20.6MB of memory will be needed to build the index (more might be used if you have more)
>>> Finished "Checking that your have enough memory available to create the index" in 0.0009 secs
Selecting most promising index types given data characteristics 01/28/2022, 08:15:47
>>> Finished "Selecting most promising index types given data characteristics" in 0.0000 secs
Creating the index 01/28/2022, 08:15:47
-> Instanciate the index HNSW15 01/28/2022, 08:15:47
>>> Finished "-> Instanciate the index HNSW15" in 0.0036 secs
The index size will be approximately 17.2MB
The memory available for adding the vectors is 7.0GB(total available - used by the index)
Will be using at most 1GB of ram for adding
-> Adding the vectors to the index 01/28/2022, 08:15:47
Using a batch size of 244140 (memory overhead 953.7MB)
100%|██████████| 1/1 [00:00<00:00, 74.53it/s]
>>> Finished "-> Adding the vectors to the index" in 0.1602 secs
>>> Finished "Creating the index" in 0.1647 secs
Computing best hyperparameters 01/28/2022, 08:15:47
>>> Finished "Computing best hyperparameters" in 3.3091 secs
The best hyperparameters are: efSearch=21
Compute fast metrics 01/28/2022, 08:15:50
2000
>>> Finished "Compute fast metrics" in 7.6499 secs
Saving the index on local disk 01/28/2022, 08:15:58
>>> Finished "Saving the index on local disk" in 0.0091 secs
Recap:
{'99p_search_speed_ms': 30.39110283832997,
'avg_search_speed_ms': 3.7983315605670214,
'compression ratio': 0.9678652870286923,
'index_key': 'HNSW15',
'index_param': 'efSearch=21',
'nb vectors': 4269,
'reconstruction error %': 0.0,
'size in bytes': 18066382,
'vectors dimension': 1024}
>>> Finished "Launching the whole pipeline" in 11.1440 secs
Yes, the estimated time is indeed not very accurate. We could improve it.
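For anyone who wants to sanity-check the other figures in the log, here is a small sketch that re-derives them with plain arithmetic. The formulas are inferred from the numbers printed above, not taken from the autofaiss source, so treat them as assumptions: float32 embeddings, "1GB" meaning 10^9 bytes, and "MB" meaning 1024^2 bytes. Under those assumptions the size, compression ratio, and batch size all match the log, so only the time estimate is off.

```python
# Values copied from the log above.
nb_vectors = 4269
dim = 1024
index_size_bytes = 18_066_382
reported_compression_ratio = 0.9678652870286923

# Assumptions (not confirmed by the autofaiss source):
BYTES_PER_FLOAT32 = 4          # embeddings stored as float32
MAX_ADD_RAM_BYTES = 10**9      # "1GB of ram for adding" read as 10^9 bytes
BYTES_PER_MB = 1024**2         # "MB" in the log read as 1024^2 bytes

# Size of the raw embeddings and the resulting compression ratio.
raw_bytes = nb_vectors * dim * BYTES_PER_FLOAT32
compression_ratio = raw_bytes / index_size_bytes

# Index size in MB, to compare with "approximately 17.2MB".
index_size_mb = index_size_bytes / BYTES_PER_MB

# Largest batch of vectors that fits in the RAM budget for adding.
batch_size = MAX_ADD_RAM_BYTES // (dim * BYTES_PER_FLOAT32)
batch_overhead_mb = batch_size * dim * BYTES_PER_FLOAT32 / BYTES_PER_MB

# Estimated train time versus the measured "Creating the index" step.
estimated_s = 16.7 * 60
measured_s = 0.1647
overestimate_factor = estimated_s / measured_s

print(f"raw embeddings:     {raw_bytes} bytes")
print(f"compression ratio:  {compression_ratio:.6f}")
print(f"index size:         {index_size_mb:.1f} MB")
print(f"batch size:         {batch_size} ({batch_overhead_mb:.1f} MB)")
print(f"estimate / actual:  {overestimate_factor:.0f}x")
```

The last line shows the size of the gap: the estimated build time is several thousand times larger than the time the "Creating the index" step actually took for this small HNSW index.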