
Misunderstanding of the estimated computing time

Open louis030195 opened this issue 2 years ago • 1 comment

I am not sure whether I am misunderstanding something or there is an error, but when building my index with autofaiss, the log reports "Train: 16.7 minutes", yet the whole pipeline takes only about 11 seconds (Finished "Launching the whole pipeline" in 11.1440 secs). Here is the full output:

Using 16 omp threads (processes), consider increasing --nb_cores if you have more
Launching the whole pipeline 01/28/2022, 08:15:47
There are 4269 embeddings of dim 1024
	Compute estimated construction time of the index 01/28/2022, 08:15:47
		-> Train: 16.7 minutes
		-> Add: 0.0 seconds
		Total: 16.7 minutes
	>>> Finished "Compute estimated construction time of the index" in 0.0000 secs
	Checking that your have enough memory available to create the index 01/28/2022, 08:15:47
20.6MB of memory will be needed to build the index (more might be used if you have more)
	>>> Finished "Checking that your have enough memory available to create the index" in 0.0009 secs
	Selecting most promising index types given data characteristics 01/28/2022, 08:15:47
	>>> Finished "Selecting most promising index types given data characteristics" in 0.0000 secs
	Creating the index 01/28/2022, 08:15:47
		-> Instanciate the index HNSW15 01/28/2022, 08:15:47
		>>> Finished "-> Instanciate the index HNSW15" in 0.0036 secs
The index size will be approximately 17.2MB
The memory available for adding the vectors is 7.0GB(total available - used by the index)
Will be using at most 1GB of ram for adding
		-> Adding the vectors to the index 01/28/2022, 08:15:47
Using a batch size of 244140 (memory overhead 953.7MB)
100%|██████████| 1/1 [00:00<00:00, 74.53it/s]		>>> Finished "-> Adding the vectors to the index" in 0.1602 secs
	>>> Finished "Creating the index" in 0.1647 secs
	Computing best hyperparameters 01/28/2022, 08:15:47

	>>> Finished "Computing best hyperparameters" in 3.3091 secs
The best hyperparameters are: efSearch=21
	Compute fast metrics 01/28/2022, 08:15:50
2000
	>>> Finished "Compute fast metrics" in 7.6499 secs
	Saving the index on local disk 01/28/2022, 08:15:58
	>>> Finished "Saving the index on local disk" in 0.0091 secs
Recap:
{'99p_search_speed_ms': 30.39110283832997,
 'avg_search_speed_ms': 3.7983315605670214,
 'compression ratio': 0.9678652870286923,
 'index_key': 'HNSW15',
 'index_param': 'efSearch=21',
 'nb vectors': 4269,
 'reconstruction error %': 0.0,
 'size in bytes': 18066382,
 'vectors dimension': 1024}
>>> Finished "Launching the whole pipeline" in 11.1440 secs

louis030195, Jan 28 '22 07:01

Yes, the estimated time is indeed not very accurate. We could improve it.

rom1504, Jan 28 '22 07:01
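
To put numbers on the gap described in this thread, here is a minimal sketch in plain Python (standard library only, not part of autofaiss) that parses a few lines copied from the log above and compares the estimated total with the measured timings. The helper names `estimated_total_seconds` and `measured_seconds` are hypothetical, and the regular expressions assume the log format shown in this issue.

```python
import re

# Excerpt copied from the log in this issue (only the lines needed here).
LOG = """
    -> Train: 16.7 minutes
    -> Add: 0.0 seconds
    Total: 16.7 minutes
  >>> Finished "Creating the index" in 0.1647 secs
  >>> Finished "Computing best hyperparameters" in 3.3091 secs
  >>> Finished "Compute fast metrics" in 7.6499 secs
  >>> Finished "Saving the index on local disk" in 0.0091 secs
>>> Finished "Launching the whole pipeline" in 11.1440 secs
"""

# Assumed unit names; only "seconds" and "minutes" appear in the log above.
UNIT_SECONDS = {"seconds": 1.0, "minutes": 60.0, "hours": 3600.0}


def estimated_total_seconds(log: str) -> float:
    """Return the 'Total: ...' estimate from the log, converted to seconds."""
    match = re.search(r"Total: ([\d.]+) (seconds|minutes|hours)", log)
    if match is None:
        raise ValueError("no 'Total:' estimate found in log")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def measured_seconds(log: str) -> dict:
    """Map each 'Finished "<step>" in <x> secs' line to its duration."""
    pattern = r'Finished "([^"]+)" in ([\d.]+) secs'
    return {name: float(secs) for name, secs in re.findall(pattern, log)}


if __name__ == "__main__":
    estimate = estimated_total_seconds(LOG)
    measured = measured_seconds(LOG)
    build = measured["Creating the index"]
    pipeline = measured["Launching the whole pipeline"]
    print(f"estimated build time: {estimate:.1f} s")
    print(f"measured build time:  {build:.4f} s")
    print(f"whole pipeline:       {pipeline:.4f} s")
    print(f"estimate / pipeline:  {estimate / pipeline:.0f}x")
```

With the values from this log, the estimate (16.7 minutes, roughly 1002 seconds) exceeds the whole pipeline time (11.1440 seconds) by about 90 times. The index creation step itself took only 0.1647 seconds, so most of the 11 seconds went to hyperparameter tuning and metrics rather than to building the index.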