
Linear speed drop using embeddings with hoplite (#562) due to excessive database commits

Open FloMee opened this issue 7 months ago • 1 comment

Description: I was very happy to see that you introduced the "embeddings with hoplite" feature in #562. Unfortunately, I've observed that processing speed drops linearly as more files are processed when analyzing hundreds of sound files:

Image

Using the pyinstrument module, I've confirmed that the issue lies in the excessive database commits during the analysis process:

Image

Solution: I've worked around the problem locally by moving the db.commit() call from line 107 in embeddings/utils.py to line 209, just before db.db.close().

Question: I'm aware that my change (committing to the database only once, after all files have been analyzed) introduces a risk of data loss if the run is interrupted. However, I don't think committing the results of every single chunk is useful either.
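
One possible middle ground between committing per chunk and committing once at the very end is to commit in batches, so that an interrupted run loses at most one batch. The following is only a minimal sketch using Python's built-in sqlite3 module, not the hoplite API: the table layout, the `insert_embeddings` helper, and the `commit_every` parameter are all hypothetical and chosen purely for illustration.

```python
import sqlite3


def insert_embeddings(conn, rows, commit_every=500):
    """Insert rows and commit once per `commit_every` rows.

    Hypothetical helper for illustration only; returns the number of commits.
    """
    commits = 0
    pending = 0
    for row in rows:
        conn.execute(
            "INSERT INTO embeddings (source, offset_s, vector) VALUES (?, ?, ?)",
            row,
        )
        pending += 1
        if pending >= commit_every:
            conn.commit()  # flush one full batch
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # flush the final, partial batch
        commits += 1
    return commits


# Stand-in database and dummy data (assumed schema, not the real one).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (source TEXT, offset_s REAL, vector BLOB)")
rows = [
    ("file_%d.wav" % (i // 10), float(i % 10) * 3.0, b"\x00" * 16)
    for i in range(1234)
]
n_commits = insert_embeddings(conn, rows, commit_every=500)
count = conn.execute("SELECT COUNT(*) FROM embeddings").fetchone()[0]
conn.close()
```

With a batch size like this, the number of commits grows with the number of batches rather than the number of chunks. Committing once per analyzed file instead of every N rows would be a similar trade-off.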

What do you think could be a feasible solution?

FloMee · Mar 06 '25 15:03