dejavu
dejavu copied to clipboard
Fingerprinting a folder with ~10000 tracks consuming 45GB of RAM and it's growing.
Fingerprinting a folder with ~10000 tracks consuming 45GB of RAM and it's growing. It seems that the memory is leaked even when a song has completed recognition.
Have the same issue.
Using the given docker project with pgsql.
To avoid the error occurring, I made the following adjustment to my code.
import os import argparse import datetime from dejavu import Dejavu
# load config from a JSON file (or anything outputting a python dictionary) config = { "database": { "host": "db", "user": "postgres", "password": "password", "database": "dejavu" }, "database_type": "postgres" }
def write_fingerprinted_file(directory, files): """Creates a file with the current date and a list of files.""" with open(os.path.join(directory, 'fingerprinted.txt'), 'w') as f: f.write(f"Fingerprinted on: {datetime.datetime.now()}\n") f.write("Files:\n") for file in files: f.write(f"{file}\n")
if __name__ == '__main__': parser = argparse.ArgumentParser(description="Fingerprint audio files in a directory and its subdirectories.") parser.add_argument("directory", help="The path to the directory containing audio files.") args = parser.parse_args()
` # Supported audio file extensions: mp3, m4a, wav, etc. supported_extensions = [".mp3", ".m4a", ".wav"]
# create a Dejavu instance
djv = Dejavu(config)`
# Navigate through the main directory and each subdirectory for root, dirs, files in os.walk(args.directory): if root != args.directory: # To avoid fingerprinting the parent directory itself print(f"Fingerprinting directory: {root}") djv.fingerprint_directory(root, supported_extensions) write_fingerprinted_file(root, files)
the original code comes from the docker example file.
I made the first adjustment like this and the RAM usage went towards infinity.
`import json
import argparse
from dejavu import Dejavu
load config from a JSON file (or anything outputting a python dictionary)
config = { "database": { "host": "db", "user": "postgres", "password": "password", "database": "dejavu" }, "database_type": "postgres" }
if name == 'main': parser = argparse.ArgumentParser(description="Fingerprint audio files in a directory.") parser.add_argument("directory", help="The path to the directory containing audio files.") args = parser.parse_args()
# Supported audio file extensions: mp3, m4a, wav, etc.
supported_extensions = [".mp3", ".m4a", ".wav"]
# create a Dejavu instance
djv = Dejavu(config)
# Fingerprint all the supported audio files in the directory
djv.fingerprint_directory(args.directory, supported_extensions)`
With the adjustment above it works somehow, but of course it's still not optimal. The error still occurs, but the memory usage is limited due to the folder structure. This works for my case, but doesn't have to work for other cases.