gpusimilarity
Merging multiple databases not working
Hey @lorton
I tried to launch a search over a huge database and used gpusim_mergedb.py to process the database files in parallel. However, merging didn't work. It created empty files. After a little bit of digging into the problem, I found the cause.
gpusim_createdb.py writes four values at the top of each database file:
qds.writeInt(DATABASE_VERSION)
qds.writeString(args.dbkey.encode())
qds.writeInt(gpusim_utils.BITCOUNT)
qds.writeInt(count)
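However, gpusim_mergedb.py reads (and then writes to the merged fsim file) only three of them: everything except dbkey. Because dbkey is skipped, every value read after the version comes from the wrong position in the file. Reading the dbkey from each database first and then writing it to the merged file solves the problem.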
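To illustrate the mismatch, here is a minimal, self-contained sketch. It is not the actual gpusim code: it uses `struct` and `io.BytesIO` as a stand-in for the `qds` stream, a simple length-prefixed encoding that may differ from the real on-disk format, and made-up values for `DATABASE_VERSION` and `BITCOUNT`. The helper names (`write_header`, `read_header`, `merge_headers`, `make_db`) and the rule that all inputs must share one dbkey are my assumptions for the example.

```python
import io
import struct

# Hypothetical values, chosen only for this illustration.
DATABASE_VERSION = 3
BITCOUNT = 1024


def write_int(buf, value):
    buf.write(struct.pack(">i", value))  # 4-byte big-endian int


def write_bytes(buf, data):
    write_int(buf, len(data))  # length prefix (stand-in encoding)
    buf.write(data)


def read_int(buf):
    return struct.unpack(">i", buf.read(4))[0]


def read_bytes(buf):
    return buf.read(read_int(buf))


def write_header(buf, dbkey, count):
    # Same four fields, in the same order, as the createdb snippet above.
    write_int(buf, DATABASE_VERSION)
    write_bytes(buf, dbkey)
    write_int(buf, BITCOUNT)
    write_int(buf, count)


def read_header(buf):
    # Correct reader: consumes all four fields, including dbkey.
    version = read_int(buf)
    dbkey = read_bytes(buf)
    bitcount = read_int(buf)
    count = read_int(buf)
    return version, dbkey, bitcount, count


def read_header_without_dbkey(buf):
    # Buggy reader: skips dbkey, so "bitcount" and "count" are really
    # the key's length prefix and the first bytes of the key.
    return read_int(buf), read_int(buf), read_int(buf)


def merge_headers(inputs, out):
    # Assumption: all inputs must share one dbkey; counts are summed.
    total, dbkey = 0, None
    for buf in inputs:
        version, key, bitcount, count = read_header(buf)
        if version != DATABASE_VERSION or bitcount != BITCOUNT:
            raise ValueError("incompatible database header")
        if dbkey is None:
            dbkey = key
        elif key != dbkey:
            raise ValueError("databases were created with different dbkeys")
        total += count
    write_header(out, dbkey, total)


def make_db(count, dbkey=b"secret"):
    # Build an in-memory "database" that contains only a header.
    buf = io.BytesIO()
    write_header(buf, dbkey, count)
    buf.seek(0)
    return buf
```

With this stand-in format, `read_header_without_dbkey(make_db(10))` returns the key length where the bit count should be, which is the kind of misaligned read that would explain the empty merged files.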
Thank you very much @Mariewelt for doing this investigation! We obviously don't have sufficient tests set up for gpusim_mergedb.py, so this was missed. Would you like to open a pull request with your changes?