Merging multiple databases not working

Open Mariewelt opened this issue 4 years ago • 1 comments

Hey @lorton

I tried to launch a search over a huge database and used gpusim_mergedb.py to process the database files in parallel. However, merging didn't work. It created empty files. After a little bit of digging into the problem, I found the cause.

gpusim_createdb.py writes four values at the top of each database:

    qds.writeInt(DATABASE_VERSION)
    qds.writeString(args.dbkey.encode())
    qds.writeInt(gpusim_utils.BITCOUNT)
    qds.writeInt(count)

However, gpusim_mergedb.py reads (and then writes to the merged fsim file) only three of them: everything except dbkey. Reading the dbkey from each database first and then writing it to the merged file solves the problem.
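
To illustrate the idea, here is a minimal, self-contained sketch of a merge step that reads all four header fields, dbkey included, and writes them back in the same order. It is not the actual gpusim_mergedb.py code: it uses Python's `struct` module as a stand-in for the `qds` stream, so the byte layout is an assumption and will not match real fsim files. The helper names (`write_header`, `read_header`, `merge_headers`) are hypothetical, and the check that all inputs share the same version, dbkey, and bit count is my own assumption about what a merge should require.

```python
import io
import struct


def write_string(stream, data: bytes) -> None:
    # Length-prefixed byte string (stand-in for qds.writeString).
    stream.write(struct.pack(">I", len(data)))
    stream.write(data)


def read_string(stream) -> bytes:
    (length,) = struct.unpack(">I", stream.read(4))
    return stream.read(length)


def write_header(stream, version: int, dbkey: bytes, bitcount: int, count: int) -> None:
    # Same field order as in gpusim_createdb.py: version, dbkey, bitcount, count.
    stream.write(struct.pack(">i", version))
    write_string(stream, dbkey)
    stream.write(struct.pack(">ii", bitcount, count))


def read_header(stream):
    # Read the fields in exactly the order they were written.
    # Skipping dbkey here would make bitcount and count read garbage.
    (version,) = struct.unpack(">i", stream.read(4))
    dbkey = read_string(stream)
    bitcount, count = struct.unpack(">ii", stream.read(8))
    return version, dbkey, bitcount, count


def merge_headers(inputs, output) -> int:
    # Read every input header, then write one merged header with the summed count.
    headers = [read_header(stream) for stream in inputs]
    version, dbkey, bitcount, _ = headers[0]
    for other_version, other_key, other_bits, _ in headers[1:]:
        # Assumption: databases being merged must agree on these fields.
        if (other_version, other_key, other_bits) != (version, dbkey, bitcount):
            raise ValueError("input databases have incompatible headers")
    total = sum(header[3] for header in headers)
    write_header(output, version, dbkey, bitcount, total)
    return total
```

With two in-memory inputs holding 10 and 5 entries under the same key, `merge_headers` writes a single header whose count is the sum; the fingerprint data that follows each header would be copied after this step.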

Mariewelt avatar Oct 03 '20 22:10 Mariewelt

Thank you very much @Mariewelt for doing this investigation! We obviously don't have sufficient testing set up for merge_database.py, so this was missed. Would you like to open a pull request with your changes?

lorton avatar Oct 04 '20 14:10 lorton