
List of database profiles A against profiles B

Open jianshu93 opened this issue 1 year ago • 7 comments

Dear EzAAI team,

I am wondering whether it is possible to provide an option for computing many pairs, as in fastANI: a list of query databases and a list of reference databases, with each query compared against each reference. This would be extremely useful for large numbers of pairs and for benchmarking speed. The same suggestion applies to orthoANIu (usearch is open source!), but I have no idea where to request the feature.

Thanks, Jianshu

jianshu93 avatar Jul 22 '24 05:07 jianshu93

Dear Jianshu,

This is possible by using a directory containing multiple databases as the input to the calculate module. The module will automatically collect all databases and perform a many-versus-many comparison. Please refer to our tutorial at: http://leb.snu.ac.kr/ezaai/tutorial#calculate

endixk avatar Jul 30 '24 03:07 endixk

If you are asking for an option to provide a list that defines a subset of the input directory, or a collection that spans multiple directories, that is currently not possible. One workaround is to create a separate folder and copy or move the databases you want to compare into it.
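A minimal sketch of that workaround, in case it helps: it reads a plain-text list of database paths (one per line) and copies them into a fresh staging folder. The function name `stage_databases`, the list-file format, and the `link` option are assumptions made for illustration and are not part of EzAAI.

```python
import shutil
from pathlib import Path


def stage_databases(list_file, staging_dir, link=False):
    """Collect the database files named in list_file into staging_dir.

    list_file: text file with one database path per line; blank lines and
               lines starting with '#' are ignored (assumed format).
    link:      if True, create symlinks instead of copies to save disk space.
    Returns the list of staged paths.
    """
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for line in Path(list_file).read_text().splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        src = Path(entry).expanduser().resolve()
        if not src.is_file():
            raise FileNotFoundError(f"database not found: {src}")
        dst = staging_dir / src.name
        # Two databases with the same file name would overwrite each other.
        if dst.exists() or dst.is_symlink():
            raise FileExistsError(f"duplicate file name in list: {src.name}")
        if link:
            dst.symlink_to(src)
        else:
            shutil.copy2(src, dst)
        staged.append(dst)
    return staged
```

Running it once for a query list and once for a reference list gives two folders that can then be passed to the calculate module as described in the tutorial.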

endixk avatar Jul 30 '24 03:07 endixk

Hello @endixk,

Yes, I was asking for a list of files, but it seems the folder approach should also work. What about extracting and building the databases? Can a folder of genomes also be extracted all at once?

Thanks,

Jianshu

jianshu93 avatar Jul 30 '24 03:07 jianshu93

Dear Jianshu,

Unfortunately no, the extract module currently works with one genome at a time. This is because, back then, I did not have a good way to automatically assign database names or labels for arbitrary input file names.

I agree that this is not very friendly for massive inputs, and I think it won't be too difficult to develop batch input for extraction. I'll try to implement this before the next release.
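Until batch input is available, one possible way to handle the naming problem is to derive each label from the genome file name. The sketch below only plans the work: it returns one (genome, database path, label) entry per genome and does not call EzAAI itself. The function name `plan_extraction`, the recognized suffixes, and the `.db` output extension are assumptions for illustration; the exact extract options should be taken from the tutorial.

```python
from pathlib import Path

# Assumed set of FASTA extensions; adjust to match your files.
GENOME_SUFFIXES = {".fa", ".fna", ".fasta"}


def plan_extraction(genome_dir, db_dir):
    """Return a sorted list of (genome_path, db_path, label) tuples.

    The label is the file name with its FASTA suffix removed, and the
    database is named '<label>.db' inside db_dir (hypothetical convention).
    Files without a recognized suffix are skipped.
    """
    plan = []
    seen = set()
    for genome in sorted(Path(genome_dir).iterdir()):
        if not genome.is_file():
            continue
        if genome.suffix.lower() not in GENOME_SUFFIXES:
            continue
        label = genome.stem
        # Labels must be unique, otherwise two genomes map to one database.
        if label in seen:
            raise ValueError(f"duplicate label derived from file names: {label}")
        seen.add(label)
        plan.append((genome, Path(db_dir) / f"{label}.db", label))
    return plan
```

Each tuple then corresponds to one run of the extract module, which a shell loop or a small wrapper script could issue in turn.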

endixk avatar Jul 30 '24 04:07 endixk

Thank you so much! This will make it very useful for large-scale comparisons.

Jianshu

jianshu93 avatar Jul 30 '24 05:07 jianshu93

Hi, sorry for being so late in coming back to this. I recently made an update that implements this feature: the extract module can now perform batch extraction from a directory using multiple threads. I will soon release a new stable version that includes it.

endixk avatar Jul 17 '25 13:07 endixk

Thanks! I look forward to it! -- Jianshu

jianshu93 avatar Jul 17 '25 21:07 jianshu93