
Linking UIDs

Open keplermears-hms opened this issue 6 months ago • 2 comments

The previous iteration of the database, PLSDB_2023_11_03_v2, had a single CSV containing the assembly, biosample, and nuccore IDs for all plasmids. That information now seems to be split up, which makes it difficult to cross-reference. Moreover, some information appears to be missing. For example, the first entry, nuccore UID 2737857725, does have an assembly that can be found on NCBI, GCA_039954855.1, but there is no reference to this assembly in assembly.csv, which is a little concerning unless I have made a mistake. It would be useful to have a data table like the one in the previous iteration that directly links all the UIDs. Is this available?
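For anyone who wants to check the cross-references themselves in the meantime, here is a minimal sketch of one way to do it with pandas. It is not an official PLSDB method. The column names (`NUCCORE_UID`, `ASSEMBLY_UID`, `BIOSAMPLE_UID`, and the `*_ACC` columns) and the idea that the nuccore table carries the assembly and biosample UIDs are assumptions that should be checked against the actual files. The rows below are made-up toy values, not real PLSDB records.

```python
import pandas as pd


def link_uids(nuccore, assembly, biosample):
    """Left-join assembly and biosample rows onto nuccore rows.

    The key column names are assumptions about the release layout.
    The indicator columns record whether each lookup found a match.
    """
    linked = nuccore.merge(
        assembly, on="ASSEMBLY_UID", how="left", indicator="assembly_match"
    )
    linked = linked.merge(
        biosample, on="BIOSAMPLE_UID", how="left", indicator="biosample_match"
    )
    return linked


# Toy tables with invented values, only to show the mechanics.
nuccore = pd.DataFrame(
    {
        "NUCCORE_UID": [1, 2, 3],
        "ASSEMBLY_UID": [10, 20, 30],
        "BIOSAMPLE_UID": [100, 200, 300],
    }
)
assembly = pd.DataFrame(
    {"ASSEMBLY_UID": [10, 30], "ASSEMBLY_ACC": ["TOY_ASM_A", "TOY_ASM_C"]}
)
biosample = pd.DataFrame(
    {
        "BIOSAMPLE_UID": [100, 200, 300],
        "BIOSAMPLE_ACC": ["TOY_BS_A", "TOY_BS_B", "TOY_BS_C"],
    }
)

linked = link_uids(nuccore, assembly, biosample)

# Rows whose assembly UID has no entry in the assembly table.
missing_assembly = linked[linked["assembly_match"] == "left_only"]
print(missing_assembly[["NUCCORE_UID", "ASSEMBLY_UID"]])
```

With the real release, the three DataFrames would presumably come from `pd.read_csv` on the per-table CSV files (only `assembly.csv` is named above; the other file names are guesses). Rows flagged `left_only` are the ones where a UID is referenced but has no matching entry, which is the situation described for the first nuccore entry.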

keplermears-hms • May 12 '25 19:05

+1 - I would like to update the database for Plassembler (https://github.com/gbouras13/plassembler), and it would be amazing to have the same data table as in the previous release.

George

gbouras13 • May 21 '25 07:05

+1

I tried to recreate the "old" metadata table format as best as I could (see the script and a companion script to compare the output to the old metadata file), but I had to make a lot of assumptions and guesses. An official way to recreate a table like the "old" metadata file would be appreciated.

Best, Richard

richardstoeckl • Jun 05 '25 07:06

Can we get an update on this issue?

richardstoeckl • Nov 17 '25 14:11