[BUG] gsm_to_srx false positives (reanalysis)

Open Maarten-vd-Sande opened this issue 1 year ago • 1 comments

Describe the bug

I'm not sure whether this is a bug on the pysradb side or on the SRA side, but gsm_to_srx seems to return some false positives when queried with a single GSM ID:

import pysradb
db_sra = pysradb.SRAweb()
db_sra.gsm_to_srx(["GSM1155957"])

  experiment_alias experiment_accession
3       GSM1621354            SRX893751
4       GSM1621353            SRX893750
5       GSM1621352            SRX893749
6       GSM1621351            SRX893748
7       GSM1155957            SRX298000

Could this be because the other GSM samples are reanalyses of the queried one? https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1155957

Maarten-vd-Sande, Jul 22 '22 08:07

That is correct: the results reflect what we see on the search page for this ID: https://www.ncbi.nlm.nih.gov/gds/?term=GSM1155957. We could handle this internally, but for now I would recommend subsetting the returned rows to those whose experiment_alias exactly matches the queried GSM ID.
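
A minimal sketch of that workaround, assuming the result is a pandas DataFrame with the experiment_alias and experiment_accession columns shown in the output above. The helper name keep_exact_matches is hypothetical and not part of pysradb, and the DataFrame is rebuilt by hand from the issue's output so the example runs without a network call:

```python
import pandas as pd


def keep_exact_matches(result, queried_ids, column="experiment_alias"):
    """Keep only rows whose alias is exactly one of the queried GSM IDs.

    Hypothetical helper for illustration; not part of pysradb.
    """
    mask = result[column].isin(queried_ids)
    return result[mask].reset_index(drop=True)


# Hand-built copy of the output reported in this issue. In practice this
# would come from db_sra.gsm_to_srx(["GSM1155957"]).
result = pd.DataFrame(
    {
        "experiment_alias": [
            "GSM1621354",
            "GSM1621353",
            "GSM1621352",
            "GSM1621351",
            "GSM1155957",
        ],
        "experiment_accession": [
            "SRX893751",
            "SRX893750",
            "SRX893749",
            "SRX893748",
            "SRX298000",
        ],
    },
    index=[3, 4, 5, 6, 7],
)

queried = ["GSM1155957"]
filtered = keep_exact_matches(result, queried)
print(filtered)
```

Using isin rather than a substring match means the same filter works unchanged when several GSM IDs are queried at once.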

saketkc, Jul 23 '22 14:07