ngs-tools

Relationship between the STAT analysis data available on the NCBI SRA Run Browser and that on the cloud platform

Open Junna-Kawasaki opened this issue 1 month ago • 2 comments

I am writing to seek your assistance with a question regarding the Cloud-based Taxonomy Analysis Information Table.

I noticed a discrepancy between the “identified_spot_count” available on the cloud platform and the "IDENTIFIED READS" displayed on the Sequence Read Archive Run Browser. For instance, in the case of ERR979125 on the Run Browser, 97.1% of the reads are listed as being of human origin (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=ERR979125&display=analysis).

However, the data retrieved from the cloud shows the following:

  • Identified Spot Count: 1945935
  • Analyzed Spot Count: 12149592
  • Total Spot Count: 789675109

Regardless of whether the denominator is the “analyzed_spot_count” or the “total_spot_count”, the resulting percentages (approximately 16.0% and 0.25%, respectively) are far lower than the 97.1% reported on the Run Browser.
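For reference, the arithmetic behind these two percentages can be reproduced with the short sketch below. It uses only the three counts quoted above; the variable names and the `percent` helper are illustrative and are not part of any NCBI tool or table schema.

```python
# Counts quoted above for ERR979125 (as retrieved from the cloud table).
identified_spot_count = 1945935
analyzed_spot_count = 12149592
total_spot_count = 789675109


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Identified spots as a share of the analyzed spots.
pct_of_analyzed = percent(identified_spot_count, analyzed_spot_count)

# Identified spots as a share of all spots in the run.
pct_of_total = percent(identified_spot_count, total_spot_count)

print(f"identified / analyzed: {pct_of_analyzed:.2f}%")
print(f"identified / total:    {pct_of_total:.3f}%")
```

Neither ratio comes close to the 97.1% shown on the Run Browser, which is the discrepancy in question.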

Could you kindly clarify the relationship between the data available on the Run Browser and that on the cloud platform?

I appreciate your assistance and look forward to your response.

Junna-Kawasaki, May 09 '24 02:05