
Mango UI does not display the complete patterns for Genome sequence

Open ssabnis opened this issue 7 years ago • 11 comments

I am finally able to run Mango, but I do not see the read patterns appear for the range in the browser. Is this a UI issue, or is it related to the data? If it is the data, can someone suggest a better dataset where I can see all the reads appear for the range? Thanks in advance.

Command: ./mango-submit ~/GCA_000001405.15_GRCh38_full_analysis_set.2bit -reads hdfs://hadoop-mynamenode:9000/NA12878_phased_possorted_bam.alignments.adam

image

ssabnis avatar Oct 20 '18 04:10 ssabnis

@akmorrow13 Can you shed some light on this? I do not see the whole patterns in the bottom portion of the browser. Thanks.

ssabnis avatar Oct 22 '18 17:10 ssabnis

Can you right click and see if there is a JavaScript console error? Also, when did you last build? There was a recent bug fix that may be related.

akmorrow13 avatar Oct 22 '18 18:10 akmorrow13

@akmorrow13 I just built the new code base and still see the same issue, with no JavaScript errors. Does it have something to do with the dataset I am using?

ssabnis avatar Oct 22 '18 21:10 ssabnis

Oh, my apologies. There is data (see the coverage track). To view reads, click on the settings wheel to "Show Alignments". By default, Mango only shows coverage. I can change this to show alignments by default.

akmorrow13 avatar Oct 22 '18 22:10 akmorrow13

@akmorrow13, cool, thanks, yes, that was it. I think showing alignments by default is better. Thanks much.

ssabnis avatar Oct 22 '18 22:10 ssabnis

@akmorrow13 Whenever I change the range in the browser, I see Mango send a query request. How does that work? Is Spark set up to run in cluster mode by default at that point? It seems to be running on the driver, but I may be wrong. I think it also slows down the overall rendering in the browser.

ssabnis avatar Oct 22 '18 22:10 ssabnis

There is a driver cache and a Spark RDD cache. When the data is in the driver cache, Mango reads it directly from there. When the data is not in the driver cache, it goes to Spark memory, which takes longer. When the data is also not in the RDD cache, it has to go to storage, which takes the longest.

akmorrow13 avatar Oct 22 '18 22:10 akmorrow13
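
The lookup order described above (driver cache, then Spark memory, then storage) can be sketched roughly as follows. This is a minimal illustration in plain Python, not Mango's actual code: the class name `TieredReadCache`, the `load_from_storage` callback, and the choice to copy data into the faster tiers after a miss are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# A genomic range: (contig name, start, end).
Region = Tuple[str, int, int]


class TieredReadCache:
    """Hypothetical three-tier lookup: driver cache, cluster cache, storage."""

    def __init__(self, load_from_storage: Callable[[Region], List[str]]):
        # Fastest tier: data already held on the driver.
        self.driver_cache: Dict[Region, List[str]] = {}
        # Middle tier: a plain dict standing in for a cached Spark RDD.
        self.cluster_cache: Dict[Region, List[str]] = {}
        # Slowest tier: a callback standing in for a read from HDFS or disk.
        self.load_from_storage = load_from_storage

    def fetch(self, region: Region) -> Tuple[List[str], str]:
        """Return the data for a region and the tier that served it."""
        if region in self.driver_cache:
            return self.driver_cache[region], "driver"
        if region in self.cluster_cache:
            data = self.cluster_cache[region]
            # Assumption: a cluster hit is copied into the driver cache.
            self.driver_cache[region] = data
            return data, "cluster"
        data = self.load_from_storage(region)
        # Assumption: a storage read populates both faster tiers.
        self.cluster_cache[region] = data
        self.driver_cache[region] = data
        return data, "storage"
```

Under these assumptions, the first request for a range pays the storage cost, and repeated requests for the same range are served from the driver, which would match the observation that a query is sent each time the range changes.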

@akmorrow13 Thanks, the UI worked for me. I now have this UI; can you help me interpret it, i.e. what it is that I am looking at? A few words would give me a sense of understanding. Thanks.

image

ssabnis avatar Oct 29 '18 03:10 ssabnis

@ssabnis It looks like your reference does not match your sample. Maybe your sample is aligned to hg19 instead of hg38. The top track shows the coverage of the sample you are viewing, and the bottom track shows the raw reads at this location in the genome.

akmorrow13 avatar Oct 29 '18 03:10 akmorrow13
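
One way to check for the kind of reference/sample mismatch described above is to compare contig names and lengths from the reference with those recorded for the sample. The sketch below is only an illustration: `compare_contigs` is a hypothetical helper, the dictionaries are typed in by hand rather than read from real files, and the contig names and lengths are the commonly cited GRCh38 and GRCh37 values for chromosomes 1 and 2, included as example data.

```python
from typing import Dict, List, Tuple


def compare_contigs(
    reference: Dict[str, int], sample: Dict[str, int]
) -> Tuple[List[str], List[str]]:
    """Return (sample contigs missing from the reference,
    contigs present in both whose lengths differ)."""
    missing = sorted(name for name in sample if name not in reference)
    length_mismatch = sorted(
        name
        for name in sample
        if name in reference and reference[name] != sample[name]
    )
    return missing, length_mismatch


# Illustrative values only (assumed, not read from the files in this thread).
grch38 = {"chr1": 248956422, "chr2": 242193529}
hs37d5_style = {"1": 249250621, "2": 243199373}

missing, mismatched = compare_contigs(grch38, hs37d5_style)
print("missing from reference:", missing)
print("length mismatches:", mismatched)
```

If either list is non-empty, the sample was most likely aligned to a different assembly or uses a different contig naming scheme than the reference loaded in the browser.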

@akmorrow13 Thanks. I have the following file, HG001.hs37d5.300x.bam. Can you suggest the corresponding fna and vcf files to use with Mango?

ssabnis avatar Oct 29 '18 03:10 ssabnis

A vcf file can be any variant file you choose to view. Some example files can be found at http://www.internationalgenome.org/data/ and are free to download and play around with. You can download vcf files there and visualize them in the Mango browser as --variants.

akmorrow13 avatar Oct 29 '18 19:10 akmorrow13