chromosomer

Assembled chromosome output

Open quocviet0908 opened this issue 6 years ago • 4 comments

Hi gtamazian. I followed your tutorial on how to assemble the chromosome, and it went rather smoothly. However, when I checked the output file, I noticed that the chromosome sequence was separated by a fragment identifier, like ...TGGCTTGAA >FRAGMENT2 CTCAGGGA...

Is this result okay?

quocviet0908, Jun 25 '18 14:06

Hello @quocviet0908 ,

Does your reference genome contain multiple chromosomes? It looks like the genome fragments were mapped to several chromosomes leading to more than one fragment in the produced assembly.
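
One quick way to check this is to list the sequence headers in the output FASTA file: each `>` line starts a separate assembled sequence. The following is a minimal sketch, not part of Chromosomer itself; the function name `fasta_headers` and the in-memory demo data (including the `FRAGMENT1` header) are assumptions made for illustration.

```python
import io


def fasta_headers(handle):
    """Return the header text of every record in a FASTA stream."""
    headers = []
    for line in handle:
        # A FASTA record starts with a line beginning with '>'.
        if line.startswith(">"):
            headers.append(line[1:].strip())
    return headers


# Hypothetical miniature of the assembled output described in this issue.
demo = io.StringIO(">FRAGMENT1\nTGGCTTGAA\n>FRAGMENT2\nCTCAGGGA\n")
names = fasta_headers(demo)
print(len(names), names)
```

For a real file, the same function can be called on an opened file handle, for example `with open(path) as handle: fasta_headers(handle)`, where `path` points to the assembled FASTA file. More than one header means the output contains more than one assembled sequence.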

gtamazian, Jun 27 '18 11:06

Yeah, I followed your tutorial, and the demo_chromosome.fa file contains multiple chromosomes. I would also like to ask about another problem. My .fna file contains some scaffolds, but when I BLASTed them against the reference genome (Bacillus subtilis 168), every scaffold was split and returned multiple hits, which puzzled me because I couldn't find the right gap value for the fragment map.

For instance, could you please download this .rar file and look into it? I extracted the sequence of scaffold 17, and its BLAST results are in the Excel file.

http://www.mediafire.com/file/ar92su98idt4yll/test_chromosomer.rar/file

quocviet0908, Jun 28 '18 03:06

It is quite normal that the alignments were split and you got multiple hits. According to the file that you attached, there is one high-score hit with a length of 610 bp, an identity of 99%, and a bit score of 1101. Its bit score far exceeds that of the next hit, which is 235, so Chromosomer will use the high-score hit as an anchor.
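
To illustrate the idea of choosing an anchor, here is a minimal sketch that picks the hit with the highest bit score from a list of hits. It is only an illustration under assumptions, not Chromosomer's actual selection code: the dictionary layout and the key names are hypothetical, and only the values quoted above (610 bp, 99%, 1101, and 235) come from the attached file.

```python
# Hits for one scaffold; key names are hypothetical, values are from this thread.
hits = [
    {"length": 610, "identity": 99.0, "bitscore": 1101.0},
    {"bitscore": 235.0},  # only the bit score of the next hit is quoted above
]

# The hit with the largest bit score is treated as the anchor in this sketch.
best = max(hits, key=lambda hit: hit["bitscore"])
print(best)
```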

The gap value can be obtained by comparing sizes of assembled fragments and a reference genome. If we denote the total size of the assembled fragments by A and the total reference genome size by R, then the required gap size G can be obtained by the following equation:

G = |A - R|/(N - M)

where N is the number of the assembled fragments and M is the number of the reference chromosomes. In your case, R = 4,215,606 and M = 1.
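
As a worked example, the formula can be evaluated with a few lines of Python. This is a sketch under stated assumptions: R and M are the values given above, while the total assembled size A and the fragment count N below are made-up placeholders that should be replaced with the numbers from your own assembly. The function name `gap_size` is also just illustrative.

```python
def gap_size(assembled_total, reference_total, n_fragments, n_chromosomes):
    """Estimate the gap size G = |A - R| / (N - M)."""
    n_gaps = n_fragments - n_chromosomes
    if n_gaps <= 0:
        # The formula is undefined unless there are more fragments than chromosomes.
        raise ValueError("need more fragments than reference chromosomes")
    return abs(assembled_total - reference_total) / n_gaps


R = 4_215_606  # reference genome size, from this thread
M = 1          # number of reference chromosomes, from this thread
A = 4_100_000  # hypothetical total size of the assembled fragments
N = 24         # hypothetical number of assembled fragments

G = gap_size(A, R, N, M)
print(round(G))
```

With these placeholder values, |A - R| is 115,606 bp spread over 23 gaps, so G is roughly 5,026 bp.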

gtamazian, Jul 03 '18 13:07

Hi @gtamazian, I appreciate your help. I'll try to solve this with your guidance.

quocviet0908, Jul 05 '18 13:07