kineticsTools
Determine significant coverage and score value in the ipdSummary gff
Hi All,
I would like to know what coverage and score values in the ipdSummary GFF are needed to consider a modified base call confident.
Example:
seqname | source | feature | start | end | score | strand | frame | coverage | context | IPDRatio | CognateBase |
---|---|---|---|---|---|---|---|---|---|---|---|
genome | kinModCall | modified_base | 64078 | 64078 | 42 | - | . | 764 | TTCGCAAGAAGACCTGAAGACCCTAGTGAAGTTTCTTCTTC | 1.53 | C |
genome | kinModCall | modified_base | 63115 | 63115 | 21 | - | . | 759 | TATAGTGAAATGAGAGGGAGTTACGAGGAGCAATGTAATGC | 1.41 | T |
genome | kinModCall | modified_base | 63168 | 63168 | 28 | - | . | 759 | AGCCATGCTTCGTTTGTGGAGGGGTGAAACATTTAGCTAAG | 1.46 | G |
genome | kinModCall | modified_base | 63203 | 63203 | 62 | - | . | 757 | AGGAATCCACATGGTCACAAGGGCAGAGTCACAAGAGCCAT | 1.74 | G |
genome | kinModCall | modified_base | 61924 | 61924 | 73 | - | . | 756 | TTCGGGAACATGATCTTGGAGGTAAATGTTTTCCACATTGC | 1.87 | G |
Thanks
Score depends on coverage, so it isn't always possible to define a single confident cutoff. I would plot the data that you have (coverage vs. score); the modified bases should form a distinct cluster. From the plot you should be able to define a function that distinguishes modified from unmodified bases. This is much easier if you have some kind of control, such as a known modified motif.
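A rough sketch of that workflow in Python, in case it helps. It assumes the GFF is tab-separated with the score in column 6 and `coverage=...` among the `key=value` attributes in column 9 (check this against your own file). The helper names and the `slope`/`intercept` values are hypothetical placeholders, not recommended cutoffs; you would pick them by eye from your own plot or from a control.

```python
def parse_gff_line(line):
    """Return (coverage, score) for one GFF record, or None for blank/comment lines.

    Assumes column 6 is the score and column 9 holds ';'-separated key=value
    attributes that include 'coverage'.
    """
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
    return int(attrs["coverage"]), float(fields[5])


def read_pairs(lines):
    """Collect (coverage, score) pairs from an iterable of GFF lines."""
    pairs = []
    for line in lines:
        rec = parse_gff_line(line)
        if rec is not None:
            pairs.append(rec)
    return pairs


def plot_coverage_vs_score(pairs):
    """Scatter plot of coverage vs. score; requires matplotlib to be installed."""
    import matplotlib.pyplot as plt

    coverage, score = zip(*pairs)
    plt.scatter(coverage, score, s=4)
    plt.xlabel("coverage")
    plt.ylabel("score")
    plt.show()


def is_confident(coverage, score, slope=0.05, intercept=20.0):
    """Hypothetical linear boundary: score must exceed slope * coverage + intercept.

    The default slope and intercept are placeholders only.
    """
    return score >= slope * coverage + intercept
```

Typical use would be `pairs = read_pairs(open("modifications.gff"))` (the filename is only an example), then `plot_coverage_vs_score(pairs)` to see where the cluster sits, then adjusting the boundary in `is_confident` to separate it from the background.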