gcv
Investigate conditions under which inversions are not represented in macrosynteny blocks
From earlier email exchanges, adf:
One thing while it's on my mind: you had mentioned inversions in the context of the chord view, and I think I have noticed that the new block generation algorithm is not finding them as often as they occurred in the precomputed data. I think I observed something along these lines when I was doing my own block construction using MCScanX on the gene families, but my impression is that it is more pronounced now (though I have not verified this, so we may have simply inherited the behavior from the algorithm you chose as your model). Anyway, it would be nice to be able to detect inversions at the macro scale that are at least as big as the classic example, which has 6 matched genes in the inverted segment. I tried cranking the macro-params to allow detection but it did not appear, though it's certainly possible that I have overlooked a subtlety. I have a vague recollection that this did appear in Steven's precomputed blocks, but would not swear to it.
@alancleary
Not sure why this would be happening; inverted blocks are computed in exactly the same way as forward-oriented blocks, and the two result sets have no effect on each other. At least they shouldn't... I'll look into it.
While addressing Issue #116 I reviewed the code in question and all seems to be in order. So perhaps, as you've suggested @adf-ncgr, this is expected behavior inherited from the model algorithm.
It just occurred to me (took a while) that we were having similar issues with inversions in the micro-synteny viewer, that is, some inversions that were expected weren't appearing. This was happening with both the repeat and msa algorithms. The solution we came up with was to prefix each gene family in the gene family sequences being aligned with the orientation of their corresponding gene, thus requiring the family AND orientation to be the same in order for them to match.
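For anyone following along, here is a minimal sketch of the orientation-prefix idea described above. It is not the actual GCV code: the `(family, strand)` tuple representation and the helper names `oriented_tokens`, `inverted`, and `positional_matches` are hypothetical, and a simple position-by-position comparison stands in for the real repeat/msa alignment.

```python
from typing import List, Tuple

# Hypothetical representation: (gene family ID, strand), strand is "+" or "-".
Gene = Tuple[str, str]


def families(genes: List[Gene]) -> List[str]:
    """Family-only tokens: orientation is ignored."""
    return [family for family, _ in genes]


def oriented_tokens(genes: List[Gene]) -> List[str]:
    """Prefix each family with its gene's strand, so family AND orientation must match."""
    return [strand + family for family, strand in genes]


def inverted(genes: List[Gene]) -> List[Gene]:
    """Reverse the gene order and flip each strand, as an inverted segment would appear."""
    flip = {"+": "-", "-": "+"}
    return [(family, flip[strand]) for family, strand in reversed(genes)]


def positional_matches(a: List[str], b: List[str]) -> int:
    """Count positions where the two token sequences agree (stand-in for an alignment score)."""
    return sum(x == y for x, y in zip(a, b))


# A query segment whose family order happens to read the same in both directions.
query = [("fam1", "+"), ("fam2", "+"), ("fam3", "-"), ("fam2", "+"), ("fam1", "+")]
# The same segment as it would appear inverted on another chromosome.
target = inverted(query)

# Family-only tokens: the forward comparison matches everywhere, hiding the inversion.
forward_family = positional_matches(families(query), families(target))

# Orientation-prefixed tokens: the forward comparison fails,
# and only the comparison against the re-inverted target matches.
forward_oriented = positional_matches(oriented_tokens(query), oriented_tokens(target))
inverted_oriented = positional_matches(
    oriented_tokens(query), oriented_tokens(inverted(target))
)
```

With family-only tokens the inverted target is indistinguishable from a forward match in this example, whereas the prefixed tokens match only in the inverted comparison. Whether the same effect explains the missing macro-synteny inversions is still to be confirmed.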
I just checked the code of the on-demand macro-synteny service; gene orientation is not being taken into consideration. Will investigate when time permits.