Clarification on GC Content Increment for G/C Reference Deletions

Open · xianyu0623 opened this issue 1 year ago · 1 comment

Hello,

Firstly, I want to express my appreciation for this project! I’m currently learning Rust for bioinformatics, and your codebase is one of the best practical tutorials I’ve encountered. However, I’ve come across something in the code that I don’t fully understand, and I’m hoping for some clarification.

In the following section of the code: https://github.com/google/best/blob/c1c69bb3341834e71c2f109ee2b69b19b9a51190/src/stats.rs#L470-L474

During the processing of deletions, when a G or C is encountered in the reference, the GC content is incremented. Could you explain the reasoning behind this? My understanding is that a deletion would reduce the GC content since there would be fewer Gs or Cs in the read sample.

xianyu0623 · Sep 06 '24 07:09

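@xianyu0623 I believe what is happening here is that gc_content is computed from the reference sequence rather than from the read. A deletion still spans bases in the reference, so a G or C in the deleted reference stretch is counted toward gc_content even though it is absent from the read.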
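
To make that reading concrete, here is a minimal sketch of counting GC on the reference side of an alignment. It is an illustration under assumptions, not the code in stats.rs: the `Op` enum, `is_gc`, and `reference_gc_count` are hypothetical names invented for this example, and the real project uses its own CIGAR types and bookkeeping.

```rust
/// Hypothetical, simplified CIGAR operations (not the types used in the project).
#[derive(Debug, Clone, Copy)]
enum Op {
    /// Aligned bases: consumes both read and reference.
    Match(usize),
    /// Insertion in the read: consumes read only.
    Ins(usize),
    /// Deletion from the read: consumes reference only.
    Del(usize),
}

/// Returns true for G or C, in either case.
fn is_gc(base: u8) -> bool {
    matches!(base.to_ascii_uppercase(), b'G' | b'C')
}

/// Counts G/C bases on the reference over the span covered by the alignment.
/// Assumption: GC content is defined on reference bases, so deleted
/// reference bases are counted exactly like matched ones.
fn reference_gc_count(reference: &[u8], start: usize, ops: &[Op]) -> usize {
    let mut ref_pos = start;
    let mut gc = 0;
    for op in ops {
        match *op {
            // Both operations advance along the reference, so both are counted.
            Op::Match(len) | Op::Del(len) => {
                gc += reference[ref_pos..ref_pos + len]
                    .iter()
                    .filter(|&&b| is_gc(b))
                    .count();
                ref_pos += len;
            }
            // Inserted read bases have no reference position and add nothing.
            Op::Ins(_) => {}
        }
    }
    gc
}
```

With the reference `ACGTGCA` and the operations `Match(2), Del(2), Match(3)`, this sketch counts 4 G/C bases: the deleted `GT` contributes its G because the count follows the reference, not the read.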

I'm sorry for the delayed response - I was out for some time last year and had not seen this. Please let me know if you have any further questions.

danielecook · Feb 05 '25 15:02