Michael Macias

Results: 68 comments by Michael Macias

Nice, this is a great initiative! Let's include something like this. Even though, as you mentioned, it's more algorithmic than I/O, pileup is a fairly common operation and is likely...

Thanks for the clarification. I found that this case was previously omitted in https://github.com/samtools/hts-specs/pull/496 due to the same confusion, so I agree it should be defined in the spec.

noodles-fasta can now seek/query bgzipped FASTA files. See [`fasta::IndexedReader`](https://docs.rs/noodles/0.29.0/noodles/fasta/struct.IndexedReader.html) and the [`fasta_query`](https://github.com/zaeleus/noodles/blob/e8c5fe9b1c9d4cd4f8801936ea6692397c8c589c/noodles-fasta/examples/fasta_query.rs) example.

In this case, noodles' SAM header parser is not overly strict. It is, however, spec-compliant. From [Sequence Alignment/Map Format Specification (2022-08-22) § 1.3 "The header section"](https://samtools.github.io/hts-specs/SAMv1.pdf#subsection.1.3):

> Platform/technology used to...

If the platform field value is the only blocker when reading, I would suggest preprocessing the raw SAM header before parsing, e.g., [1) just `illumina` or 2) generalized](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=116ec74955534e77c11c89242701c414).

> do...
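
As a rough illustration of the preprocessing idea, here is a minimal sketch using only the standard library. It is not the code from the linked playground, and `normalize_platform` is a hypothetical helper name. It assumes the only problem is letter case, i.e., that uppercasing the `PL` value of each `@RG` line (e.g., `illumina` to `ILLUMINA`) yields a value the spec accepts. It uses no noodles API; the returned string would then be handed to whichever header parser is in use.

```rust
/// Hypothetical helper: uppercases the value of every `PL` field on `@RG`
/// lines of a raw SAM header and leaves all other lines and fields untouched.
///
/// Assumption: the platform value is invalid only because of its letter case.
fn normalize_platform(raw_header: &str) -> String {
    raw_header
        .lines()
        .map(|line| {
            // Only read group records carry a platform (`PL`) field.
            if !line.starts_with("@RG\t") {
                return line.to_string();
            }

            // SAM header fields are tab-delimited `TAG:VALUE` pairs.
            line.split('\t')
                .map(|field| match field.strip_prefix("PL:") {
                    Some(value) => format!("PL:{}", value.to_ascii_uppercase()),
                    None => field.to_string(),
                })
                .collect::<Vec<_>>()
                .join("\t")
        })
        // `lines()` drops the terminators, so add one back per line.
        .map(|line| line + "\n")
        .collect()
}
```

The function takes the header as text, so it can be applied to the raw header string before any parsing happens. Values that are not valid platforms even after uppercasing would still need separate handling.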

Sorry for the long response time, and thanks for your patience. I don't think this is the best approach to this problem. Readers in noodles are largely agnostic to indices,...

Thanks for your interest and possible solution. A different approach was implemented by delegating to a seekable raw reader. See `{bgzf,fasta}::IndexedReader`.

Thanks for testing, @jkbonfield. 1) There's been no work to select more appropriate/optimal codecs for data series, so the current implementation will simply use gzip for all block data. There...

I'm closing this as stale. The genotypes parser has been improved since this issue, so do tell if you're still receiving the error.

Thanks for looking at this in the past. I'm closing this since the alignment parsers/writers have now diverged greatly and make the same checks.