Add missing samtools, bedtools, beagle formats

Open matuskalas opened this issue 2 years ago • 8 comments

Request from an email: Would it be possible for CRAM, CRAI, and Beagle to be added to the file formats in the ontology tree?

matuskalas avatar May 13 '22 13:05 matuskalas

The latest CRAM and CRAI (corresponding index) file format specification can be found at https://samtools.github.io/hts-specs/ - specifically http://samtools.github.io/hts-specs/CRAMv3.pdf

The latest Beagle file format specification can be found at https://faculty.washington.edu/browning/beagle/ - specifically https://faculty.washington.edu/browning/beagle/beagle_5.4_18Mar22.pdf

cristynkells avatar May 13 '22 13:05 cristynkells

Thank you so much @cristynkells 🙏🏽 This answered most of my questions before they were asked.

Beagle consumes and produces compressed VCF files. Is Bref the compression format that needs to be added to EDAM?

matuskalas avatar May 13 '22 16:05 matuskalas

Good idea on Bref:

The latest Bref3 file format specification can be found at https://faculty.washington.edu/browning/beagle - specifically https://faculty.washington.edu/browning/beagle/bref3.24May18.pdf

cristynkells avatar May 16 '22 16:05 cristynkells

Could I add the BEDPE format to the list of additional data formats? The format is described here: https://bedtools.readthedocs.io/en/latest/content/general-usage.html#bedpe-format

JenniferShelton avatar May 17 '22 17:05 JenniferShelton

Hi @matuskalas, we are still not seeing CRAI, Beagle, Bref, or BEDPE under the file formats in the ontology tree. We see that CRAM was added. Thank you!

cristynkells avatar Jun 03 '22 20:06 cristynkells

If I could chime in: we'd also like these (we also saw CRAM but not CRAI). Could tabix (.tbi) files be added as well? The specification is at http://samtools.github.io/hts-specs/tabix.pdf

allisonheath avatar Jul 29 '22 13:07 allisonheath

Hi there, I think CSI files (htslib’s successor to the BAI index format) are also missing among the Samtools formats. The file specification can be found here.

Cheers, and thank you for your effort, guys! EDAM's format hierarchy is making my life way easier. We may soon rely on your ontology for metadata validation, so my name may pop up in some other issues 😄

M-casado avatar Oct 06 '22 12:10 M-casado