feat: support GTF formats other than ensembl

Open ninsch3000 opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.
Currently, ZARP has only been tested with, and only works with, Ensembl-style GTF files. Using GTF files from other sources requires adaptations to the workflow.

Describe the solution you'd like
Identify which changes the workflow needs (chromosome names and some other fields will probably not be parsed correctly) and make them.

From a GitLab comment by @mkatsanto:

Currently we only support Ensembl. Maybe we could also test RefSeq and GENCODE. This would require conversion of, e.g., the chromosome name notation and some other fields towards the end.

  • Check results for genes that are the same across different annotations.
  • Check whether the extra features (3' UTRs, for example) are handled by ALFA or cause an issue.
  • Which rules do we expect to produce different results?
  • State clearly in the documentation that we mainly support Ensembl.
  • Mention that we haven't tested extensively for all organisms, so things might not work properly for rare organisms.
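
To make the chromosome name conversion mentioned above concrete, here is a minimal sketch, not part of ZARP. It assumes GENCODE/UCSC-style names (`chr1`, `chrM`) should become Ensembl-style names (`1`, `MT`). The function names `to_ensembl_seqname` and `convert_gtf_seqnames` and the `extra_mapping` parameter are hypothetical. RefSeq GTFs use accession-style names, so they would need an explicit lookup table, which the sketch only accepts as an argument. The "other fields towards the end" (the attribute column) are not handled.

```python
# Assumption: names that differ from Ensembl by more than the "chr" prefix.
SPECIAL_NAMES = {"chrM": "MT"}


def to_ensembl_seqname(name, extra_mapping=None):
    """Return an Ensembl-style sequence name for a GENCODE/UCSC-style one.

    extra_mapping is a hypothetical, user-supplied dict for names that cannot
    be derived by rule (e.g. RefSeq accessions).
    """
    mapping = dict(SPECIAL_NAMES)
    if extra_mapping:
        mapping.update(extra_mapping)
    if name in mapping:
        return mapping[name]
    if name.startswith("chr"):
        return name[3:]
    # Already Ensembl-style or unknown: leave unchanged.
    return name


def convert_gtf_seqnames(lines, extra_mapping=None):
    """Yield GTF lines with column 1 rewritten; comment and blank lines pass through."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.split("\t")
        fields[0] = to_ensembl_seqname(fields[0], extra_mapping)
        yield "\t".join(fields)


if __name__ == "__main__":
    record = 'chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972.5";\n'
    for converted in convert_gtf_seqnames([record]):
        print(converted, end="")
```

The genome FASTA would need the same renaming so that sequence names stay consistent between annotation and reference.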

Describe alternatives you've considered
Mention in the documentation that only Ensembl-style GTFs are supported.

ninsch3000, Mar 10 '23 16:03