chimeraviz
Better document how to create a file with protein domain coordinates
Although documented here, it should be easier to create the data needed for the protein domain plot. This issue will track my progress on this.
Initial ideas:
- create helper scripts that'll make it easy to generate the file needed
- create a vignette describing how the file can be created
Hi, I'm still a bit confused about this. For example, I have a BED file that I got from UCSC with a row like this:
chr11 | 114242180 | 114242249 | zinc finger | 1000 | + | 114242180 | 114242249 | 100,100,0 | 1 | 69 | 0 | Manually reviewed (Swiss-Prot) | zinc finger region | amino acids 490-512 on protein Q05516 | C2H2-type 4 | Q05516
So this is a region that is a known zinc finger. However, how would I then convert this so that I can use it with ?
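As a starting point, here is a minimal R sketch that pulls the useful pieces out of a row like the one above: the genomic interval, the protein accession, and the amino acid range. It only parses the row; it does not produce the file chimeraviz needs. The pipe-separated layout, the field positions, and the `domain` data frame with its column names are assumptions made for illustration.

```r
# One row of the UCSC track, copied from the example above (pipe-separated here).
row <- paste(
  "chr11 | 114242180 | 114242249 | zinc finger | 1000 | + | 114242180 |",
  "114242249 | 100,100,0 | 1 | 69 | 0 | Manually reviewed (Swiss-Prot) |",
  "zinc finger region | amino acids 490-512 on protein Q05516 |",
  "C2H2-type 4 | Q05516"
)

# Split on the pipe character and strip surrounding whitespace.
fields <- trimws(strsplit(row, "|", fixed = TRUE)[[1]])

# Find the description field and extract the amino acid range and accession.
desc <- grep("^amino acids", fields, value = TRUE)
pattern <- "amino acids ([0-9]+)-([0-9]+) on protein ([A-Z0-9]+)"
m <- regmatches(desc, regexec(pattern, desc))[[1]]

# Hypothetical intermediate table; the column names are not chimeraviz's.
domain <- data.frame(
  chrom         = fields[1],
  genomic_start = as.integer(fields[2]),  # BED starts are 0-based
  genomic_end   = as.integer(fields[3]),  # BED ends are exclusive
  name          = fields[4],
  strand        = fields[6],
  protein       = m[4],
  aa_start      = as.integer(m[2]),
  aa_end        = as.integer(m[3]),
  stringsAsFactors = FALSE
)

# Sanity check: the genomic span should be three bases per amino acid
# when the domain sits in a single exon, as it does in this row.
aa_length      <- domain$aa_end - domain$aa_start + 1L
genomic_length <- domain$genomic_end - domain$genomic_start
stopifnot(genomic_length == 3L * aa_length)
```

Note that this row is in genomic coordinates on chr11, whereas the example file shipped with chimeraviz is described as using transcript-level coordinates, so a further conversion step would still be needed.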
Since I received another request by email to better document this process, I'll just note here that I have not worked on this issue since it was created on Mar 6 2018, and I likely will not in the coming months. All the details I have at the moment are here: https://github.com/stianlagstad/chimeraviz/blob/80466b8fa6c7eb8fa21dc64f90a7a96d01f94e81/R/extdata.R#L192. That is:
#' protein_domains_5267 bed file
#'
#' Documentation for the protein_domains_5267.bed file containing protein
#' domains for the genes in the fusion with cluster_id=5267.
#'
#' @name raw_fusion5267proteindomains
#'
#' @section protein_domains_5267.bed:
#'
#' This file is an excerpt from a larger file that we created by:
#' - downloading domain name annotation from Pfam database (PfamA version 31)
#' and domain region annotation from Ensembl database through BioMart API
#' - switching the domain coordinates in the protein level to these in
#' transcript level.
NULL
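The second step in that description, moving domain coordinates from the protein level to the transcript level, comes down to codon arithmetic: amino acid `n` occupies coding bases `3n - 2` through `3n`. A minimal sketch of that arithmetic follows. The function name `protein_to_transcript`, the `cds_start` argument (the 1-based transcript position of the first base of the start codon, which you would have to look up yourself, for example via Ensembl BioMart), and the output column names `Start` and `End` are all assumptions for illustration. The source does not say whether chimeraviz expects positions relative to the start of the coding sequence or the start of the transcript, so check this against the plotting code before relying on it.

```r
# Convert a 1-based, inclusive amino acid range into a 1-based, inclusive
# nucleotide range. With cds_start = 1 the result is relative to the first
# base of the coding sequence; pass the real CDS start within the transcript
# to shift the result past the 5' UTR.
protein_to_transcript <- function(aa_start, aa_end, cds_start = 1L) {
  stopifnot(aa_start >= 1, aa_end >= aa_start, cds_start >= 1)
  data.frame(
    Start = cds_start + (aa_start - 1L) * 3L,  # first base of the first codon
    End   = cds_start + aa_end * 3L - 1L       # last base of the last codon
  )
}

# Amino acids 490-512 (the zinc finger from the question above),
# relative to the start of the coding sequence.
zf_cds <- protein_to_transcript(490L, 512L)

# The same domain, assuming a hypothetical 100-base 5' UTR.
zf_tx <- protein_to_transcript(490L, 512L, cds_start = 101L)
```

The resulting span covers 69 bases, which matches the width of the genomic interval in the UCSC row quoted earlier in this thread.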
Some things can also be learned by looking at the code behind the protein domain plot: https://github.com/stianlagstad/chimeraviz/blob/80466b8fa6c7eb8fa21dc64f90a7a96d01f94e81/R/plot_fusion_transcript_with_protein_domain.R.
If anyone does look into this then I would be very happy to approve a PR which adds further documentation of how to do this.