IsoformSwitchAnalyzeR
Error with "NA" gene_name entry during isoform switch plots
I am trying to run the second part of the analysis on StringTie data, using an improved genome annotation that is based on an initial Ensembl genome.
With the default Ensembl genome I can run the complete workflow. When using the merged StringTie annotation, however, I get the following error:
Step 1 of 5 : Importing external sequence analysis...
Step 2 of 5 : Analyzing alternative splicing...
Step 3 of 5 : Prediciting functional consequences...
Step 4 of 5 : Making indidual isoform switch plots...
|=========================== | 38%Error in .fun(piece, ...) :
Something went wrong with the gene_id/gene_name selection.
Please make sure your switchAnalyzeRlist was created with the newest version of IsoformSwitchAnalyzeR and try again.
Else contact developer with reproducible example.
Error: with piece 119:
gene_id gene_name condition_1 condition_2 gene_switch_q_value
1 MSTRG.16255 P4htm Control 02h 0.004894553
2 MSTRG.16255 <NA> Control 02h 0.004894553
switchConsequencesGene comparison switchConsequencesGeneDescription
1 TRUE Control_vs_02h with_consequences
2 TRUE Control_vs_02h with_consequences
outputName rank
1 ont//Control_vs_02h/ 39
2 ont//Control_vs_02h/ 40
No traceback available
In essence, it looks like a duplicated line in one of the data frames: only one of them has an assigned gene name, but the q-value is identical. I am currently running the latest version from the GitHub repository. The error was identical in the version distributed through Bioconductor.
When playing around with cutoff values I also see the same error coming up for other genes, each again showing one duplicated NA line.
UPDATE: As a test, I tried excluding NA gene names from the switchAnalyzeRlist, but I now see the same behavior with two otherwise identical lines from the same transcript. In the new case, one line is from a protein-coding gene while the other is from a miRNA annotation. It seems like this might be related to overlapping annotations.
I would be grateful for any ideas on how to fix the issue.
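A minimal sketch of how such cases could be listed up front, assuming the relevant columns are gene_id and gene_name as in the error output above. The toy data frame only stands in for the isoform annotation; the second gene (GENE.B with GeneB/MirB), the isoform IDs, and the variable names are hypothetical.

```r
# Toy stand-in for the isoform annotation table.
# MSTRG.16255 / P4htm / NA come from the error output; all other values are hypothetical.
isoformFeatures <- data.frame(
  isoform_id = c("tx1", "tx2", "tx3", "tx4", "tx5"),
  gene_id    = c("MSTRG.16255", "MSTRG.16255", "GENE.B", "GENE.B", "GENE.C"),
  gene_name  = c("P4htm", NA, "GeneB", "MirB", "GeneC"),
  stringsAsFactors = FALSE
)

# Count distinct gene_name values per gene_id (NA counts as its own value).
namesPerGene <- tapply(
  isoformFeatures$gene_name,
  isoformFeatures$gene_id,
  function(x) length(unique(x))
)

# gene_ids that map to more than one gene_name
ambiguousGenes <- names(namesPerGene)[namesPerGene > 1]
print(ambiguousGenes)
```

Any gene_id reported here carries more than one gene_name (including NA), which matches the pattern in the failing pieces.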
Thank you, Tobias
Hi Tobias
Thanks for reaching out. Which version of IsoformSwitchAnalyzeR are you using? You can check that via this R command:
packageVersion('IsoformSwitchAnalyzeR')
If you are not already using 1.9.3, could I get you to try that by updating via this command:
if (!requireNamespace("devtools", quietly = TRUE)){
    install.packages("devtools")
}
devtools::install_github("kvittingseerup/IsoformSwitchAnalyzeR", build_vignettes = TRUE)
Cheers Kristoffer
Dear Kristoffer,
thank you for your quick response. I am currently using the latest master branch version:
> packageVersion('IsoformSwitchAnalyzeR')
[1] ‘1.11.3’
Before that I was using 1.8.0 with R 3.6.3.
Thank you for your help!
Cheers, Tobias
Short Update:
Step 1 of 5 : Importing external sequence analysis...
Step 2 of 5 : Analyzing alternative splicing...
Step 3 of 5 : Prediciting functional consequences...
Step 4 of 5 : Making indidual isoform switch plots...
|======================================= | 56%Error in .fun(piece, ...) :
Something went wrong with the gene_id/gene_name selection.
Please make sure your switchAnalyzeRlist was created with the newest version of IsoformSwitchAnalyzeR and try again.
Else contact developer with reproducible example.
In addition: Warning messages:
1: Removed 1 rows containing missing values (geom_segment).
2: Removed 1 rows containing missing values (geom_segment).
Error: with piece 164:
gene_id gene_name condition_1 condition_2 gene_switch_q_value
1 MSTRG.3504 Ddx5 Control 08h 0.03983282
2 MSTRG.3504 Mir3064 Control 08h 0.03983282
switchConsequencesGene comparison switchConsequencesGeneDescription
1 TRUE Control_vs_08h with_consequences
2 TRUE Control_vs_08h with_consequences
outputName rank
1 ont//Control_vs_08h/ 23
2 ont//Control_vs_08h/ 24
I can circumvent this error by excluding exactly this gene_id. Overall, these issues seem to be related to the annotation, and probably to some cases of merged transcripts.
However, it would probably be good to catch these cases and inform the user, as the package already does for many other issues the data may have.
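As a rough sketch of what such a check could look like: the helper below reports gene_ids with more than one gene_name and drops them. dropAmbiguousGenes() is a hypothetical function operating on a plain data frame, not part of IsoformSwitchAnalyzeR, and GENE.C/GeneC is a made-up unaffected gene.

```r
# Hypothetical helper, NOT part of IsoformSwitchAnalyzeR:
# reports and removes gene_ids that carry more than one gene_name.
dropAmbiguousGenes <- function(isoformFeatures) {
  namesPerGene <- tapply(
    isoformFeatures$gene_name,
    isoformFeatures$gene_id,
    function(x) length(unique(x))
  )
  ambiguous <- names(namesPerGene)[namesPerGene > 1]
  if (length(ambiguous) > 0) {
    message(
      "Dropping gene_id(s) with more than one gene_name: ",
      paste(ambiguous, collapse = ", ")
    )
  }
  isoformFeatures[!isoformFeatures$gene_id %in% ambiguous, , drop = FALSE]
}

# Toy data: MSTRG.3504 values from the error output, GENE.C is hypothetical.
toy <- data.frame(
  gene_id   = c("MSTRG.3504", "MSTRG.3504", "GENE.C"),
  gene_name = c("Ddx5", "Mir3064", "GeneC"),
  stringsAsFactors = FALSE
)
cleaned <- dropAmbiguousGenes(toy)
print(cleaned)
```

This only mirrors the manual workaround on a data frame. Whether dropping or renaming is the right handling inside the package is a separate question.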
Just to be sure: Did you re-create the switchAnalyzeRlist (via the importRdata() function) with IsoformSwitchAnalyzeR 1.9.3? If not, could you try that?
If this does not work I would need a switchAnalyzeRlist object with the problematic gene to debug it. Could I persuade you to email me this? It can be created using this code snippet:
subsetSwitchAnalyzeRlist(
<aSwitchList>,
<aSwitchList>$isoformFeatures$gene_id == 'MSTRG.3504'
)
which you can save with the saveRDS() function. I will naturally keep it confidential and delete it after I'm done debugging it.
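For illustration, saving and re-loading with saveRDS()/readRDS() could look like the sketch below. A plain list stands in for the subsetted switchAnalyzeRlist so the example runs without the package; the file name is hypothetical.

```r
# Stand-in for the object returned by subsetSwitchAnalyzeRlist();
# a plain list is used here so the example runs on its own.
problemSubset <- list(
  isoformFeatures = data.frame(
    gene_id   = c("MSTRG.3504", "MSTRG.3504"),
    gene_name = c("Ddx5", "Mir3064"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical file name, written to a temporary directory.
rdsFile <- file.path(tempdir(), "MSTRG.3504_subset.rds")
saveRDS(problemSubset, file = rdsFile)

# The recipient can restore the object with readRDS().
restored <- readRDS(rdsFile)
print(identical(restored, problemSubset))
```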
Cheers Kristoffer
Dear Kristoffer,
Thank you for following up.
When I follow your instructions I end up with version 1.11.3, not 1.9.3. Is there another way to install exactly that version?
> if (!requireNamespace("devtools", quietly = TRUE)){
+ install.packages("devtools")
+ }
> devtools::install_github("kvittingseerup/IsoformSwitchAnalyzeR", build_vignettes = TRUE)
Skipping install of 'IsoformSwitchAnalyzeR' from a github remote, the SHA1 (5a7ab4af) has not changed since last install.
Use `force = TRUE` to force installation
> devtools::install_github("kvittingseerup/IsoformSwitchAnalyzeR", build_vignettes = TRUE, force=TRUE)
Downloading GitHub repo kvittingseerup/IsoformSwitchAnalyzeR@master
✔ checking for file ‘/tmp/user/1000/RtmpzGpYwY/remotes436424eb0696/kvittingseerup-IsoformSwitchAnalyzeR-5a7ab4a/DESCRIPTION’ (393ms)
─ preparing ‘IsoformSwitchAnalyzeR’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ installing the package to build vignettes
creating vignettes ...
✔ creating vignettes (2m 28.1s)
─ cleaning src
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘IsoformSwitchAnalyzeR_1.11.3.tar.gz’
Installing package into ‘/home/tjakobi/.R/3.6’
(as ‘lib’ is unspecified)
* installing *source* package ‘IsoformSwitchAnalyzeR’ ...
** using staged installation
** libs
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG -fpic -g -O2 -fdebug-prefix-map=/home/jranke/git/r-backports/buster/r-base-3.6.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c utils.c -o utils.o
gcc -std=gnu99 -shared -L/usr/lib/R/lib -Wl,-z,relro -o IsoformSwitchAnalyzeR.so utils.o -L/usr/lib/R/lib -lR
installing to /home/tjakobi/.R/3.6/00LOCK-IsoformSwitchAnalyzeR/00new/IsoformSwitchAnalyzeR/libs
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (IsoformSwitchAnalyzeR)
>
> packageVersion('IsoformSwitchAnalyzeR')
[1] ‘1.11.3’
However, I created the current object entirely with 1.11.3 in order to rule out any other side effects. I am happy to share the data file with you.
Thank you for your help.
Cheers, Tobias
Hi Tobias
Sorry for the confusion - you are right that 1.11.3 is the newest version that I wanted you to try with.
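As a small aside, R compares version numbers component by component, so 1.11.3 is indeed newer than 1.9.3. The variable names below are just for illustration.

```r
# Version numbers are compared component-wise (11 > 9), not as plain text.
installed <- package_version("1.11.3")
suggested <- package_version("1.9.3")
print(installed > suggested)

# compareVersion() returns 1 when the first version is the later one.
cmp <- utils::compareVersion("1.11.3", "1.9.3")
print(cmp)
```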
Do you still get the error? If so I would need a switchAnalyzeRlist object with the problematic gene to debug it. Could I persuade you to email me this? It can be created using this code snippet:
subsetSwitchAnalyzeRlist(
<aSwitchList>,
<aSwitchList>$isoformFeatures$gene_id == 'MSTRG.3504'
)
which you can save with the saveRDS() function. I will naturally keep it confidential and delete it after I'm done debugging it.
Cheers Kristoffer
Did you solve this problem, Tobias?
Dear @kvittingseerup,
thank you for following up, sorry for the long delay.
I was only able to "solve" it by specifically removing the problematic entries from the data structure.
I've generated the data file you requested and sent you a separate email with the download link.
Cheers, Tobias