2020plus
Require snvboxGenes.fa file
Hello,
I am using the 2020+ tool to identify potential candidate driver genes. I can download the required files, such as snvboxGenes.bed or scores.tar.gz, but I cannot find the exact file required for genes.fa in the following command:
mut_annotate --summary -i genes.fa -b genes.bed -s score_dir -m mutations.txt -o summary.txt
I tried various FASTA files generated from the UCSC Table Browser, but none of them worked. Can you share the exact FASTA file you used in your published work? Thanks.
Hi. The snvboxGenes.fa file (i.e. the input to -i in mut_annotate) is generated by the extract_gene_seq command (see https://probabilistic2020.readthedocs.io/en/latest/tutorial.html#gene-fasta). You just need to download the hg19 genome from UCSC (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit), convert it from 2bit to FASTA format using UCSC's twoBitToFa command line tool, and then run extract_gene_seq with hg19.fa and snvboxGenes.bed as input.
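For reference, the three steps above could look roughly like the sketch below. The -i/-b/-o flags for extract_gene_seq are my reading of the linked tutorial and the output name snvboxGenes.fa is just the conventional one, so please check `extract_gene_seq --help` before relying on them. The `run` wrapper and the DRY_RUN variable are hypothetical helpers added only so the commands can be previewed first; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: by default only print each command (dry run).
# Export DRY_RUN=0 to really execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Download the hg19 genome in 2bit format from UCSC.
run wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit

# 2. Convert 2bit to FASTA with UCSC's twoBitToFa tool.
run twoBitToFa hg19.2bit hg19.fa

# 3. Extract the gene sequences described in snvboxGenes.bed.
#    Flags assumed from the probabilistic2020 tutorial.
run extract_gene_seq -i hg19.fa -b snvboxGenes.bed -o snvboxGenes.fa
```

The resulting snvboxGenes.fa is then what gets passed to -i in the mut_annotate command.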
I've also attached the snvboxGenes.fa file here: snvboxGenes.fa.gz