kallisto
Long running time building index.idx for snDropSeq with kb ref
Hi there,
I have a question about the following kb ref command: is it appropriate for generating the index and reference files for single-nucleus DropSeq (snDropSeq)?
kb ref -i index.idx -g t2g.txt -f1 cdna.fa -f2 intron.fa -c1 cdna_t2c.txt -c2 intron_t2c.txt --workflow nucleus Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz Homo_sapiens.GRCh38.107.gtf.gz
I was able to generate the t2g.txt, cdna.fa, intron.fa, cdna_t2c.txt and intron_t2c.txt files on my home computer. However, building index.idx has been running for over 24 hours and still has not completed, I guess because I chose --workflow nucleus.
I also tried building an index with kallisto index from the ENSEMBL cdna.all file, as is done for bulk RNA-seq. The output of "kb count" then gave spliced.mtx and unspliced.mtx files of the same size, which makes me suspect that the index file is wrong.
I also tried "-n 8", following one of the kallisto bus tutorials, to split the index into 8 pieces, but my version of "kb ref" does not recognize "-n".
Many Thanks!
Jing Jing Liu
The command looks correct. Running the nucleus index takes a lot of RAM, so your home computer may not have enough memory.
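To see how much memory the machine actually has before starting the build, something like the following may help. It is only a minimal sketch: it assumes a Linux system with /proc/meminfo, and the helper name mem_gib is made up for illustration. It does not tell you how much RAM the nucleus index needs, only what is installed and currently free.

```shell
# Report total and currently available RAM in GiB.
# Assumption: Linux, where /proc/meminfo lists values in kB.
# mem_gib is a hypothetical helper name, not part of kb or kallisto.
mem_gib() {
    # $1 is a /proc/meminfo field name such as MemTotal or MemAvailable
    awk -v key="$1:" '$1 == key { printf "%.1f\n", $2 / 1048576 }' /proc/meminfo
}

echo "Total RAM:     $(mem_gib MemTotal) GiB"
echo "Available RAM: $(mem_gib MemAvailable) GiB"
```

If GNU time is installed, prefixing the kb ref command with /usr/bin/time -v should also print the "Maximum resident set size" when the run ends, which gives a rough idea of peak memory use.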
OK, I see. I will switch to an HPC cluster then, thanks!