kallisto
Wrong number of fastqs for 10xv1
I have 10X 3' v1 chemistry data with four fastq files:
EBT0_1A_S1_L001_I1_001.fastq.gz # 8 nt per read
EBT0_1A_S1_L001_R1_001.fastq.gz # 98 nt per read
EBT0_1A_S1_L001_R2_001.fastq.gz # 16 nt per read
EBT0_1A_S1_L001_R3_001.fastq.gz # 10 nt per read
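For anyone who wants to reproduce the per-file read lengths listed above, here is a minimal Python sketch. It assumes gzip-compressed FASTQ files with standard four-line records (sequence on a single line); the function name `read_lengths` and the `n_reads` parameter are hypothetical, chosen only for this illustration.

```python
import gzip
import sys
from itertools import islice


def read_lengths(path, n_reads=1000):
    """Return the set of sequence lengths seen in the first n_reads records.

    Assumes four-line FASTQ records, i.e. no wrapped sequence lines.
    """
    lengths = set()
    with gzip.open(path, "rt") as handle:
        # Only look at the first n_reads records (4 lines each).
        for i, line in enumerate(islice(handle, 4 * n_reads)):
            # The sequence is the second line of every record.
            if i % 4 == 1:
                lengths.add(len(line.rstrip("\n")))
    return lengths


if __name__ == "__main__":
    # Print the observed read lengths for each file given on the command line.
    for fastq_path in sys.argv[1:]:
        print(fastq_path, sorted(read_lengths(fastq_path)))
```

Saved under a name of your choice and run with the four file names as arguments, it prints the read lengths found in each file, which makes it easier to tell which file holds the barcode, the UMI, and the cDNA read.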
I believe this is v1 chemistry because of the web summary output from 10X.

When I run the following command:
/home/dan/.local/lib/python3.7/site-packages/kb_python/bins/linux/kallisto/kallisto bus \
-i /data/sai/kallisto_indices/human/index.idx \
-o /data/lab/datasets/Krishnaswamy_2017_Embryoid_Body_Timecourse/kallisto/EBT0_1A \
-x 10xv1 -t 1 \
EBT0_1A_S1_L001_I1_001.fastq.gz EBT0_1A_S1_L001_R1_001.fastq.gz EBT0_1A_S1_L001_R2_001.fastq.gz EBT0_1A_S1_L001_R3_001.fastq.gz
I get the following error:
Error: Number of files (4) does not match number of input files required by technology 10XV1 (3)
kallisto 0.46.1
I had no issues running velocyto on this data, and when I run bamtofastq on the possorted BAM file, I get these four files as output. Any idea what's going on? I have no idea why I would have an R3 file. Should I ignore it?
If I understand the output correctly, there should be two files per sample: usually *_L001_R1_001.fastq.gz, which is read 1 and contains the cell barcodes, and *_L001_R2_001.fastq.gz, which is read 2 and contains the RNA-seq reads. So I would just try using these two files as input.