MAESTRO
Chromap, Less than 5% barcodes can be found or corrected based on the barcode whitelist
Hi,
I am using MAESTRO to analyze a scATAC-seq dataset downloaded from the SRA database (accession number: SRR10399252), but I ran into this error:
Output file: Result/Mapping/SRR10399252_epilepsy/fragments_pre_corrected_dedup_count.tsv
Loaded all sequences successfully in 12.35s, number of sequences: 195, number of bases: 3099922541.
Kmer size: 17, window size: 7.
Lookup table size: 393150044, occurrence table size: 444597151.
Loaded index successfully in 30.40s.
Loaded 737280 barcodes in 1.45s.
Loaded sequence batch successfully in 0.82s, number of sequences: 500000, number of bases: 8000000.
Less than 5% barcodes can be found or corrected based on the barcode whitelist.
Please check whether the barcode whitelist matches the data, e.g. length, reverse-complement. If this is a false positive warning, please run Chromap with the option --skip-barcode-check.
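The log reports 500000 barcode reads totalling 8000000 bases, i.e. 16 bp per read, checked against a whitelist of 737280 barcodes. One way to follow the suggestion in the message is to sample the barcode reads and count exact whitelist hits in both orientations. The following is a minimal sketch, not part of MAESTRO or Chromap; the function names and the file paths in the usage comment are hypothetical, and it assumes a plain-text whitelist with one barcode per line and barcode reads in a separate FASTQ file.

```python
import gzip
from itertools import islice

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_sequences(fastq_path, limit):
    """Yield up to `limit` sequences from a FASTQ file (optionally gzipped)."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line FASTQ record.
        for line in islice(handle, 1, limit * 4, 4):
            yield line.strip()


def whitelist_match_rates(barcodes, whitelist):
    """Fraction of barcodes found in the whitelist, as-is and reverse-complemented."""
    barcodes = list(barcodes)
    if not barcodes:
        return {"forward": 0.0, "reverse_complement": 0.0}
    forward = sum(b in whitelist for b in barcodes)
    revcomp = sum(reverse_complement(b) in whitelist for b in barcodes)
    return {
        "forward": forward / len(barcodes),
        "reverse_complement": revcomp / len(barcodes),
    }


# Hypothetical usage (paths are placeholders, not taken from this thread):
# with open("whitelist.txt") as fh:
#     whitelist = {line.strip() for line in fh if line.strip()}
# rates = whitelist_match_rates(read_sequences("barcodes.fastq.gz", 100000), whitelist)
# print(rates)
```

If both rates are close to zero, the barcode read probably does not hold the cell barcode in the expected position or length; if only the reverse-complement rate is high, the whitelist orientation is the likely mismatch.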
I also tried minimap2 for mapping, but that failed with an error as well:
[Sat Jan 8 17:29:03 2022]
rule scatac_mergepeak:
input: Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_all_peaks.narrowPeak
output: Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_final_peaks.bed
jobid: 7
benchmark: Result/Benchmark/SRR12130207_Lega_42_PeakMerge.benchmark
wildcards: sample=SRR12130207_Lega_42
[Sat Jan 8 17:29:04 2022]
Error in rule scatac_mergepeak:
jobid: 7
output: Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_final_peaks.bed
shell:
cat Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_all_peaks.narrowPeak | sort -k1,1 -k2,2n | cut -f 1-4 > Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_cat_peaks.bed
mergeBed -i Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_cat_peaks.bed | grep -v '_' | grep -v 'chrEBV' > Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_final_peaks.bed
rm Result/Analysis/SRR12130207_Lega_42/SRR12130207_Lega_42_cat_peaks.bed
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
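One possible cause of this failure, not confirmed in this thread, is that no peaks survive the two `grep -v` filters: `grep` exits with status 1 when it outputs no lines, and under bash strict mode (which includes `pipefail`) that makes the whole rule fail. This happens, for example, when the `_all_peaks.narrowPeak` file is empty. The sketch below counts how many peak lines would pass the filters; the function name and the path in the usage comment are hypothetical, and it assumes a tab-separated narrowPeak file with the chromosome in the first column.

```python
def count_surviving_peaks(lines):
    """Return (total, kept) peak counts for the rule's chromosome filters.

    A peak is kept when its chromosome name contains neither '_' nor 'chrEBV',
    mirroring `grep -v '_' | grep -v 'chrEBV'` applied to mergeBed output,
    where the chromosome is the only text column.
    """
    total = 0
    kept = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        chrom = line.split("\t", 1)[0]
        if "_" not in chrom and "chrEBV" not in chrom:
            kept += 1
    return total, kept


# Hypothetical usage (path is a placeholder):
# with open("sample_all_peaks.narrowPeak") as fh:
#     total, kept = count_surviving_peaks(fh)
# print(f"{total} peaks called, {kept} on primary chromosomes")
```

If `kept` is 0, the final `grep` prints nothing and returns a non-zero status, so the real problem would be upstream (no peaks called, or unexpected chromosome names) rather than in the merge step itself.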
I can successfully run the test data provided by MAESTRO, so I am not sure whether the problem is caused by this scATAC-seq dataset itself. Thanks in advance!
Best regards, Min
Did you use Chromap's custom barcode format? If so, this was just fixed here and should work in the next Chromap release.