How to use bases 2-4 at the 5' end of reads as the UMI?

Open biofer opened this issue 2 years ago • 2 comments

Thanks for your great work. I want to remove the first base and use bases 2-4 as the UMI. Does fastp support this? I tried:

fastp ... \
      --trim_front1 1 \
      -U --umi_loc per_read --umi_prefix UMI --umi_len 3 --umi_skip 3 

but it uses bases 1-3 as the UMI.

biofer, Jul 05 '22 06:07

No. As described in the README, UMIs are extracted before global trimming, so the UMI is always taken from the untrimmed read. But fastp is nice, so just run it twice:

fastp ... \
  --trim_front1 1 \
  --stdout | fastp --stdin \
  -U --umi_loc per_read --umi_prefix UMI --umi_len 3 --umi_skip 3 

Even if fastp trimmed before extracting UMIs and restarted counting bases after global trimming, running it twice would make your intention explicit and make it easier for others to follow your code. Note that fastp will most likely never be the bottleneck of your analysis. Alternatively, you could just keep the first base as part of the UMI and, if necessary, remove it later.
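A rough sketch of that last alternative, in case it helps. It assumes single-end reads, that the UMI was extracted one base longer (for example with `--umi_len 4` instead of 3), and that fastp appended it to the read name as `:UMI_<bases>`, as the README describes for `--umi_prefix UMI`. The function name `strip_first_umi_base` and the sample record are made up for illustration; paired-end UMIs joined with an underscore would need a different pattern.

```shell
# Hypothetical helper (not part of fastp): drop the first base of the UMI tag
# in FASTQ header lines. Assumes the tag looks like ":UMI_<bases>".
strip_first_umi_base() {
  # Header lines are lines 1, 5, 9, ... of a 4-line-per-record FASTQ stream.
  awk 'NR % 4 == 1 { sub(/:UMI_[ACGTN]/, ":UMI_") } { print }'
}

# Made-up record whose 4-base UMI (NACG) still carries the unwanted first base.
printf '@read1:UMI_NACG 1:N:0:ATCACG\nTTGCA\n+\nIIIII\n' | strip_first_umi_base
```

Only the header is touched; the sequence and quality lines pass through unchanged, so this can be chained after fastp with `--stdout`.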

jan-glx, Mar 12 '23 13:03

See also #447

jan-glx, Mar 12 '23 13:03