
Save the Salmon and RSEM reference index in addition to the STAR index when running with --save_reference

Open amizeranschi opened this issue 2 years ago • 2 comments

Description of feature

When running the pipeline with --save_reference, currently only the STAR index is saved. It would be useful to also include the Salmon and RSEM index inside the ./genome/index/ directory.

amizeranschi avatar Jul 25 '23 17:07 amizeranschi

Hi @amizeranschi! We need a much better solution for dealing with reference genomes overall. We have been talking about this for a while, and ideally we would write a separate workflow that generates all of the genome assets required by the pipeline.

drpatelh avatar Oct 15 '23 11:10 drpatelh

I'm trying to run the pipeline on my laptop. I have the RSEM index generated already, but the pipeline fails as I don't have enough RAM (it needs 72 GB, I have 64 GB). It's very frustrating, as I do NOT want to use RSEM - I'm running star_salmon, so I don't understand why I still need the RSEM index. Saving indexes would really help with running smaller runs locally.

kvn95ss avatar Oct 25 '23 15:10 kvn95ss

@amizeranschi what Harshil says is true. But the pipeline should already be saving e.g. the Salmon index - not just the STAR one.

Note that this would only apply when a Salmon index is generated. This only happens when pseudo_aligner is set and Salmon is run in mapping-based mode, which is not the default behaviour.

RSEM_PREPAREREFERENCE_GENOME will not run unless aligner is set to star_rsem and params.rsem_index is not set.
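
To make the two conditions above concrete, here is a minimal Python sketch of the decision logic as described in this comment. It is not the pipeline's actual code: the function name `indices_to_build` is hypothetical, `pseudo_aligner == "salmon"` is a simplification of "pseudo_aligner is set and Salmon is run in mapping-based mode", and skipping the Salmon build when an index is already supplied is an assumption made by analogy with the RSEM case.

```python
def indices_to_build(aligner=None, pseudo_aligner=None,
                     rsem_index=None, salmon_index=None):
    """Return the indices that would be generated (hypothetical helper)."""
    built = []

    # Per the comment: the RSEM reference is prepared only when the aligner
    # is star_rsem and no pre-built RSEM index has been supplied.
    if aligner == "star_rsem" and rsem_index is None:
        built.append("rsem")

    # Per the comment: a Salmon index is generated only when Salmon runs as a
    # pseudo-aligner (mapping-based mode), which is not the default.
    # ASSUMPTION: a supplied salmon_index suppresses the build.
    if pseudo_aligner == "salmon" and salmon_index is None:
        built.append("salmon")

    return built


# Example calls covering the cases discussed in this thread.
print(indices_to_build(aligner="star_salmon"))
print(indices_to_build(aligner="star_rsem"))
print(indices_to_build(aligner="star_salmon", pseudo_aligner="salmon"))
```

Under this reading, only indices that are actually generated in a given run could be saved by --save_reference.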

I believe the issue in the OP might have been due to a misunderstanding about when the Salmon index is generated, and this is not in fact an ongoing issue, so I'm going to close. Please feel free to reopen if I am mistaken.

pinin4fjords avatar May 29 '24 15:05 pinin4fjords