Using Bowtie across multiple nodes, if available

Open BobLiterman opened this issue 7 years ago • 4 comments

Hey @anderspitman,

One major bottleneck with SISRS is that the Bowtie mapping steps run serially rather than in parallel. In part, this is because Bowtie itself does not run in MPI mode and therefore cannot fully utilize multi-node systems.

However, it may be possible to use a Python package (e.g. https://pypi.python.org/pypi/dispy) to distribute parallel jobs across nodes. For instance, the user could set a flag for the number of processors (which already exists) and another for the number of nodes (a putative new flag). If we could figure out a way to adapt the individual Bowtie calls to submit 'N' jobs to 'N' nodes based on those flags, we could greatly speed up the mapping process.
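As a rough illustration of the idea, here is a minimal sketch that only plans the work: it builds one Bowtie command per sample and assigns the commands round-robin to the requested nodes. Everything named here is an assumption for illustration, not SISRS code: the helper names `bowtie_command` and `plan_jobs`, the `bowtie2` invocation with single-end reads, the `<sample>.fastq` / `<sample>.sam` file naming, and the node names. Actually dispatching each command to its node (via dispy, a cluster scheduler, or something else) is left out.

```python
import shlex
from itertools import cycle


def bowtie_command(index, reads, out_sam, processors):
    """Build one bowtie2 call (assumed invocation; SISRS may use other options)."""
    return ["bowtie2", "-p", str(processors), "-x", index, "-U", reads, "-S", out_sam]


def plan_jobs(samples, nodes, index, processors):
    """Assign one mapping job per sample to the nodes in round-robin order.

    Returns a dict mapping each node name to the list of shell commands
    that node would run. Nothing is executed here.
    """
    plan = {node: [] for node in nodes}
    for node, sample in zip(cycle(nodes), samples):
        # Hypothetical file naming: <sample>.fastq in, <sample>.sam out.
        cmd = bowtie_command(index, f"{sample}.fastq", f"{sample}.sam", processors)
        plan[node].append(shlex.join(cmd))
    return plan


if __name__ == "__main__":
    # Hypothetical inputs: three taxa, two nodes, eight processors per job.
    plan = plan_jobs(["taxonA", "taxonB", "taxonC"], ["node1", "node2"], "contigs", 8)
    for node, commands in plan.items():
        for command in commands:
            print(f"{node}: {command}")
```

With the processors flag controlling `-p` and the putative nodes flag controlling the length of `nodes`, each node would get roughly `len(samples) / N` mapping jobs instead of one machine running all of them in sequence.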

This is a major benefit of the Python port, so we wanted to bring it up now as things are getting worked out.

Best, Bob

BobLiterman, Nov 20 '17 17:11

@BobLiterman On cursory inspection, I don't see any reason why we couldn't do this. Where exactly in SISRS is the mapping code you're referring to? You're not talking about mapContigs, right?

anderspitman, Nov 21 '17 21:11

The Bowtie commands in alignContigs and identifyFixedSites.

BobLiterman, Nov 22 '17 14:11

Came across scoop today. Might be a good alternative. Posting here to remember to check it out later.

anderspitman, Jan 31 '18 20:01

Looks promising.

BobLiterman, Jan 31 '18 20:01