Multi-threading, banding, other speed-ups

Open jaysunl opened this issue 1 year ago • 3 comments

Platform

  • SeqAn version: 3
  • Operating system: Linux raptor.ucsd.edu 5.4.0-149-generic #166-Ubuntu SMP Tue Apr 18 16:51:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Compiler: gcc (Ubuntu 13.1.0-8ubuntu1~20.04.2) 13.1.0

Question

Can someone explain how to use multi-threading to gain a significant speed-up? My multi-threaded version seems to be slower than the version without multi-threading. An example code snippet would help (ideally one that reads FASTA files, but a vector example also works). I tried following the example in the docs and the speed didn't improve for me.

jaysunl avatar May 19 '24 21:05 jaysunl

Hi @jaysunl, can you please specify what you are trying to do? The best way to do this would be to give a minimal working example of what you are parallelizing and how you are doing it. Best regards

rrahn avatar May 21 '24 08:05 rrahn

I tried following the example in the docs and the speed didn't improve for me.

Looks like you forgot to reference the example you tried out?

eseiler avatar May 21 '24 09:05 eseiler

Yes, apologies. I tried these examples. First, multi-threading with a callback:

    using namespace seqan3::literals;
    using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
    auto start = std::chrono::high_resolution_clock::now();

    std::vector<sequence_pair_t> sequences{100000, {"CAGGCATGAGCCACTACTCCTGTTTTTTAGAGGATATAGATAGAATGGATCCTGTGTCCCATAATAAATTAAGGGCAACTTGTCACACCCCTTCCATACAAAGACTGAATCAGCAGACACCACAGCCAAATCAGAGGGAAGGATGGCATGGGCTTGCTTGGTTAAGCAACAGAATAACAGCAATAATAACATAAATATAATTGCAATTTATGAGTTCTTGTTATTTGCCAGGTTCTGTAATTAATGCCATCATTAC"_dna4, 
    "AATACCTGTTTTTAGAGGTATAGTAATAGAGTAGATGTGCCTCCCATAATAAATAGGGCTACTTGTACAAATACCCACCTTCCAACAAAGGACCTAATCAGCAGACACAAGAGCCAAAGCAGAGCGAAGGAATGCACATGGGCTTAGCTTGTAAAGCAAAGAGTAACAGCAAAAAATCATAAATTAAATTTCCAATTTAGGTTCATTTCATTGCCAGGTATCGAATCAATGGCTGATATTACTATCTACTTTTTGT"_dna4}};

    auto alignment_config = seqan3::align_cfg::method_global{} 
                                | seqan3::align_cfg::scoring_scheme{
                                  seqan3::nucleotide_scoring_scheme{}}  
                                | seqan3::align_cfg::gap_cost_affine{} 
                                | seqan3::align_cfg::output_score{} 
                                | seqan3::align_cfg::output_alignment{}
                                | seqan3::align_cfg::parallel{4};
    std::mutex write_to_debug_stream{};
    auto const alignment_config_with_callback = alignment_config |
                                                seqan3::align_cfg::on_result{[&] (auto && result)
                                                {
                                                    std::lock_guard sync{write_to_debug_stream}; // critical section
                                                    //seqan3::debug_stream << result << '\n';
                                                }};
    seqan3::align_pairwise(sequences, alignment_config_with_callback);

and then multi-threading without a callback:

    using namespace seqan3::literals;
    using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
    auto start = std::chrono::high_resolution_clock::now();

    std::vector<sequence_pair_t> sequences{100000, {"CAGGCATGAGCCACTACTCCTGTTTTTTAGAGGATATAGATAGAATGGATCCTGTGTCCCATAATAAATTAAGGGCAACTTGTCACACCCCTTCCATACAAAGACTGAATCAGCAGACACCACAGCCAAATCAGAGGGAAGGATGGCATGGGCTTGCTTGGTTAAGCAACAGAATAACAGCAATAATAACATAAATATAATTGCAATTTATGAGTTCTTGTTATTTGCCAGGTTCTGTAATTAATGCCATCATTAC"_dna4, 
    "AATACCTGTTTTTAGAGGTATAGTAATAGAGTAGATGTGCCTCCCATAATAAATAGGGCTACTTGTACAAATACCCACCTTCCAACAAAGGACCTAATCAGCAGACACAAGAGCCAAAGCAGAGCGAAGGAATGCACATGGGCTTAGCTTGTAAAGCAAAGAGTAACAGCAAAAAATCATAAATTAAATTTCCAATTTAGGTTCATTTCATTGCCAGGTATCGAATCAATGGCTGATATTACTATCTACTTTTTGT"_dna4}};

    auto alignment_config = seqan3::align_cfg::method_global{} 
                                | seqan3::align_cfg::scoring_scheme{
                                  seqan3::nucleotide_scoring_scheme{}}  
                                | seqan3::align_cfg::gap_cost_affine{} 
                                | seqan3::align_cfg::output_score{} 
                                | seqan3::align_cfg::output_alignment{}
                                | seqan3::align_cfg::parallel{4};
    seqan3::align_pairwise(sequences, alignment_config);

and then standard sequential procedure:

    using namespace seqan3::literals;
    using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
    auto start = std::chrono::high_resolution_clock::now();

    std::vector<sequence_pair_t> sequences{100000, {"CAGGCATGAGCCACTACTCCTGTTTTTTAGAGGATATAGATAGAATGGATCCTGTGTCCCATAATAAATTAAGGGCAACTTGTCACACCCCTTCCATACAAAGACTGAATCAGCAGACACCACAGCCAAATCAGAGGGAAGGATGGCATGGGCTTGCTTGGTTAAGCAACAGAATAACAGCAATAATAACATAAATATAATTGCAATTTATGAGTTCTTGTTATTTGCCAGGTTCTGTAATTAATGCCATCATTAC"_dna4, 
    "AATACCTGTTTTTAGAGGTATAGTAATAGAGTAGATGTGCCTCCCATAATAAATAGGGCTACTTGTACAAATACCCACCTTCCAACAAAGGACCTAATCAGCAGACACAAGAGCCAAAGCAGAGCGAAGGAATGCACATGGGCTTAGCTTGTAAAGCAAAGAGTAACAGCAAAAAATCATAAATTAAATTTCCAATTTAGGTTCATTTCATTGCCAGGTATCGAATCAATGGCTGATATTACTATCTACTTTTTGT"_dna4}};

    auto alignment_config = seqan3::align_cfg::method_global{} 
                                | seqan3::align_cfg::scoring_scheme{
                                  seqan3::nucleotide_scoring_scheme{}}  
                                | seqan3::align_cfg::gap_cost_affine{} 
                                | seqan3::align_cfg::output_score{} 
                                | seqan3::align_cfg::output_alignment{};
    // notice: no parallel specification
    seqan3::align_pairwise(sequences, alignment_config);

All three versions ran at about the same speed, and in some cases the parallel version was actually slower. Increasing the number of alignments and the thread count did not make much of a difference either. Somewhat unrelated: a local alignment is sometimes slower than a global alignment, which seems odd to me. Banding also does not reduce the alignment time by much. Any tips?

jaysunl avatar May 21 '24 16:05 jaysunl
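
One thing worth checking in the snippets above: the two versions without a callback discard the return value of `seqan3::align_pairwise`. Assuming that return value is a lazily evaluated result range, as the SeqAn3 documentation describes for `seqan3::algorithm_result_generator_range`, the alignments may never be computed in those runs, so their timings would not be comparable to the callback version. The sketch below illustrates the pitfall with the standard library only. `lazy_squares`, `time_ms`, and `consume` are hypothetical stand-ins introduced here for illustration and are not SeqAn3 types or functions.

```cpp
#include <chrono>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily evaluated result range (NOT a SeqAn3 type):
// constructing it does no work; each call to next() computes exactly one result.
class lazy_squares
{
public:
    explicit lazy_squares(std::vector<long> input) : input_{std::move(input)} {}

    // Computes and returns the next result, or std::nullopt once exhausted.
    std::optional<long> next()
    {
        if (pos_ == input_.size())
            return std::nullopt;
        long const value = input_[pos_++];
        ++computed_;
        return value * value;
    }

    // Number of results that have actually been computed so far.
    std::size_t computed() const { return computed_; }

private:
    std::vector<long> input_;
    std::size_t pos_{0};
    std::size_t computed_{0};
};

// Wall-clock duration of fn() in milliseconds, measured with a monotonic clock.
template <typename fn_t>
double time_ms(fn_t && fn)
{
    auto const start = std::chrono::steady_clock::now();
    fn();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Drains the range so that every result is really computed. The checksum is
// returned so that the compiler cannot discard the work.
long consume(lazy_squares & results)
{
    long sum = 0;
    while (auto result = results.next())
        sum += *result;
    return sum;
}
```

Timing only the construction of `lazy_squares` would measure almost nothing, whereas wrapping `consume` in `time_ms` measures the real work. With SeqAn3, the analogous step would be to iterate over the results of `seqan3::align_pairwise` (for example with a range-based `for` loop) inside the timed region, and to do so in both the sequential and the parallel run before comparing them.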