giraffe performance
Hello,
I encountered the following warnings while running the vg Giraffe workflow:
......
Not counting CPU instructions because perf events are unavailable: No such file or directory
warning[vg::Watchdog]: Thread 20 has been checked in for 10 seconds processing: HWI-D00258:298:HT27CBCXX:1:1101:1173:2223, HWI-D00258:298:HT27CBCXX:1:1101:1173:2223
warning[vg::Watchdog]: Thread 20 finally checked out after 11 seconds and 0 kb memory growth processing: HWI-D00258:298:HT27CBCXX:1:1101:1173:2223, HWI-D00258:298:HT27CBCXX:1:1101:1173:2223
warning[vg::Watchdog]: Thread 20 has been checked in for 10 seconds processing: HWI-D00258:298:HT27CBCXX:1:1101:2219:2239, HWI-D00258:298:HT27CBCXX:1:1101:2219:2239
warning[vg::Watchdog]: Thread 20 finally checked out after 13 seconds and 0 kb memory growth processing: HWI-D00258:298:HT27CBCXX:1:1101:2219:2239, HWI-D00258:298:HT27CBCXX:1:1101:2219:2239
warning[vg::Watchdog]: Thread 20 has been checked in for 10 seconds processing: HWI-D00258:298:HT27CBCXX:1:1101:2107:2242, HWI-D00258:298:HT27CBCXX:1:1101:2107:2242
warning[vg::Watchdog]: Thread 20 finally checked out after 31 seconds and 0 kb memory growth processing: HWI-D00258:298:HT27CBCXX:1:1101:2107:2242, HWI-D00258:298:HT27CBCXX:1:1101:2107:2242
warning[vg::Watchdog]: Thread 20 has been checked in for 10 seconds processing: HWI-D00258:298:HT27CBCXX:1:1101:2485:2210, HWI-D00258:298:HT27CBCXX:1:1101:2485:2210
warning[vg::Watchdog]: Thread 20 finally checked out after 13 seconds and 0 kb memory growth processing: HWI-D00258:298:HT27CBCXX:1:1101:2485:2210, HWI-D00258:298:HT27CBCXX:1:1101:2485:2210
......
Giraffe also ran very slowly during execution. How can I resolve this issue?
Thank you.
What are you trying to do, exactly? How did you build the graph, and based on what data? What are the reads? How did you build the indexes, and which commands are you using to run Giraffe? What kind of system are you running Giraffe on?
Dear sir @jltsiren, @sdws1983 and I used the commands below to build the graph and map NGS data.
#build graph pangenome
vg autoindex -T ./ --workflow giraffe -r $fa "${VCF_ARGS[@]}" -p $prefix
# mapping
vg giraffe -t 10 -p -Z 22.anchorwave.giraffe.gbz -m 22.anchorwave.min -d 22.anchorwave.dist -f 1803_1.fq.gz -f 1803_2.fq.gz
The commands are correct; however, we have encountered an issue with our servers. We have an older server and a new server configured this year. Despite the new server's superior hardware, "vg giraffe" runs much slower on it and appears to be incompatible with it. Specifically, on the old server, a 24-hour run with 10 threads produces an 80 GB GAM file, but on the new server, a 50-hour run with 5 threads yields only a 186 MB GAM file, which is far too slow.
Old server output log: Xuan-15.gam.log
New server output log (too large to attach in full, so only the output of head -n 5000): Xuan-15.gam_head-5000.log
system information
How should I use vg giraffe on the new server?
If you have any further questions or need additional information, please don't hesitate to contact me at any time.
Version of vg:
Is it a problem with the vm.overcommit_memory setting?
One common issue is the distance index, which is a memory-mapped file. Memory-mapped files do not work well on some network drives. You can try making the distance index file read-only to see if that solves the issue. And if that fails, you could try copying the distance index to a local drive and using that copy with Giraffe.
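The second suggestion can be sketched as a small shell helper. This is only an illustration: `stage_index` and the directory `/tmp/giraffe_idx` are hypothetical names, not part of vg, and the directory should be replaced with a path on a disk that is actually local to the server. The helper copies the index to the local directory, removes write permission from the copy, and prints the new path.

```shell
# Hypothetical helper: copy an index file to a local directory and make
# the copy read-only. Prints the path of the local copy on success.
stage_index() {
    src=$1          # index file, possibly on a network drive
    local_dir=$2    # directory on a local disk (assumed, adjust as needed)
    dest="$local_dir/$(basename "$src")"
    mkdir -p "$local_dir" || return 1
    cp -f "$src" "$dest" || return 1
    chmod a-w "$dest" || return 1
    printf '%s\n' "$dest"
}

# Example use with the file names from this thread:
#   DIST=$(stage_index 22.anchorwave.dist /tmp/giraffe_idx)
#   vg giraffe -t 10 -p -Z 22.anchorwave.giraffe.gbz -m 22.anchorwave.min \
#       -d "$DIST" -f 1803_1.fq.gz -f 1803_2.fq.gz
```

To try only the first suggestion, `chmod a-w 22.anchorwave.dist` on the original file is enough.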
Thanks for your quick response. I will give it a try.
you could try copying the distance index to a local drive and using that copy with Giraffe.
The engineer resolved the server issue based on this information. Thanks for your help; Giraffe now runs successfully.
Dear sir @jltsiren, I have encountered another issue on the new server. I constructed a pangenome using Minigraph-Cactus with the default indexes: 97samples.d2.dist, 97samples.d2.gbz, and 97samples.d2.min. Some samples align successfully, while others get stuck: the alignment does not complete, the size of the GAM file stops changing, and the status of the vg process is 'S'. I'm not sure what causes this. Could you please advise me on how to resolve it?
vg version
vg [warning]: System's vm.overcommit_memory setting is 2 (never overcommit). vg does not work well under these conditions; you may appear to run out of memory with plenty of memory left. Attempting to unsafely reconfigure jemalloc to deal better with this situation.
vg version v1.53.0 "Valmontone"
Compiled with g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on Linux
Linked against libstd++ 20230528
Built by [email protected]
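The warning in the output above refers to the Linux kernel setting `vm.overcommit_memory`. A minimal sketch for checking it, assuming a Linux system where `/proc/sys/vm/overcommit_memory` is readable; the function name `describe_overcommit` is made up for illustration.

```shell
# Hypothetical helper: translate the numeric overcommit mode into words.
describe_overcommit() {
    case "$1" in
        0) echo "0: heuristic overcommit (kernel default)" ;;
        1) echo "1: always overcommit" ;;
        2) echo "2: never overcommit (the mode vg warns about)" ;;
        *) echo "$1: unrecognized value" ;;
    esac
}

# Report the current mode if the kernel exposes it.
if [ -r /proc/sys/vm/overcommit_memory ]; then
    describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
fi
```

Changing the value requires root (for example `sysctl -w vm.overcommit_memory=0`), and whether that is appropriate on a shared server is a decision for its administrator.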
I should take a look at this issue https://github.com/vgteam/vg/issues/4020. Please wait for my feedback. Thank you for your patience.
Status S means that the main thread is suspended. Other threads are still running, which means they are still doing some work.
Since the Giraffe processes need only a few gigabytes of memory, you should try running the problematic samples on another system. Maybe even on your laptop. If that run also gets stuck, the issue is probably with the graph and/or the reads. If not, it's probably a system issue on the server.
I also noticed two things. First, the server has a lot of free memory, but very little of it is used for caching files. Second, it is using tens of gigabytes of swap at the same time. I don't know what would cause a situation like that, but it could interfere with the memory-mapped distance index.
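To see what the individual threads of a stuck process are doing, and how much memory goes to file caching versus swap, something like the following could help. It is a sketch that assumes a Linux `/proc` filesystem; `thread_states` and `mem_summary` are hypothetical helper names, and the process ID of the vg process has to be looked up separately (for example with `pgrep vg`).

```shell
# Hypothetical helper: print "<thread id> <state>" for every thread of a
# process. State letters come from the kernel: R = running, S = sleeping,
# D = uninterruptible (usually I/O) wait.
thread_states() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -r "$task/status" ] || continue
        state=$(awk '/^State:/ {print $2}' "$task/status")
        printf '%s %s\n' "${task##*/}" "$state"
    done
}

# Hypothetical helper: show free memory, page cache size, and swap totals.
mem_summary() {
    awk '/^(MemFree|Cached|SwapTotal|SwapFree):/ {print $1, $2, $3}' /proc/meminfo
}

# Example use (replace 12345 with the PID of the vg process):
#   thread_states 12345
#   mem_summary
```

If the worker threads alternate between R and S while the GAM file stays the same size, the process is still computing; many threads in D would point more toward I/O, such as the memory-mapped index.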
There are 97 samples in total. In the default filtering step of Minigraph-Cactus, nodes with a haplotype count of less than or equal to 2 are removed (97samples.d2.dist). Inspired by the human pangenome paper, I constructed a 97samples.d10.dist index for the pangenome instead. With it, the alignment finally completed successfully.
vg giraffe -t 10 -p -Z 97samples.d10.gbz -m 97samples.d10.min -d 97samples.d10.dist -f 7164_1.clean.fastq.gz -f 7164_2.clean.fastq.gz 1>7164.gam 2>7164.log
Thanks for your help.