
bcftools 1.20 view command seems to use all available CPUs despite the --threads 1 argument

Open yangyxt opened this issue 4 months ago • 6 comments

I installed bcftools via conda, and it seems to use all available CPUs when executing a normal bcftools view command, even though I tried to limit the thread count with "--threads 1".

Below is a snapshot of the htop interface; as you can see, one bcftools view process opens tons of subprocesses despite the --threads 1 argument.

[htop screenshot]

I'm really eager to understand why this happens. Please take a look at your convenience. The OS is CentOS 7.

yangyxt avatar Sep 09 '25 07:09 yangyxt

I can't see the command, but possibly you gave it many files on the command line?

Using --threads 0 disables multi-threading completely and is probably what you want.

--threads N sets up a thread pool to share between tasks, which includes decoding bgzf files (vcf.gz or bcf). This is done asynchronously, so there is one I/O thread per file in addition to the CPU thread pool. This is basically to let the OS do reading in the background, as permitted by the system I/O scheduler.

Hence --threads 1 will be using one worker thread for CPU (hence the 100% CPU utilisation) and potentially many I/O threads, which will be bottlenecked on the CPU and so essentially just idle, not even waiting on I/O. We see this as 0.0% CPU utilisation across the board.
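A quick way to see this from the shell is to check the kernel's per-process thread count (a sketch; the `in.vcf.gz` input and the invocation are illustrative). Note that htop lists each thread as a separate row by default, which can look like many subprocesses; the H key toggles thread display.

```shell
# Sketch: count the threads a running bcftools process has spawned.
# NLWP is the kernel's per-process thread count; htop shows each of
# these threads as its own row by default. Input file is illustrative.
bcftools view --threads 1 in.vcf.gz > /dev/null &
pid=$!
ps -o nlwp= -p "$pid"   # total threads: main + workers + per-file I/O threads
wait "$pid"
```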

jkbonfield avatar Sep 09 '25 08:09 jkbonfield

> I can't see the command, but possibly you gave it many files on the command line?
>
> Using --threads 0 disables multi-threading completely and is probably what you want.
>
> --threads N sets up a thread pool to share between tasks, which includes decoding bgzf files (vcf.gz or bcf). This is done asynchronously, so there is one I/O thread per file in addition to the CPU thread pool. This is basically to let the OS do reading in the background, as permitted by the system I/O scheduler.
>
> Hence --threads 1 will be using one worker thread for CPU (hence the 100% CPU utilisation) and potentially many I/O threads, which will be bottlenecked on the CPU and so essentially just idle, not even waiting on I/O. We see this as 0.0% CPU utilisation across the board.

Thank you very much for the swift response. I just ran bcftools view --threads 1 <a_single_vcf_path> and the snapshot is what I got. I then tried --threads 0 and it still opens tons of I/O threads, and the main command uses up to 400% CPU according to the snapshot below:

[htop screenshot]

yangyxt avatar Sep 09 '25 09:09 yangyxt

That's definitely odd. I don't see how with --threads 0 (or no --threads option at all) it can ever use more than 100% CPU. What is your full command line, and is it including pipes or other sub-processes somehow in the htop report?

jkbonfield avatar Sep 11 '25 10:09 jkbonfield

> That's definitely odd. I don't see how with --threads 0 (or no --threads option at all) it can ever use more than 100% CPU. What is your full command line, and is it including pipes or other sub-processes somehow in the htop report?

Hi, sorry for the late reply; I have been sick for several days. The full command is as simple as bcftools view -H (the -H is not important, as the issue has happened repeatedly with or without it).

[screenshot]

If my memory is accurate, this issue started to emerge when I upgraded bcftools to 1.20. Again, I'm running on a CentOS 7 HPC.

yangyxt avatar Sep 16 '25 13:09 yangyxt

That's definitely odd. In the code we have

args->n_threads = 0;
...
if ( args->n_threads > 0)
    hts_set_opt(args->out, HTS_OPT_THREAD_POOL, args->files->p);

so no threading is done unless set explicitly.

Is it reproducible? Does it happen when the --threads option is not given as well? Did you compile the program from source?

pd3 avatar Oct 13 '25 08:10 pd3

A practical workaround to control CPU usage when bcftools jobs seem to ignore --threads is to manage CPU affinity externally, at the system level, instead of relying on --threads (which only limits BGZF compression/decompression threads, not the process's CPU affinity).

You can create a small CPU pool file (e.g. /tmp/core_pool.$USER) listing the available cores, one per line (e.g. 0–15). Before running bcftools, each job locks the pool using flock, takes one available core ID, and executes the command with taskset -c <core_id> bcftools ...

When the job finishes, the core ID is released back to the pool.

This ensures that each job stays pinned to its assigned CPU core, so it can never consume more than one core's worth of CPU time, regardless of how many threads bcftools spawns internally or what the system scheduler does.

You can then integrate this wrapper into a ParaFly command file (or any other job scheduler) and simply control total concurrency with ParaFly -CPU N; this way, exactly N cores are used.

This approach guarantees deterministic CPU usage and prevents oversubscription on shared servers or HPC nodes, even when bcftools itself does not fully respect thread limits.

I did this and it worked here

LuizCarlosMachado avatar Oct 21 '25 20:10 LuizCarlosMachado