Hi, I am merging two .pairs.gz files, but ran into this error:
$ pairtools merge --max-nmerge 7 --nproc 8 --memory 100G --tmpdir ./ --output ./merge_KO.pairs.gz ./*.gz
Traceback (most recent call last):
File "/home/unix/bai/mambaforge/envs/microc/bin/pairtools", line 11, in <module>
sys.exit(cli())
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/cli/merge.py", line 134, in merge
merge_py(
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/cli/merge.py", line 194, in merge_py
merged_header = headerops.merge_headers(headers)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 730, in merge_headers
new_pairheader = _merge_pairheaders(pairheaders, force=False)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 686, in _merge_pairheaders
chroms_merged = merge_chrom_lists(*chrom_lists)
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 576, in merge_chrom_lists
chrom_list = list(_toposort(g.copy(), tie_breaker=min))
File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 561, in _toposort
raise ValueError("Circular dependencies exist: {} ".format(list(dag.items())))
ValueError: Circular dependencies exist: [('chr10', {'chr9'}), ('chr11', {'chr10'}), ('chr12', {'chr11'}), ('chr13', {'chr12'}), ('chr14', {'chr13'}), ('chr15', {'chr14'}), ('chr16', {'chr15'}), ('chr17', {'chr16'}), ('chr18', {'chr17'}), ('chr19', {'chr18'}), ('chr2', {'chr19'}), ('chr20', {'chr19', 'chr2'}), ('chr21', {'chr20'}), ('chr22', {'chr21'}), ('chr3', {'chr22', 'chr2'}), ('chr4', {'chr3'}), ('chr5', {'chr4'}), ('chr6', {'chr5'}), ('chr7', {'chr6'}), ('chr8', {'chr7'}), ('chr9', {'chr8'}), ('chrM', {'chr9', 'chrY'}), ('chrX', {'chr22', 'chrM'}), ('chrY', {'chrX'})]
I ran into this as well while validating a pipeline with some test data. It seems to occur when the .pairs files passed to the merge command have been merged more than once. In my pipeline, I merge two .pairs files representing technical replicates into a single .pairs file representing a biological replicate, and then merge three .pairs files representing biological replicates into a final condition.pairs file. This produces the circular dependency error. Eliminating one of the technical replicates also eliminates the error.
Adding the --keep-first-header argument eliminated the error, so I am doing that for the time being.
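For what it's worth, the pairs in the error message (chr2 after chr19, but also chr3 after chr2) look consistent with headers that list chromosomes in two different orders, one lexicographic (chr1, chr10, ..., chr19, chr2, ...) and one natural (chr1, chr2, chr3, ...). I have not confirmed that this is what happens in my files. Below is a minimal sketch of how such a conflict turns into a cycle. It uses Python's standard graphlib, not pairtools' own code, and the function names and the short chromosome lists are made up for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def precedence_graph(*chrom_lists):
    """Map each chromosome to the set of chromosomes that directly
    precede it in at least one of the input lists."""
    graph = {}
    for chroms in chrom_lists:
        for chrom in chroms:
            graph.setdefault(chrom, set())
        for prev, cur in zip(chroms, chroms[1:]):
            graph[cur].add(prev)
    return graph


def try_merge_order(*chrom_lists):
    """Return one order compatible with every input list,
    or None if the lists contradict each other (a cycle)."""
    sorter = TopologicalSorter(precedence_graph(*chrom_lists))
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# Hypothetical, shortened chromosome lists (not taken from real headers).
natural = ["chr1", "chr2", "chr10", "chrX"]
lexicographic = sorted(natural)  # chr10 now sorts before chr2

print(try_merge_order(natural, natural))        # same order twice: mergeable
print(try_merge_order(natural, lexicographic))  # conflicting orders: None
```

If that is the cause here, it would also explain why --keep-first-header avoids the error: with only the first header kept, there is no second chromosome order to reconcile.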