running pairtools merge sequentially produces circular dependencies

Open jiangshan529 opened this issue 1 year ago • 1 comments

Hi, I am trying to merge two .pairs.gz files, but the merge fails with the following error:

$ pairtools merge --max-nmerge 7 --nproc 8 --memory 100G --tmpdir ./ --output ./merge_KO.pairs.gz ./*.gz
Traceback (most recent call last):
  File "/home/unix/bai/mambaforge/envs/microc/bin/pairtools", line 11, in <module>
    sys.exit(cli())
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/cli/merge.py", line 134, in merge
    merge_py(
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/cli/merge.py", line 194, in merge_py
    merged_header = headerops.merge_headers(headers)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 730, in merge_headers
    new_pairheader = _merge_pairheaders(pairheaders, force=False)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 686, in _merge_pairheaders
    chroms_merged = merge_chrom_lists(*chrom_lists)
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 576, in merge_chrom_lists
    chrom_list = list(_toposort(g.copy(), tie_breaker=min))
  File "/home/unix/bai/mambaforge/envs/microc/lib/python3.9/site-packages/pairtools/lib/headerops.py", line 561, in _toposort
    raise ValueError("Circular dependencies exist: {} ".format(list(dag.items())))
ValueError: Circular dependencies exist: [('chr10', {'chr9'}), ('chr11', {'chr10'}), ('chr12', {'chr11'}), ('chr13', {'chr12'}), ('chr14', {'chr13'}), ('chr15', {'chr14'}), ('chr16', {'chr15'}), ('chr17', {'chr16'}), ('chr18', {'chr17'}), ('chr19', {'chr18'}), ('chr2', {'chr19'}), ('chr20', {'chr19', 'chr2'}), ('chr21', {'chr20'}), ('chr22', {'chr21'}), ('chr3', {'chr22', 'chr2'}), ('chr4', {'chr3'}), ('chr5', {'chr4'}), ('chr6', {'chr5'}), ('chr7', {'chr6'}), ('chr8', {'chr7'}), ('chr9', {'chr8'}), ('chrM', {'chr9', 'chrY'}), ('chrX', {'chr22', 'chrM'}), ('chrY', {'chrX'})]

jiangshan529 • Oct 18 '23 01:10

I ran into this as well while validating a pipeline with some test data. It seems to occur when the .pairs files input to the merge command have been merged more than once. In my pipeline, I merge two .pairs files representing technical replicates into a single .pairs file representing a biological replicate, and then merge three .pairs files representing biological replicates into a final condition.pairs file. This creates the circular dependency error. Eliminating one of the technical replicates also eliminates the circular dependency issue.
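For anyone trying to make sense of the error message: the dependency list in the traceback looks consistent with headers that list chromosomes in two different orders, one natural (chr1, chr2, chr3, ...) and one lexicographic (chr1, chr10, ..., chr19, chr2, chr20, ...), since chr2 ends up depending on chr19 while chr10 depends on chr9. That is only my reading of the output, and the sketch below is not the pairtools implementation. It is a minimal illustration, with hypothetical function names, of how combining two conflicting orders into one precedence graph leaves a cycle that no topological sort can resolve.

```python
from collections import defaultdict


def build_precedence_graph(*chrom_lists):
    """Map each chromosome to the set of chromosomes that directly
    precede it in any of the input lists (hypothetical helper)."""
    graph = defaultdict(set)
    for chroms in chrom_lists:
        for chrom in chroms:
            graph[chrom]  # make sure every chromosome is a node
        for prev, curr in zip(chroms, chroms[1:]):
            graph[curr].add(prev)
    return dict(graph)


def has_cycle(graph):
    """Repeatedly drop nodes with no remaining predecessors.
    If nodes are left over and none can be dropped, there is a cycle."""
    remaining = {node: set(preds) for node, preds in graph.items()}
    while remaining:
        free = [node for node, preds in remaining.items() if not preds]
        if not free:
            return True
        for node in free:
            del remaining[node]
        for preds in remaining.values():
            preds.difference_update(free)
    return False


# Toy chromosome lists, shortened for illustration.
natural = ["chr1", "chr2", "chr3", "chr10"]
lexicographic = ["chr1", "chr10", "chr2", "chr3"]

conflicting = has_cycle(build_precedence_graph(natural, lexicographic))
consistent = has_cycle(build_precedence_graph(natural, natural))
print(conflicting, consistent)
```

With the two toy lists, chr10 must come after chr3 (natural order) and chr2 must come after chr10 (lexicographic order), so chr2, chr3 and chr10 form a loop once chr1 is removed.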

Adding the --keep-first-header argument eliminated the error, so I am doing that for the time being.
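If it helps with debugging, here is a rough sketch for comparing the chromosome order recorded in two files before merging them. It assumes the headers carry `#chromsize:` lines as described in the .pairs format, and that the order of those lines is what matters here; I have not confirmed the latter against the pairtools source. The function names and the file paths in the usage note are hypothetical.

```python
import gzip


def chrom_order_from_header(lines):
    """Collect chromosome names from '#chromsize:' header lines,
    stopping at the first non-header (data) line."""
    chroms = []
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends where the data begins
        if line.startswith("#chromsize:"):
            fields = line.split()
            if len(fields) >= 2:
                chroms.append(fields[1])
    return chroms


def read_chrom_order(path):
    """Read the chromosome order from a gzipped .pairs file."""
    with gzip.open(path, "rt") as handle:
        return chrom_order_from_header(handle)


def orders_conflict(order_a, order_b):
    """True if the chromosomes shared by both lists appear in a
    different relative order."""
    shared = set(order_a) & set(order_b)
    filtered_a = [c for c in order_a if c in shared]
    filtered_b = [c for c in order_b if c in shared]
    return filtered_a != filtered_b
```

Usage would look something like `orders_conflict(read_chrom_order("rep1.pairs.gz"), read_chrom_order("rep2.pairs.gz"))`, where both file names are placeholders for your own inputs.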

bskubi • Jan 20 '24 01:01