pairtools sort issue
Hello,
I'm running v1.1.0 and following the pipeline from the Omni-C documentation, but I keep running into this error:
pairtools sort --tmpdir=/path/to/temp --nproc 36 parsed.pairsam > sorted.pairsam
Traceback (most recent call last):
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/bin/pairtools", line 11, in <module>
sys.exit(cli())
^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/pairtools/cli/__init__.py", line 183, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/pairtools/cli/sort.py", line 128, in sort
sort_py(
File "/mnt/sas0/AD/rem2/.conda/envs/pairtools/lib/python3.11/site-packages/pairtools/cli/sort.py", line 220, in sort_py
for line in body_stream:
File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 1903: invalid start byte
I tried using iconv but it doesn't look like that worked. Any suggestions?
Hi, since pairtools sort is regularly tested, I would first assume that something is wrong with the input. The error looks very specific ("...byte 0xfe in position 1903"). Could it be that you're dealing with a corrupted file? Have you tried opening it with less and checking whether it has reasonable content and terminates properly?
Alternatively, please check this thread: https://stackoverflow.com/questions/19699367/for-line-in-results-in-unicodedecodeerror-utf-8-codec-cant-decode-byte
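If paging through the file with less is impractical, the offending bytes can be located with a short script. This is only a sketch, not part of pairtools: `find_invalid_utf8` is a hypothetical helper that reads the file in binary mode and reports which lines fail to decode as UTF-8. Note that the "position 1903" in the traceback is likely relative to the chunk being decoded rather than to the whole file, so a line number is usually easier to act on.

```python
import sys


def find_invalid_utf8(path, max_hits=5):
    """Return (line_number, byte_offset_in_line, bad_byte) for lines that are not valid UTF-8.

    Hypothetical helper for illustration; stops after max_hits problems.
    """
    hits = []
    with open(path, "rb") as handle:  # binary mode, so nothing is decoded implicitly
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                # err.start is the offset of the first undecodable byte within this line
                hits.append((line_number, err.start, raw[err.start]))
                if len(hits) >= max_hits:
                    break
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    for line_number, offset, byte in find_invalid_utf8(sys.argv[1]):
        print(f"line {line_number}, byte offset {offset}: 0x{byte:02x}")
```

Saved under a name of your choice and run with the path to parsed.pairsam as its only argument, it prints the first few problem lines, which can then be inspected directly to see whether the file is truncated, binary, or mixed with output from another tool.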
We test pairtools only on Linux and only on .pairs[am] files produced by pairtools parse, in which case utf-8 normally works. Could it be that the input file was generated either on another OS or not by pairtools parse?
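One quick way to check what kind of file this actually is would be to look at its first bytes. The sketch below is an assumption-laden illustration, not pairtools code: `sniff_input` is a hypothetical helper, and it assumes that an uncompressed .pairs/.pairsam file begins with a `## pairs format` header line, while gzip and LZ4 frame files begin with their standard magic bytes. A compressed or otherwise binary file passed under a plain .pairsam name would be one plausible source of a byte like 0xfe.

```python
GZIP_MAGIC = b"\x1f\x8b"           # standard gzip magic number
LZ4_MAGIC = b"\x04\x22\x4d\x18"    # LZ4 frame format magic number
PAIRS_HEADER = b"## pairs format"  # assumed first line of a text .pairs/.pairsam file


def sniff_input(path):
    """Guess the file type from its leading bytes (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(LZ4_MAGIC):
        return "lz4"
    if head.startswith(PAIRS_HEADER):
        return "pairs-text"
    return "unknown"
```

A result of "gzip" or "lz4" would suggest the file needs a matching extension or decompression before sorting; "unknown" would suggest the file did not come straight from pairtools parse.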
OK, closing for now. Feel free to reopen if more information comes up.