split_pod5 supported seed types error
When running:
$ duplex_tools --version
Duplex Sequencing Tools 0.3.2
$ python --version
Python 3.11.0
$ duplex_tools split_pairs --threads 2 dorado.moves.bam OESO_152_LSK114/20230322_1606_3G_PAO27011_7b4991d0/ OESO_152_LSK114_pod5s_splitduplex/
I get the following exception:
[07:46:49 - SplitPairs] Finished finding breakpoints.
[07:46:49 - SplitPairs] Splitting 47685 reads from: OESO_152_LSK114/20230322_1606_3G_PAO27011_7b4991d0/ into OESO_152_LSK114_pod5s_splitduplex/
[07:46:49 - SplitPairs] Splitting OESO_152_LSK114/20230322_1606_3G_PAO27011_7b4991d0/pass.pod5
0it [00:00, ?it/s]
0it [00:11, ?it/s]
Traceback (most recent call last):
File "/lustre/scratch126/casm/team154pc/mp15/duplex-tools.venv/bin/duplex_tools", line 33, in <module>
sys.exit(load_entry_point('duplex-tools==0.3.2', 'console_scripts', 'duplex_tools')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/scratch126/casm/team154pc/mp15/duplex-tools.venv/lib/python3.11/site-packages/duplex_tools/__init__.py", line 39, in main
args.func(args)
File "/lustre/scratch126/casm/team154pc/mp15/duplex-tools.venv/lib/python3.11/site-packages/duplex_tools/split_pairs.py", line 134, in main
split_pairs(
File "/lustre/scratch126/casm/team154pc/mp15/duplex-tools.venv/lib/python3.11/site-packages/duplex_tools/split_pairs.py", line 53, in split_pairs
_ = split_pod5(
^^^^^^^^^^^
File "/lustre/scratch126/casm/team154pc/mp15/duplex-tools.venv/lib/python3.11/site-packages/duplex_tools/split_pairs_steps.py", line 138, in split_pod5
rd.seed(read.read_id)
File "/software/python-3.11.0/lib/python3.11/random.py", line 160, in seed
raise TypeError('The only supported seed types are: None,\n'
TypeError: The only supported seed types are: None,
int, float, str, bytes, and bytearray.
Hi @mp15, thanks for raising this.
This is likely a new issue caused by a change in how the random module works. Can you try with Python 3.10 and see if the issue persists? I'll add this as a to-do for 3.11.
I can confirm that the same data works under Python 3.10.1, so it's definitely a 3.11 oddity for your list.
Hi, I am also using duplex_tools v0.3.2, installed via pip on Python 3.11, and it gives the same error message as above for the split_pairs run. I hacked "split_pairs_steps.py" to print each read ID and its class just before the read ID is passed to seed(read_id), the call shown in the traceback above. These are snippets of the output:
9ab59979-3afb-4e3b-a184-056572922db4 <class 'uuid.UUID'>
432a940c-0168-4a63-a1dc-4c509e4a6d3d <class 'uuid.UUID'>
8506dcc0-4a95-4a5c-b21b-761e6a51f1a2 <class 'uuid.UUID'>
9c6599a0-b6eb-486f-93ae-1677e2062ce8 <class 'uuid.UUID'>
So in my pod5 files the read_id is an instance of uuid.UUID, which is not one of the seed types listed in the error message.
Can I just remove the read_id argument and let seed pick its own seed, i.e. call seed() with no argument?
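For what it's worth, calling seed() with no argument would avoid the exception, but the random draws would then differ from run to run. If the per-read seed is there to keep results reproducible (an assumption, I have not checked what the seeded generator is used for downstream), a smaller change would be to convert the UUID to a string before seeding, since str is one of the types listed in the error message. A minimal sketch; seed_from_read_id is a hypothetical helper, not part of duplex_tools:

```python
import random
import uuid


def seed_from_read_id(rng, read_id):
    """Seed rng from a read ID that may be a uuid.UUID or a str.

    Hypothetical helper for illustration only, not duplex_tools code.
    str() gives the canonical hyphenated form of a UUID and leaves a
    str unchanged, so both spellings of the same ID give the same seed.
    """
    rng.seed(str(read_id))


# Read ID copied from the output earlier in this thread.
read_id = uuid.UUID("9ab59979-3afb-4e3b-a184-056572922db4")
rng = random.Random()

# Seeding with the UUID object directly is what raises TypeError on 3.11.
try:
    rng.seed(read_id)
    uuid_seed_accepted = True
except TypeError:
    uuid_seed_accepted = False

# Seeding with the string form works on every version and is repeatable.
seed_from_read_id(rng, read_id)
first_draw = rng.random()
seed_from_read_id(rng, read_id)
second_draw = rng.random()

# The same ID supplied as a plain string gives the same stream.
seed_from_read_id(rng, "9ab59979-3afb-4e3b-a184-056572922db4")
draw_from_str = rng.random()
```

In split_pairs_steps.py the equivalent one-line change would presumably be rd.seed(str(read.read_id)) on the line shown in the traceback. Note that the draws will not match what Python 3.10 produced for the bare UUID, because the string is seeded differently from the UUID object.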