Metadata editing in pod5 files
Hi, we ran into an error during dorado basecalling (of files recovered after a failed run) due to an incorrect sequencing kit:
[2023-12-15 14:35:04.358] [error] Unknown sequencing_kit: FLO-PRO114M
So then we checked our pod5 files and saw the following:
flow_cell_product_code: FLO-PRO114M
sequencing_kit: FLO-PRO114M
Is it possible to edit the incorrect sequencing kit somehow?
Hi @DCossey, yes, it is possible to fix your metadata, although it's not particularly clean because pod5 files are immutable.
There is a short part of the documentation relating to this, but here's a snippet more tailored to your issue. You need to edit the RunInfo of each read.
import pod5

# New output file for edited data
with pod5.Writer("output.pod5") as writer:
    # Read all records
    with pod5.Reader("input.pod5") as reader:
        # Iterate over immutable ReadRecords
        for record in reader:
            # Convert to mutable Read
            read = record.to_read()
            # Edit the value
            read.run_info.sequencing_kit = "sequencing_kit_here"
            # Write the edited read
            writer.add_read(read)
Kind regards, Rich
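To confirm that the rewrite took effect, one option is to tally the sequencing_kit value over every read in the new file. The following is only a sketch: the helper names count_sequencing_kits and count_kits_in_file are hypothetical (not part of pod5), and it assumes each record exposes run_info.sequencing_kit and that the Reader can be iterated directly, as in the snippet above.

```python
from collections import Counter
from types import SimpleNamespace


def count_sequencing_kits(records):
    """Tally the sequencing_kit value across an iterable of read records."""
    return Counter(record.run_info.sequencing_kit for record in records)


def count_kits_in_file(path):
    """Open a pod5 file and tally sequencing_kit over all of its reads."""
    # Imported lazily so the tally helper above can be tried without pod5.
    import pod5

    with pod5.Reader(path) as reader:
        return count_sequencing_kits(reader)


if __name__ == "__main__":
    # Stand-in records (not real pod5 objects) to show the shape of the result.
    fake_records = [
        SimpleNamespace(run_info=SimpleNamespace(sequencing_kit="kit_a")),
        SimpleNamespace(run_info=SimpleNamespace(sequencing_kit="kit_a")),
        SimpleNamespace(run_info=SimpleNamespace(sequencing_kit="kit_b")),
    ]
    print(count_sequencing_kits(fake_records))
```

On a real file, count_kits_in_file("output.pod5") should report a single key (the new kit name) if every read was edited.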
I followed the example code and revised a pod5 file successfully. However, when I try to check the content using pod5 view, I get this error (the same error if I try to view an untouched pod5):
POD5 has encountered an error: 'Error while processing "output2.pod5''
For detailed information set POD5_DEBUG=1'
What command are you running?
pod5 view "output2.pod5"
Can you try without the quotes please?
Tried and still the same error.
$ pod5 view original.pod5
read_id filename read_number channel mux end_reason start_time start_sample duration num_samples minknow_events sample_rate median_before predicted_scaling_scale predicted_scaling_shift tracked_scaling_scale tracked_scaling_shift num_reads_since_mux_change time_since_mux_change run_id sample_id experiment_id flow_cell_id pore_type
POD5 has encountered an error: 'Error while processing 'original.pod5''
For detailed information set POD5_DEBUG=1'
Can you run the following then:
pod5 --version
POD5_DEBUG=1 pod5 view output2.pod5
And then share the contents of the pod5 .log files that are generated?
Ah, this could be a new issue from polars==0.20.
Can you please ensure you're using polars==0.19?
If not, please re-install polars with pip install -U polars~=0.19
Yes, my polars was 0.20. Note that I had to run pip install -U polars==0.19; with ~=0.19, pip said "requirement already satisfied". That fixed the error. Thanks!
Fantastic, sorry about that last issue. We're patching this as we speak.
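A note on the "requirement already satisfied" message reported above: it is consistent with how the ~= (compatible release) operator is defined. ~=0.19 means ">= 0.19 and == 0.*", so an installed 0.20 already satisfies it, whereas ~=0.19.0 or ==0.19.* would restrict the install to the 0.19 series. The sketch below approximates that rule for plain numeric versions only; the function names are hypothetical, and it is not pip's actual implementation (pre-releases, epochs, and similar cases are ignored).

```python
def parse_release(version):
    """Turn '0.19.3' into (0, 19, 3); only plain numeric releases are handled."""
    return tuple(int(part) for part in version.split("."))


def satisfies_compatible_release(installed, spec):
    """Approximate the `~=spec` check for plain numeric versions.

    `~=X.Y` means `>= X.Y` and `== X.*`;
    `~=X.Y.Z` means `>= X.Y.Z` and `== X.Y.*`.
    """
    inst = parse_release(installed)
    want = parse_release(spec)
    if len(want) < 2:
        raise ValueError("~= needs at least two release segments")
    # Everything except the last segment of the spec must match exactly.
    prefix = want[:-1]
    # Pad with zeros so that 0.19 compares equal to 0.19.0.
    width = max(len(inst), len(want))
    inst_padded = inst + (0,) * (width - len(inst))
    want_padded = want + (0,) * (width - len(want))
    return inst_padded >= want_padded and inst_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    print(satisfies_compatible_release("0.20.0", "0.19"))    # True
    print(satisfies_compatible_release("0.20.0", "0.19.0"))  # False
    print(satisfies_compatible_release("0.19.5", "0.19.0"))  # True
```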