pod5-file-format
Discrepancy Between Parsed Read IDs and Calculated Transfers in pod5 filter Command
Description: I am using the following command to filter data from .pod5 files based on read IDs in a text file:
pod5 filter folder1 folder2 --output filtered.pod5 --ids read_ids.txt
When I run the command, I observe the following output:
Parsed 9240344 read_ids from: read_ids.txt
Found 25678389 read_ids from 230 inputs
Calculated 9215377 transfers
I am trying to understand why the number of calculated transfers (9215377) is lower than the number of parsed read_ids (9240344), a difference of 24967 reads.

Workflow:
Nanopore Sequencing: I have used nanopore sequencing to generate .pod5 and .fastq files.
Extracted Read IDs: I have extracted the read IDs from the .fastq file.
Filtering: I am using these extracted read IDs to filter data from the .pod5 files using the pod5 filter command.
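To help narrow this down, here is a minimal sketch of how the two ID sets could be compared outside of pod5. It assumes the requested IDs (from read_ids.txt) and the IDs present in the .pod5 inputs are already available as plain lists of strings; the function name `summarize_id_mismatch` and the toy data are hypothetical and not part of the pod5 package. It separates two possible causes of the gap: IDs listed more than once in the text file, and IDs that do not occur in any input file. Whether pod5 filter counts duplicates in its "Parsed" figure is not something I have confirmed.

```python
from collections import Counter


def summarize_id_mismatch(requested_ids, available_ids):
    """Compare requested read IDs against the IDs present in the inputs.

    Hypothetical helper for illustration; not part of the pod5 API.
    """
    # Drop blank lines and surrounding whitespace, as a text file may contain both.
    requested = [r.strip() for r in requested_ids if r.strip()]
    counts = Counter(requested)
    unique_requested = set(counts)
    available = set(available_ids)

    return {
        "requested_lines": len(requested),
        "unique_requested": len(unique_requested),
        # IDs that appear more than once in the request list.
        "duplicated_ids": sorted(r for r, n in counts.items() if n > 1),
        # IDs requested but absent from every input file.
        "missing_ids": sorted(unique_requested - available),
        # IDs that could actually be transferred.
        "matched": len(unique_requested & available),
    }


# Toy data standing in for read_ids.txt and the IDs found in the inputs.
report = summarize_id_mismatch(["a", "b", "b", "c", "x"], ["a", "b", "c", "d"])
print(report)
```

If `matched` equals the "Calculated ... transfers" figure on the real data, the remaining IDs are either duplicates in the text file or reads that are not present in folder1 or folder2.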
Specifications
- Pod5 Version: 0.3.23
- Python Version: 3.11.5