pod5-file-format
Pyarrow error
Hey George,
I have a user getting a strange error. I've attached the issue below, where you can also see some more context.
Any ideas what the issue here might be?
Cheers James
Dear James,
Thank you for the update. I just tried the newer version and I am getting an error related to the pyarrow package. Trace:
len(batch.signal[batch_row_index].as_buffer()),
AttributeError: 'pyarrow.lib.LargeListScalar' object has no attribute 'as_buffer'
Originally posted by @lborcard in https://github.com/Psy-Fer/blue-crab/issues/12#issuecomment-2208307232
Based on the error, I suspect the file is uncompressed (and hitting an unaccounted-for error)... I'm not sure how it's possible to end up with an uncompressed file. How were the files created?
I'll keep digging on my side.
If I may intervene, I am the user with the error. The pod5 files were generated using Icarust (https://github.com/LooseLab/Icarust). They are compatible with dorado (I used it to basecall them).
OK, I'm not familiar with how Icarust writes pod5 files, but I've finished investigating the pod5 source and found that the error is caused by a bug in how the Python pod5 bindings handle uncompressed pod5 files.
I have a fix internally that will resolve the issue, and I'll get it out asap.
- George
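For readers hitting the same traceback, here is a minimal sketch of why `as_buffer` can be missing. It assumes that compressed signal is stored as a binary-like Arrow value while uncompressed signal is stored as a list of integers; that layout is inferred from the error message, not taken from the pod5 source. The helper name `signal_length` is hypothetical and this is not the fix shipped in pod5.

```python
def signal_length(scalar):
    """Return (unit, size) for one signal cell, whatever Arrow scalar holds it.

    Hypothetical helper for illustration only. Note the units differ:
    binary-like scalars report bytes, list-like scalars report samples.
    """
    # Binary-like scalars (e.g. pyarrow LargeBinaryScalar) expose as_buffer().
    if hasattr(scalar, "as_buffer"):
        return ("bytes", len(scalar.as_buffer()))
    # List-like scalars (e.g. pyarrow LargeListScalar) expose .values instead.
    if hasattr(scalar, "values"):
        return ("samples", len(scalar.values))
    # Extension scalars wrap a storage scalar in .value; unwrap and retry.
    if hasattr(scalar, "value"):
        return signal_length(scalar.value)
    raise TypeError(f"unsupported signal scalar: {type(scalar).__name__}")


def demo():
    # Imported lazily so the helper above has no hard dependency on pyarrow.
    import pyarrow as pa

    # Stand-ins for a compressed and an uncompressed signal cell (assumed layout).
    compressed_like = pa.scalar(b"\x01\x02\x03\x04", type=pa.large_binary())
    uncompressed_like = pa.scalar([10, 11, 12], type=pa.large_list(pa.int16()))

    print(type(compressed_like).__name__, signal_length(compressed_like))
    print(type(uncompressed_like).__name__, signal_length(uncompressed_like))
```

Calling `demo()` with pyarrow installed shows the two scalar types side by side. Code that calls `as_buffer()` unconditionally works only for the first kind, which matches the `AttributeError` reported above.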
This makes me ask the obvious question as well: is pore_type still not used by nanopore software?
I was under the impression that MinKNOW had started using it. Is this something Icarust has decided to use but that is not actually a field in use yet?
No, sequencing runs on the current MinKNOW software do not set the pore type.
Hmmm okay. Thanks.
Ahh okay - @Psy-Fer I'm happy to change the Icarust code to set the Pore Type to "not-set" if that would be useful.
Please make it specifically not_set, with an underscore, to match the current pod5 output.
Feel free to use the test scripts in blue-crab as boilerplate to test if your files are correct.
I'll leave the R10.4.1 exception for pore_type in place so that users of older versions of Icarust can still convert their files if they like.
James
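As a rough illustration of the check being discussed, the sketch below accepts `not_set`, tolerates the legacy `R10.4.1` value, and rejects anything else, including the hyphenated `not-set`. The function and constant names are hypothetical, and the assumption that older Icarust files carry `R10.4.1` as their pore type is inferred from this thread. This is not blue-crab's actual code.

```python
# Hypothetical names; the accepted values are taken from the discussion above.
ACCEPTED_PORE_TYPES = {"not_set"}
LEGACY_PORE_TYPES = {"R10.4.1"}  # assumed to be what older Icarust files contain


def check_pore_type(pore_type: str) -> str:
    """Classify a pore_type value as 'ok' or 'legacy', or raise ValueError."""
    if pore_type in ACCEPTED_PORE_TYPES:
        return "ok"
    if pore_type in LEGACY_PORE_TYPES:
        return "legacy"
    raise ValueError(
        f"unexpected pore_type {pore_type!r}; expected 'not_set' (with an underscore)"
    )


if __name__ == "__main__":
    for value in ("not_set", "R10.4.1", "not-set"):
        try:
            print(value, "->", check_pore_type(value))
        except ValueError as err:
            print(value, "->", err)
```

The comparison is an exact string match, which is why the underscore matters: `not-set` does not pass.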
I'm in the process of deploying 0.3.12, which contains a fix for the issue of opening raw data from uncompressed pod5 files.
Thanks,
- George
Thanks George.
Hello,
I am getting a similar error to the original poster with an uncompressed pod5 file using pod5 version 0.3.23: POD5 has encountered an error: ''pyarrow.lib.ExtensionScalar' object has no attribute 'as_buffer'
Best,
Richard