TALON using existing tmp files

Open mei-du opened this issue 2 years ago • 2 comments

Hi, I've been having a lot of difficulty running TALON successfully on my very large dataset. After 3 days, I had all the tmp files except abundance_tuples.tsv. Another whole day passed with no visible changes to the tmp files or any other output files, so I'm not sure whether the jobs have actually finished.

So far I've altered talon.py to keep and reuse interval_files, which saves ~8 hours. I also used the existing tmp files to update the database. That run proceeded to writing the output reads file and started writing into reads.tmp, but I eventually ran into an error where a read's strand value was 'None'. The same error occurred with talon_fetch_reads.

I was actually able to run all the post-processing filtering steps on the manually updated database, since I am after the abundance file and the GTF, but once again I'm not sure whether the annotation jobs actually completed. Could you offer any insights or solutions, such as a way to make use of existing tmp files?

Any help would be greatly appreciated, thanks!

Nov 01 '21 03:11 mei-du

Just checked my reference GTF and it doesn't have missing/'None' strand values, and I don't have the same problem as in #65. There also don't seem to be any missing strand values when looking in DB Browser, but I will keep looking.
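For anyone wanting to do the same check programmatically rather than by eye, here is a minimal sketch. It assumes only that the TALON database is a SQLite file; it does not assume any particular table names, and instead scans every table that happens to have a column named `strand`. The function name `find_missing_strands` is hypothetical and not part of TALON.

```python
import sqlite3


def find_missing_strands(db_path):
    """Count rows with a missing strand in every table that has a 'strand' column.

    A strand counts as missing if it is NULL, an empty string, or the
    literal text 'None'. Returns a dict mapping table name to row count.
    Assumption: db_path points to a SQLite database file.
    """
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        counts = {}
        for table in tables:
            # PRAGMA table_info returns one row per column; index 1 is the name.
            columns = [row[1] for row in conn.execute(
                f'PRAGMA table_info("{table}")')]
            if "strand" not in columns:
                continue
            counts[table] = conn.execute(
                f'SELECT COUNT(*) FROM "{table}" '
                "WHERE strand IS NULL OR strand IN ('', 'None')"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

A non-zero count for any table would point to where the 'None' strand is coming from. A result of all zeros would suggest the value is introduced elsewhere, for example in the tmp files rather than in the database itself.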

Nov 01 '21 05:11 mei-du

Hi there,

I have been running TALON on large datasets recently and I am usually able to get around these issues by running it on subsets of the data in succession.
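One way to set up such successive runs is sketched below. This is an illustration under assumptions, not a description of the exact workflow used above: it assumes the TALON config is a header-less CSV with one dataset per row, and that "subsets" means groups of datasets. The helper `split_config` and the output file names are hypothetical. It only writes the smaller config files; each one would then be passed to its own TALON run against the same database, one after another.

```python
import csv
from pathlib import Path


def split_config(config_path, out_dir, batch_size):
    """Split a config CSV into smaller files of at most batch_size rows each.

    Assumption: the config has no header row and one dataset per row.
    Returns the list of written file paths, in run order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    with open(config_path, newline="") as handle:
        rows = [row for row in csv.reader(handle) if row]  # skip blank lines

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for start in range(0, len(rows), batch_size):
        batch_number = start // batch_size + 1
        path = out_dir / f"config_batch_{batch_number}.csv"
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows[start:start + batch_size])
        paths.append(path)
    return paths
```

Running the batches in order keeps each run small, so a failure only costs the time spent on the current batch rather than the whole dataset.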

Dec 03 '21 05:12 fairliereese
