Jon Palmer comments

Results: 423 comments of Jon Palmer

Possible IO performance issue

This is with a smallish test set -- but should print timings of the steps in the log file. @Bjoernsen if you are able to test and see if different/better...

Possible IO performance issue

I tested the ThreadPool method -- it was slower because (at least on my Mac) it doesn't use full cpus as it keeps the processes as threads, so you end...

Possible IO performance issue

Ran a test with 6 cpus with uniprot and a fungal genome (this is on my Mac with SSD hard drive): ``` $ time funannotate util prot2genome -g 24266-2.final.fasta -p...

Possible IO performance issue

Sure, that would be helpful. I think it's possible to multithread the fasta file generation, but would need to split the scaffold into one per file and then split all...

Possible IO performance issue

I mean the diamond/tblastn step. I set a rather low evalue threshold but could do some post filtering, i.e. in the run above it ran exonerate 250k times to find...
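A rough sketch of what such post filtering could look like, not funannotate's actual code. It assumes the preliminary hits are in the standard 12-column tabular format (outfmt 6: query, subject, ..., evalue in column 11, bitscore in column 12). The function name `filter_hits`, the `max_evalue` cutoff, and the `max_hits_per_query` cap are hypothetical values chosen for illustration.

```python
import csv
import io
from collections import defaultdict


def filter_hits(tabular_text, max_evalue=1e-10, max_hits_per_query=1):
    """Drop weak preliminary hits before handing them to a slower aligner.

    Hypothetical helper: keeps, per query protein, only the top-scoring
    hits whose evalue is at or below max_evalue.
    """
    by_query = defaultdict(list)
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        evalue = float(row[10])
        bitscore = float(row[11])
        if evalue <= max_evalue:
            by_query[row[0]].append((bitscore, row))

    kept = []
    for hits in by_query.values():
        # Highest bitscore first, then keep only the allowed number of hits.
        hits.sort(key=lambda hit: hit[0], reverse=True)
        kept.extend(row for _, row in hits[:max_hits_per_query])
    return kept
```

Each row that survives would correspond to one downstream alignment job, so tightening either parameter directly reduces the number of jobs.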

Possible IO performance issue

~300k prelim alignments took > 1.5 hours to just parse the input fasta files on a different (cloud) filesystem today (and only 30 min to run exonerate with 24 cpus)....

Possible IO performance issue

Okay, this should now write the fasta files for exonerate in parallel processes. I've also exposed a --tmpdir option in `funannotate util prot2genome` and in `funannotate predict` that should...
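For readers unfamiliar with the pattern, here is a minimal sketch of writing FASTA files from a pool of worker processes into a chosen temporary directory. It is an illustration under stated assumptions, not the code in funannotate: the function names, the index-based file naming, and the fallback to `tempfile.mkdtemp()` when no directory is given are all hypothetical.

```python
import os
import tempfile
from multiprocessing import Pool


def write_fasta(job):
    """Write one (path, header, sequence) job to disk and return its path."""
    path, header, seq = job
    with open(path, "w") as handle:
        handle.write(f">{header}\n")
        # Wrap the sequence at 80 characters per line.
        for start in range(0, len(seq), 80):
            handle.write(seq[start:start + 80] + "\n")
    return path


def write_fastas_parallel(records, tmpdir=None, cpus=1):
    """Write each (header, sequence) record to its own file.

    Every job gets a unique, index-based filename, so workers never
    write to the same path. With cpus <= 1 the work runs serially.
    """
    if tmpdir is None:
        tmpdir = tempfile.mkdtemp()  # assumption: fall back to a fresh temp dir
    os.makedirs(tmpdir, exist_ok=True)
    jobs = [
        (os.path.join(tmpdir, f"{index}.fasta"), header, seq)
        for index, (header, seq) in enumerate(records)
    ]
    if cpus <= 1:
        return [write_fasta(job) for job in jobs]
    with Pool(cpus) as pool:
        return pool.map(write_fasta, jobs)
```

Calling `write_fastas_parallel(records, tmpdir="/path/on/fast/disk", cpus=6)` would place the files on whichever filesystem the caller picks, which is the point of making the directory configurable when the default location is slow. When run as a script, the call with `cpus > 1` should sit under an `if __name__ == "__main__":` guard.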

Possible IO performance issue

This is sufficiently fast for me now (and fixed the race/conflict/unstable code).

Busco scores of predicted proteins much lower than busco of genome

Are you running BUSCO 5.0 with metaeuk or Augustus? I'm not sure the metaeuk method is generating complete/proper gene models, at least metaeuk doesn't do this on its own. Note...

Busco scores of predicted proteins much lower than busco of genome

Is the number of genes being called by funannotate reasonable or is it low?