Jason Stajich

Results 165 comments of Jason Stajich

Okay, looking deeper: keeping the whole scaffold in memory might be excessive - this might be better if we use one of the DBIndex tools to retrieve the slice -...

Before we try to fix it, I think we need to know if this is high IO because exonerate is being run 1000 times and reopening a bunch of files (e.g. 1000...

Both of those sound smart, threading and the index - it might be good to see if we can figure out how to test some benchmarking in there. But maybe...

Very nice - do you want me to post some test numbers from CentOS and see? I can also test with SSD tmpdir vs network tmpdir.

You can set a --score option for filtering, but it doesn't compute an e-value. There's a '--best' or '--bestn' option too, but it may add computational cost, I am not sure -...

There may be some options around Diamond doing realignment of the HSPs, but I'm not sure; maybe computing a total % aligned of the query might also help speed things up (make...

It sounds like you need to run singularity commands on your system - usually you need to prefix any commands with `singularity exec $path 'funannotate'`. Here are examples from our system:...

Those locale setting variables look like they are a problem for Perl runs. Set the LANGUAGE and LC_ALL variables to en_US and test that perl alone works, and not sure if using...
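
A rough sketch of that check, assuming `perl` is on the PATH. The helper names `locale_env` and `perl_locale_ok` are made up for illustration, and the `.UTF-8` suffix is an assumption (the comment only says en_US); adjust to whatever `locale -a` lists on your system.

```python
import os
import subprocess


def locale_env(base=None, locale_name="en_US.UTF-8"):
    """Return a copy of the environment with LANGUAGE and LC_ALL overridden.

    locale_name defaults to en_US.UTF-8, which is an assumption; the
    locale must actually be installed on the machine.
    """
    env = dict(os.environ if base is None else base)
    env["LANGUAGE"] = locale_name
    env["LC_ALL"] = locale_name
    return env


def perl_locale_ok(env):
    """Run a trivial perl one-liner and report whether perl complained.

    Perl writes "Setting locale failed" to stderr when it cannot honour
    the locale variables it was given.
    """
    result = subprocess.run(
        ["perl", "-e", "exit 0"],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and "Setting locale failed" not in result.stderr
```

Calling `perl_locale_ok(locale_env())` should return `True` once the variables point at an installed locale; if it still returns `False`, the locale itself is probably missing rather than the variables being wrong.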

Yeah, it is a short Python/Perl script to get the longest from the protein file, e.g. here's a Perl script that works for Ensembl-formatted headers: https://github.com/hyphaltip/genome-scripts/blob/master/seq/get_longest_peptide.pl
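
For the Python route, a minimal sketch might look like the following. It is not a port of the linked Perl script; it assumes Ensembl-style peptide headers that carry a `gene:<ID>` field, and falls back to the first word of the header when that field is missing. The function names are illustrative only.

```python
import re

# Ensembl peptide headers include a "gene:<ID>" field; fields such as
# "gene_biotype:" are not matched because of the colon right after "gene".
GENE_RE = re.compile(r"\bgene:(\S+)")


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def longest_per_gene(records):
    """Keep the longest peptide per gene; ties keep the first one seen."""
    best = {}
    for header, seq in records:
        match = GENE_RE.search(header)
        gene = match.group(1) if match else header.split()[0]
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (header, seq)
    return best


def write_fasta(best, out):
    """Write the selected records back out, one sequence per line."""
    for header, seq in best.values():
        out.write(f">{header}\n{seq}\n")
```

Typical use would be to open the protein file, pass it through `longest_per_gene(read_fasta(fh))`, and hand the result to `write_fasta` with `sys.stdout` or an output file.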

If there aren't any -T2 then the duplications you see in the gene trees relate to gene-level duplications, not isoforms. But as to how this is detected, it really depends...