Benjamin Buchfink

445 comments by Benjamin Buchfink

The CIGAR field is bugged for `--unal 1`. I have just committed a patch that should fix this.

You should use the `--long-reads` option for long input sequences. Still, this has only been tested for sequences up to the length of bacterial chromosomes. I don't think Diamond will currently...

Sorry, it looks like this has escaped me. Yes, conda has the latest version of Diamond.

`--long-reads` is a shortcut for `--range-culling`, `-F 15` and `--top 10`.
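To make the equivalence concrete, here is a minimal Python sketch that builds both forms of the command line. The function name `build_blastx_command` and the file names are hypothetical and only serve as illustration; the sketch assembles the argument list and prints it, it does not run Diamond.

```python
import shlex


def build_blastx_command(db, query, out, use_shortcut=True):
    """Assemble a `diamond blastx` argument list (hypothetical helper).

    With use_shortcut=True the `--long-reads` shortcut is used; otherwise
    the three options it stands for are spelled out individually.
    """
    cmd = ["diamond", "blastx", "-d", db, "-q", query, "-o", out]
    if use_shortcut:
        cmd.append("--long-reads")
    else:
        cmd += ["--range-culling", "-F", "15", "--top", "10"]
    return cmd


# File names below are placeholders, not taken from any real run.
short_form = build_blastx_command("ref_db", "reads.fasta", "hits.tsv")
long_form = build_blastx_command("ref_db", "reads.fasta", "hits.tsv",
                                 use_shortcut=False)
print(shlex.join(short_form))
print(shlex.join(long_form))
```

Spelling the options out can be useful if you want to change one of them, for example a different `--top` value, while keeping the others.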

How many sequences do your files contain?

You may want to try a smaller block size. Otherwise, I'm not sure why this might crash and need to run some tests, but this may take time.

The memory use should only depend on block size, not file size. You can try to further reduce the block size. Another option to reduce memory use is `--bin`, for...

Running blastx on contigs is a different story. Unfortunately, the current implementation can't handle very long queries well. Using the frameshift mode (`-F 15`) should work better in these cases.

There seems to be a `.` in your sequence:

```
>g2497.t1 gene=g2497
MNIYTSPTRTPNIAPKSGQRPSLPMLATDERSTDKESPNEDREFVPCSSLDVRRIYPKGPLLVLPEKIYL
YSEPTVKELLPFDVVINVAEEANDLRMQVPAVEYHHYRWEHDSQIALDLPSLTSIIHAATTKREKILIHC
QCGLSRSATLIIAYIMKYHNLSLRHSYDLLKSRADKINPSIGLIFQLMEWEVALNAKTNVQANSYRKKRS
LSSYLSNVSTRREELEKISKQETSEEEDTAGKHEQRETLSEEVSDKFPENVASFRSQTTSVHQATQNNLN
AKESEDLAHKNDASSHEGEVNGDSRPDDVPETNEKISQAIRAKISSSSSSPNVRNVDIQNHQPFSRDQLR
AMLKEPKRKTVDDFIEEEGLGAVEEEDLSDEVLEKNTTEPENVEKDIEYSDSDKDTDDVGSDDPTAPNSP
IKLGRRKLVRGDQLDATTSSMFNNESDSELSDIDDSKNIALSSSLFRGGSSPVKETNNNLSNMNSSPAQN
PKRGSVSRSNDSNKSSHIAVSKRPKQKKGIYRDSGGRTRLQIACDKGKYDVVKKMIEEGGYDINDQDNAG
NTALHEAALQGHIEIVELLIENGADVNIKSIEMFGDTPLIDASANGHLDVVKYLLKNGADPTIRNAKGLT
AFESVDDESEFDDEEDQKILREIKKRLSIAAKKWTNRAGIHNDKSKNGNNAHTIDQPPFDNTTKAKNEKA
ADSPSMASNIDEKAPEEEFYWTDVTSRAGKEKLFKASKEGHLPYVGTYVENGGKIDLRSFFESVKCGHED
ITSIFLAFGFPVNQTSRDNKTSALMVAVGRGHLGTVKLLLEAGADPTKRDKKGRTALYYAKNSIMGITNS
EEIQLIENAINNYLKKHSEDNNDDDDDDDNNNETYKHEKKREKTQSPILASRRSATPRIEDEEDDTRMLN
LADDDFNNDRDVKESTTSDSRKRLDDNENVGTQYSLDWKKRKTNALQDEEKLKSISPLSMEPHSPKKAKS
VEISKIHEETAAEREARLKEEEEYRKKRLEKKRKKEQELLQKLAEDEKKRIEEQEKQKVLEMERLEKATL
EKARKMEREKEMEEISYRRAVRDLYPLGLKIINFNDKLDYKRFLPLYYFVDEKNDKFVLDLQVMILLKDI
DLLSKDNQPTSEKIPVDPSHLTPLWNMLKFIFLYGGSYDDKKNNMENKRYVVNFDGVDLDTKIGYELLEY
KKFVSLPMAWIKWDNVVIENHAKRKEIEGNMIQISINEFARWRNDKLNKAQQPTRKQRSLKIPRELPVKF...
```

A `*` should already be ignored or treated as a stop. I'm not aware that a `.` is also used to encode a stop. An option to ignore certain characters...
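Until such characters are handled on the Diamond side, a quick pre-check of the input can locate them. The following is a minimal Python sketch, not part of Diamond: the function `find_unexpected_chars` and the `ALLOWED` alphabet are assumptions made for illustration (20 standard residues, common ambiguity codes, and `*`), and they do not describe Diamond's actual parsing rules.

```python
# Assumed alphabet: 20 standard amino acids, common ambiguity/rare codes,
# and `*`. This is an illustrative choice, not Diamond's own rule.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY" "BJOUXZ" "*")


def find_unexpected_chars(fasta_text, allowed=ALLOWED):
    """Return (record_id, 1-based position, character) for every
    sequence character that is not in the allowed alphabet."""
    hits = []
    record_id = None
    position = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record id is the first whitespace-delimited token.
            parts = line[1:].split()
            record_id = parts[0] if parts else ""
            position = 0
            continue
        for char in line:
            position += 1
            if char.upper() not in allowed:
                hits.append((record_id, position, char))
    return hits


# Made-up record, used only to demonstrate the check.
example = ">seq1 hypothetical record\nMKT.LV*\nAAG\n"
for record_id, position, char in find_unexpected_chars(example):
    print(f"{record_id}: unexpected {char!r} at position {position}")
```

Positions are counted across line breaks within a record, so they refer to the residue index in the full sequence rather than to a column in the file.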