Script generates an empty file.
Hi, I need to process some FASTA files for synteny analysis. I used seqkit, but it generated empty files. The script is this:
for i in *.fasta
do
base=$(basename $i ".fasta")
#Rename sequences
seqkit replace -p "(.*)" --replacement "${base}CHR" ${base}.fasta > ${base}-ren.fasta
head ${base}-ren.fasta
#Sort by length
seqkit sort -l -r ${base}-ren.fasta > ${base}-ren.fasta
#Filter by length min 1M
seqkit seq -m 1000000 -g ${base}-ren.fasta > ${base}-1M.fasta
# Check stats
seqkit stats ${base}-1M.fasta
done
I saw it generate some files at first, but then the first one became empty.
3198678242 Jun 13 15:51 LWed.fasta
2450740935 Jun 13 16:00 NSch.fasta
2460756707 Jul 11 10:43 MAng.fasta
2495284749 Jul 14 14:27 HLep.fasta
2485064101 Sep 22 12:17 HL2.fasta
1323 Sep 22 17:48 chromsync-prep.sh.save
1463 Sep 22 18:01 chromsync-prep.sh
0 Sep 22 18:05 HL2-ren.fasta
0 Sep 22 18:05 HL2-1M.fasta
0 Sep 22 18:06 HLep-ren.fasta
0 Sep 22 18:06 HLep-1M.fasta
0 Sep 22 18:07 LWed-ren.fasta
0 Sep 22 18:07 LWed-1M.fasta
0 Sep 22 18:08 MAng-ren.fasta
0 Sep 22 18:08 MAng-1M.fasta
0 Sep 22 18:09 NSch-ren.fasta
0 Sep 22 18:09 NSch-1M.fasta
Any reason for this? Is it possible to pipe the different commands to generate the sorted/filtered/renamed files?
Thanks.
Here, you overwrite the input files. The output name should be different from the input.
seqkit sort -l -r ${base}-ren.fasta > ${base}-ren.fasta
If you use the -o option instead, it will be OK for this command (but not for others; I'll add input/output file checking).
seqkit sort -l -r ${base}-ren.fasta -o ${base}-ren.fasta
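For anyone wondering why the redirected version ends up empty: with `cmd file > file`, the shell opens and truncates `file` for the redirection before the command starts reading it. The following is a minimal sketch of that behaviour using coreutils `sort` as a stand-in for seqkit; the file name `demo.txt` and the temporary directory are made up for illustration.

```shell
# Work in a throwaway directory (hypothetical file names, for illustration only).
tmpdir=$(mktemp -d)
printf 'b\na\n' > "$tmpdir/demo.txt"

# Redirecting onto the input: the shell truncates demo.txt before sort reads it.
sort "$tmpdir/demo.txt" > "$tmpdir/demo.txt"
size_after_redirect=$(( $(wc -c < "$tmpdir/demo.txt") ))
echo "after 'sort f > f': $size_after_redirect bytes"

# Safer pattern: write to a temporary name, then move it over the original.
printf 'b\na\n' > "$tmpdir/demo.txt"
sort "$tmpdir/demo.txt" > "$tmpdir/demo.txt.tmp" \
  && mv "$tmpdir/demo.txt.tmp" "$tmpdir/demo.txt"
size_after_tmp=$(( $(wc -c < "$tmpdir/demo.txt") ))
echo "after temp file + mv: $size_after_tmp bytes"
```

The temp-file-then-`mv` pattern applies to any command in the script whose output should replace its input.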
Can I pipe (|) any of those commands together, such as rename and sort, or sort and filter?
Juan Pablo Aguilar
Yes, you can. Just give it a try.
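A piped version of the loop might look like the sketch below. It reuses the commands and flags from the original script unchanged and relies on seqkit reading from standard input when no input file is given. The intermediate `-ren.fasta` file is dropped, so nothing is written over its own input. The `[ -e "$i" ]` guard and the quoting are additions, not part of the original script.

```shell
for i in *.fasta
do
    # Skip the literal pattern when no .fasta files are present.
    [ -e "$i" ] || continue
    base=$(basename "$i" .fasta)

    # Rename -> sort by length (descending) -> keep sequences >= 1 Mb, removing gaps.
    seqkit replace -p "(.*)" --replacement "${base}CHR" "$i" \
        | seqkit sort -l -r \
        | seqkit seq -m 1000000 -g -o "${base}-1M.fasta"

    # Check stats
    seqkit stats "${base}-1M.fasta"
done
```

Note that `*.fasta` also matches the `-1M.fasta` outputs of an earlier run, so it may be worth writing results to a separate directory or using a different output suffix.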
Input/output file checking has been added. It supports normal paths, relative paths, and symbolic links.
$ ls -lh
total 4.0K
lrwxrwxrwx 1 shenwei shenwei 7 Sep 26 23:43 symbol_link_to_test.fa -> test.fa
-rw-r--r-- 1 shenwei shenwei 11 Sep 26 23:44 test.fa
# same file name
$ seqkit seq test.fa -o test.fa
[ERRO] input and output files cannot be the same
# soft link
$ seqkit seq test.fa -o symbol_link_to_test.fa
[ERRO] input and output files cannot be the same
# relative path
$ seqkit seq test.fa -o dir/../test.fa
[ERRO] input and output files cannot be the same
# valid
$ seqkit seq test.fa -o out.fa
$
Please give it a try.
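For scripts that call tools without such a check, a similar guard can be sketched in the shell with the `-ef` file test, which is true when two paths refer to the same file (it follows symbolic links and resolves relative paths). This is only an illustration, not how seqkit implements its check. The helper name `same_file` is hypothetical, and the file names mirror the example above.

```shell
# Hypothetical helper: succeeds when both paths exist and point to the same file.
same_file() {
    [ -e "$1" ] && [ -e "$2" ] && [ "$1" -ef "$2" ]
}

# Recreate the layout from the example in a throwaway directory.
workdir=$(mktemp -d)
printf '>s\nACGT\n' > "$workdir/test.fa"
ln -s test.fa "$workdir/symbol_link_to_test.fa"
mkdir "$workdir/dir"

for out in test.fa symbol_link_to_test.fa dir/../test.fa out.fa
do
    if same_file "$workdir/test.fa" "$workdir/$out"; then
        echo "$out: input and output are the same file"
    else
        echo "$out: ok"
    fi
done
```

Only `out.fa` should be reported as ok, matching the four cases shown above.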