
Script generates an empty file.

Open desmodus1984 opened this issue 3 months ago • 4 comments

Hi, I need to process some FASTA files for synteny analysis. I used seqkit, but it generated empty files. The script is this:

for i in *.fasta
        do
        base=$(basename $i ".fasta")

#Rename sequences
seqkit replace -p "(.*)" --replacement "${base}CHR" ${base}.fasta > ${base}-ren.fasta

head ${base}-ren.fasta

#Sort by length
seqkit sort -l -r ${base}-ren.fasta > ${base}-ren.fasta

#Filter by length min 1M
seqkit seq -m 1000000 -g ${base}-ren.fasta > ${base}-1M.fasta

# Check stats
seqkit stats ${base}-1M.fasta

I saw it generate some files at first, but then they became empty:

3198678242 Jun 13 15:51 LWed.fasta
2450740935 Jun 13 16:00 NSch.fasta
2460756707 Jul 11 10:43 MAng.fasta
2495284749 Jul 14 14:27 HLep.fasta
2485064101 Sep 22 12:17 HL2.fasta
      1323 Sep 22 17:48 chromsync-prep.sh.save
      1463 Sep 22 18:01 chromsync-prep.sh
         0 Sep 22 18:05 HL2-ren.fasta
         0 Sep 22 18:05 HL2-1M.fasta
         0 Sep 22 18:06 HLep-ren.fasta
         0 Sep 22 18:06 HLep-1M.fasta
         0 Sep 22 18:07 LWed-ren.fasta
         0 Sep 22 18:07 LWed-1M.fasta
         0 Sep 22 18:08 MAng-ren.fasta
         0 Sep 22 18:08 MAng-1M.fasta
         0 Sep 22 18:09 NSch-ren.fasta
         0 Sep 22 18:09 NSch-1M.fasta

Any reason for this? Is it possible to pipe the different commands to generate the sorted/filtered/renamed files?

Thanks;

desmodus1984 avatar Sep 22 '25 22:09 desmodus1984

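Here, you overwrite the input file. With >, the shell truncates the output file before seqkit starts reading it, so seqkit sees an empty input. The output name should be different from the input.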

seqkit sort -l -r ${base}-ren.fasta > ${base}-ren.fasta

If you use the -o option, it will be OK for this command (but not for others; I'll add input/output file checking).

seqkit sort -l -r ${base}-ren.fasta -o ${base}-ren.fasta
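The difference between the two forms can be reproduced without seqkit. The sketch below uses the standard sort utility as a stand-in for seqkit sort; the file names are made up for illustration, and it only shows the shell behaviour, not anything about seqkit's internals.

```shell
#!/usr/bin/env bash
# Illustration only: coreutils `sort` stands in for `seqkit sort`.
workdir=$(mktemp -d)
printf 'b\na\nc\n' > "$workdir/redirect.txt"
cp "$workdir/redirect.txt" "$workdir/option.txt"

# With `>`, the shell truncates redirect.txt before sort opens it,
# so sort reads an empty file and writes nothing back.
sort "$workdir/redirect.txt" > "$workdir/redirect.txt"

# With -o, sort opens the output itself, after it has read all of its input.
sort -o "$workdir/option.txt" "$workdir/option.txt"

redirect_size=$(wc -c < "$workdir/redirect.txt" | tr -d ' ')
option_size=$(wc -c < "$workdir/option.txt" | tr -d ' ')
echo "redirect: ${redirect_size} bytes, -o: ${option_size} bytes"
# prints "redirect: 0 bytes, -o: 6 bytes"
```

Writing to a different output name, as suggested above, avoids the problem in both cases.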

shenwei356 avatar Sep 23 '25 00:09 shenwei356

Can I pipe (|) any of those commands together, such as rename and sort, or sort and filter?

Juan Pablo Aguilar


desmodus1984 avatar Sep 23 '25 01:09 desmodus1984

Yes, just give it a try.
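As a rough sketch, the three steps from the original script could be chained like this. It assumes that seqkit subcommands read from standard input when no input file is given, and it reuses the flags from the script unchanged. The helper names out_name and prep_genome are hypothetical, not part of seqkit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the output name from an input FASTA path.
out_name() {
    printf '%s-1M.fasta\n' "$(basename "$1" .fasta)"
}

# Hypothetical helper: rename, sort by length (longest first), then keep
# sequences of at least 1 Mb. No intermediate file is written, so no
# command ever reads the file it is writing to.
prep_genome() {
    local input=$1 base out
    base=$(basename "$input" .fasta)
    out=$(out_name "$input")
    seqkit replace -p "(.*)" --replacement "${base}CHR" "$input" \
        | seqkit sort -l -r \
        | seqkit seq -m 1000000 -g -o "$out"
    seqkit stats "$out"
}

# Process the files given on the command line, skipping earlier outputs.
for f in "$@"; do
    case "$f" in
        *-1M.fasta) continue ;;
    esac
    prep_genome "$f"
done
```

Saved as a script (for example prep.sh, a made-up name), it could be run as bash prep.sh *.fasta. Each input such as HL2.fasta would then produce HL2-1M.fasta.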

shenwei356 avatar Sep 23 '25 01:09 shenwei356

Input/output file checking has been added; it handles normal paths, relative paths, and symbolic links.

$ ls -lh
total 4.0K
lrwxrwxrwx 1 shenwei shenwei  7 Sep 26 23:43 symbol_link_to_test.fa -> test.fa
-rw-r--r-- 1 shenwei shenwei 11 Sep 26 23:44 test.fa

# same file name
$ seqkit seq test.fa -o test.fa
[ERRO] input and output files cannot be the same

# soft link
$ seqkit seq test.fa -o symbol_link_to_test.fa 
[ERRO] input and output files cannot be the same

# relative path
$ seqkit seq test.fa -o dir/../test.fa 
[ERRO] input and output files cannot be the same

# valid
$ seqkit seq test.fa -o out.fa
$

Please give it a try.
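For scripts that may run against a seqkit version without this check, a similar guard can be approximated in the shell. This is only a sketch of the idea, not how seqkit implements it. The helper same_file is hypothetical and relies on the -ef test operator, which compares the underlying file rather than the path string. The file names mirror the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when both paths resolve to the same existing
# file (symbolic links and relative path components are resolved by the OS).
# It fails if either path does not exist.
same_file() {
    [ "$1" -ef "$2" ]
}

workdir=$(mktemp -d)
mkdir "$workdir/dir"
printf '>s1\nACGT\n' > "$workdir/test.fa"
ln -s test.fa "$workdir/symbol_link_to_test.fa"

# Check each candidate output path against the input before writing to it.
for out in test.fa symbol_link_to_test.fa dir/../test.fa out.fa; do
    if same_file "$workdir/test.fa" "$workdir/$out"; then
        echo "$out: same as input, refusing to write"
    else
        echo "$out: safe to write"
    fi
done
```

The first three candidates are reported as the same file as the input, and only out.fa is reported as safe.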

shenwei356 avatar Sep 26 '25 15:09 shenwei356