
Trouble with kmer_query command: empty .json.fasta files

Open ZoeHansen opened this issue 3 years ago • 2 comments

Hello! Thank you for all your work curating RGI and these many useful tools! I have successfully run my metagenomic assemblies through rgi main and now have the rgi_main.json files on hand. I have been trying to run the rgi kmer_query command on these RGI output files, but am having some trouble.

Here is my script:

# Read one sample ID per line; the || clause handles a last line with no trailing newline.
while IFS= read -r i || [[ -n "$i" ]]
do
    # Load the CARD and WildCARD reference data into the local database.
    rgi load --wildcard_annotation "$HOME/rgi/wildcard_database_v3.0.8.fasta" \
        --wildcard_index "$HOME/rgi/wildcard/index-for-model-sequences.txt" \
        --card_annotation "$HOME/rgi/card_database_v3.1.0.fasta" \
        --kmer_database "$HOME/rgi/wildcard/61_kmer_db.json" \
        --amr_kmers "$HOME/rgi/wildcard/all_amr_61mers.txt" --kmer_size 61 \
        --local

    # Run the k-mer classifier on this sample's rgi main output.
    rgi kmer_query --rgi --kmer_size 61 \
        --threads 50 --minimum 10 \
        --input "/mnt/gs18/scratch/users/hansenzo/RGI_assemblies/$i.rgi.annotation.json" \
        --output "/mnt/gs18/scratch/users/hansenzo/RGI_assemblies/$i.rgi_kmer_classifier" \
        --local

    echo "RGI k-mer classifier complete: $i"
done < "$HOME/ERIN_samples_IDs.txt"

My script runs without error. However, the summary.txt files from the k-mer analysis contain only a header row. Looking further, the rgi.annotation.json.fasta files that the kmer_query command creates are empty. Is there a reason these FASTA files might not be written? The original RGI .json and .txt files are fairly large, so I'm not sure whether they are being read improperly or whether something else is wrong.
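For anyone triaging the same symptom across many samples, a small check like the one below can show how many outputs are affected. It is only a sketch: the function names are hypothetical, it assumes the intermediate files end in .json.fasta as described above, and it takes the summary file path as an argument because the exact summary file name is not stated in this thread.

```shell
#!/bin/sh
# Hypothetical helpers (not part of RGI) for spotting empty k-mer outputs.

# Print every zero-byte *.json.fasta file found under the given directory.
list_empty_fasta() {
    find "$1" -type f -name '*.json.fasta' -size 0 | sort
}

# Succeed when the file has at most one line, i.e. nothing beyond a header row.
is_header_only() {
    # Arithmetic expansion strips any padding that wc adds on some systems.
    line_count=$(( $(wc -l < "$1") ))
    [ "$line_count" -le 1 ]
}
```

Calling list_empty_fasta on the output directory lists the affected samples, and is_header_only can be run on each summary file to confirm that only the header row was written.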

I tried re-running this command with the --debug flag included, and a sample of that output is included here: RGI_kmer_query_debug_output.txt

Any insight is greatly appreciated! Thank you!

ZoeHansen avatar Jul 01 '21 13:07 ZoeHansen

@ZoeHansen RGI now has an auto_load function. Can you try that instead of load? I suspect one of the files may not be formatted correctly. Try rgi auto_load and let us know. Cheers.

raphenya avatar Oct 27 '21 17:10 raphenya

@ZoeHansen Any update on this?

raphenya avatar Feb 25 '22 19:02 raphenya

I'm new to this, so I apologize if this issue has an obvious answer. I am having exactly the same issue with the kmer_query command as described above. I am using card_database_v3.2.5 and wildcard_database_v4.0.0.fasta. When I auto_load the reference data, the layout (folders, subfolders, files) is different from what I get when I manually load and clean the reference files. Is there a set of instructions for arranging the auto-loaded files so that the bwt and kmer_query commands can find them? The load command also does not seem to recognize the auto-loaded card.json file. Thanks!
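One way to make the layout difference concrete when reporting it is to diff the two directory trees. The sketch below uses only standard tools; the function names are hypothetical, and the two directories passed in are whatever locations the manual load and auto_load runs wrote to on your system (this thread does not say where those are).

```shell
#!/bin/sh
# Hypothetical helpers (not part of RGI) for comparing two directory layouts.

# List every path under a directory, relative to its root, in sorted order.
list_layout() {
    (cd "$1" && find . | sort)
}

# Print entries present in only one tree; returns non-zero when layouts differ.
compare_layouts() {
    left=$(mktemp)
    right=$(mktemp)
    list_layout "$1" > "$left"
    list_layout "$2" > "$right"
    diff "$left" "$right"
    status=$?
    rm -f "$left" "$right"
    return "$status"
}
```

Running compare_layouts with the manually loaded directory first and the auto-loaded directory second prints lines starting with < for paths found only in the first tree and > for paths found only in the second, which can be pasted into the issue as is.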

StephanieAFlowers avatar Jan 18 '23 16:01 StephanieAFlowers

Issue is stale and will be closed in 7 days unless there is new activity

github-actions[bot] avatar Oct 17 '23 11:10 github-actions[bot]