xpore
Errors while running xpore dataprep
Dear GoekeLab,
I am trying to run xpore on our institute's cluster. Everything goes well with the demo data, but I get this error/warning when running xpore dataprep on my own data. Do you have any idea what causes it and how to fix it?
Error. nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)
/home/mycomputer/.local/lib/python3.7/site-packages/xpore-2.1-py3.7.egg/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/home/mycomputer/.local/lib/python3.7/site-packages/xpore-2.1-py3.7.egg/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Best regards,
Jeremy
Hi Jeremy,
Thanks for reaching out! It would be great if you could provide the command you used to run xpore dataprep. Other than that, you can also look into the following two things:
- After you see this error/warning, is xpore dataprep still generating the dataprep/data.json file (check whether it increases in size with ls -lh dataprep/data.json)? If yes, xpore dataprep is still running fine.
- What value did you pass for --n_processes? Is it larger than your environment variable "NUMEXPR_MAX_THREADS"? If yes, you might want to either change --n_processes to a smaller value or increase the value of "NUMEXPR_MAX_THREADS".
Best wishes, Yuk Kei
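The first of the two remedies in the second check (lowering --n_processes) can be sketched in Python. This is only a minimal sketch: the helper name safe_n_processes is hypothetical and not part of xpore, and the fallback of 64 is an assumption taken from the "(64)" in the error message.

```python
import os


def safe_n_processes(requested, env=None, default_limit=64):
    """Return a process count no larger than NUMEXPR_MAX_THREADS.

    default_limit is an assumption (the 64 shown in the error message);
    it is used only when the variable is not set.
    """
    env = os.environ if env is None else env
    limit = int(env.get("NUMEXPR_MAX_THREADS", default_limit))
    # Never return less than one process.
    return max(1, min(requested, limit))


if __name__ == "__main__":
    # Requesting 128 processes with a limit of 64 is clamped to 64.
    print(safe_n_processes(128, {"NUMEXPR_MAX_THREADS": "64"}))
```

The returned value could then be passed to --n_processes instead of the original number.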
Hi Yuk Kei,
I’m working with Jeremy on running xpore dataprep.
Here is the command I used for running xpore dataprep:
xpore dataprep \
--eventalign "eventalign_Araport11_GTF_genes_transposons-col0.txt" \
--gtf_or_gff "Araport11_GTF_genes_transposons_final_xpore.sorted.gtf" \
--transcript_fasta "Araport11_GTF_genes_transposons.fa" \
--out_dir dataprep \
--genome
After seeing the error/warning, xpore dataprep only generated the eventalign.index file. No other output files are generated when I try to run xpore dataprep.
Best, Erika
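To report exactly which outputs exist and whether they are empty, something like the following could be run against the output directory. It is a rough sketch: the function name output_status is hypothetical, and the file list contains only the names mentioned in this thread, which may not be the complete set that xpore dataprep writes.

```python
from pathlib import Path

# File names mentioned in this thread; possibly not the full list.
EXPECTED_OUTPUTS = (
    "eventalign.index",
    "data.json",
    "data.index",
    "data.log",
    "data.readcount",
)


def output_status(out_dir):
    """Map each expected file name to its size in bytes, or None if missing."""
    status = {}
    for name in EXPECTED_OUTPUTS:
        path = Path(out_dir) / name
        status[name] = path.stat().st_size if path.is_file() else None
    return status


if __name__ == "__main__":
    for name, size in output_status("dataprep").items():
        print(name, "missing" if size is None else f"{size} bytes")
```

Running it twice a few minutes apart also shows whether data.json is still growing.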
Hi Erika,
Thank you for the information! Do you mind showing me the head of eventalign_Araport11_GTF_genes_transposons-col0.txt, Araport11_GTF_genes_transposons_final_xpore.sorted.gtf, and Araport11_GTF_genes_transposons.fa, please? I suspect that this might be due to a customized gtf file.
Thanks!
Best wishes, Yuk Kei
Hi Yuk Kei,
Here is the head for the eventalign.txt, GTF, and FASTA files.
eventalign_Araport11_GTF_genes_transposons-col0.txt:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
AT1G01020.2 426 TTCTG 29 t 429 78.67 1.821 0.00664 TTCTG 79.59 2.07 -0.36 29062 29082
AT1G01020.2 426 TTCTG 29 t 430 82.91 1.990 0.00332 TTCTG 79.59 2.07 1.32 29052 29062
AT1G01020.2 427 TCTGA 29 t 431 95.35 1.866 0.00232 TCTGA 91.37 2.85 1.15 29045 29052
AT1G01020.2 427 TCTGA 29 t 432 99.25 1.877 0.00631 TCTGA 91.37 2.85 2.27 29026 29045
AT1G01020.2 427 TCTGA 29 t 433 94.57 2.016 0.00266 TCTGA 91.37 2.85 0.92 29018 29026
AT1G01020.2 427 TCTGA 29 t 434 98.04 1.761 0.00797 TCTGA 91.37 2.85 1.92 28994 29018
AT1G01020.2 428 CTGAT 29 t 435 122.09 3.429 0.00730 CTGAT 111.64 4.49 1.91 28972 28994
AT1G01020.2 428 CTGAT 29 t 436 117.08 2.426 0.00299 CTGAT 111.64 4.49 0.99 28963 28972
AT1G01020.2 429 TGATT 29 t 437 136.43 6.966 0.00266 TGATT 127.73 5.10 1.40 28955 28963
Araport11_GTF_genes_transposons_final_xpore.sorted.gtf:
1 Araport11 transcript 3631 5899 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 3631 3913 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 3996 4276 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 4486 4605 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 4706 5095 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 5174 5326 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 5439 5899 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 6788 7069 . - . gene_id "AT1G01020"; transcript_id "AT1G01020.2";
1 Araport11 exon 6788 7069 . - . gene_id "AT1G01020"; transcript_id "AT1G01020.6";
1 Araport11 exon 6788 7069 . - . gene_id "AT1G01020"; transcript_id "AT1G01020.1";
Araport11_GTF_genes_transposons.fa:
>AT1G01010.1
AAATTATTAGATATACCAAACCAGAGAAAACAAATACATAATCGGAGAAATACAGATTACAGAGAGCGAG
AGAGATCGACGGCGAAGCTCTTTACCCGGAAACCATTGAAATCGGACGGTTTAGTGAAAATGGAGGATCA
AGTTGGGTTTGGGTTCCGTCCGAACGACGAGGAGCTCGTTGGTCACTATCTCCGTAACAAAATCGAAGGA
AACACTAGCCGCGACGTTGAAGTAGCCATCAGCGAGGTCAACATCTGTAGCTACGATCCTTGGAACTTGC
GCTTCCAGTCAAAGTACAAATCGAGAGATGCTATGTGGTACTTCTTCTCTCGTAGAGAAAACAACAAAGG
GAATCGACAGAGCAGGACAACGGTTTCTGGTAAATGGAAGCTTACCGGAGAATCTGTTGAGGTCAAGGAC
CAGTGGGGATTTTGTAGTGAGGGCTTTCGTGGTAAGATTGGTCATAAAAGGGTTTTGGTGTTCCTCGATG
GAAGATACCCTGACAAAACCAAATCTGATTGGGTTATCCACGAGTTCCACTACGACCTCTTACCAGAACA
TCAGAGGACATATGTCATCTGCAGACTTGAGTACAAGGGTGATGATGCGGACATTCTATCTGCTTATGCA
Thank you, Erika
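A quick way to compare the three files beyond their first lines is to check that the identifiers agree. The sketch below assumes that the contig column of eventalign.txt is expected to match the transcript_id attribute in the GTF and the header names in the FASTA; that is an assumption about what xpore dataprep needs, and all function names are hypothetical.

```python
import re


def eventalign_contigs(lines):
    """Contig names (first column) from eventalign rows, skipping the header."""
    return {line.split()[0] for line in lines[1:] if line.strip()}


def gtf_transcript_ids(lines):
    """All transcript_id values found in the GTF attribute column."""
    return set(re.findall(r'transcript_id "([^"]+)"', "\n".join(lines)))


def fasta_ids(lines):
    """Sequence names from FASTA header lines (text after '>' up to whitespace)."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def missing_contigs(eventalign, gtf, fasta):
    """Contigs seen in eventalign that are absent from the GTF or the FASTA."""
    contigs = eventalign_contigs(eventalign)
    return {
        "not_in_gtf": contigs - gtf_transcript_ids(gtf),
        "not_in_fasta": contigs - fasta_ids(fasta),
    }
```

Each argument is a list of lines, for example from open(path).read().splitlines(). The full files should be used, since a head will naturally miss most identifiers.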
Hi Erika,
Thank you for sharing the eventalign.txt, GTF, and FASTA files! Those should be compatible with xpore dataprep.
I think you should look into the first line of the error message, Error. nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS". Contacting the cluster maintainers of your institute should help with that.
Thanks!
Best wishes, Yuk Kei
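If the cluster allows it, one way to act on that first line is to set NUMEXPR_MAX_THREADS only for the xpore dataprep process. The following is a minimal sketch, not a confirmed fix: the function name dataprep_invocation is hypothetical, the flags are the ones quoted in this thread, and whether raising the limit is appropriate on a given cluster is a question for its maintainers.

```python
import os


def dataprep_invocation(eventalign, gtf, fasta, out_dir,
                        n_processes, max_threads, base_env=None):
    """Build the command list and environment for an xpore dataprep run."""
    if n_processes > max_threads:
        raise ValueError("n_processes must not exceed max_threads")
    # Copy the environment so the caller's variables stay untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["NUMEXPR_MAX_THREADS"] = str(max_threads)
    cmd = [
        "xpore", "dataprep",
        "--eventalign", eventalign,
        "--gtf_or_gff", gtf,
        "--transcript_fasta", fasta,
        "--out_dir", out_dir,
        "--genome",
        "--n_processes", str(n_processes),
    ]
    return cmd, env
```

The pair can then be handed to subprocess.run(cmd, env=env, check=True) from the standard library.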
Hello Yuk Kei,
I am also trying to use xpore dataprep and have encountered the same problem: dataprep/eventalign.index is generated, but data.json, data.index, data.log, and data.readcount are empty. I have no idea what causes this; may I ask for your help?
The command I used to run xpore dataprep is:
xpore dataprep \
--eventalign data/${file}/nanopolish/eventalign.txt \
--gtf_or_gff all.gtf \
--transcript_fasta ref.fa \
--out_dir data/${file}/dataprep \
--genome
I got the following output:
/mycomputer/miniconda3/lib/python3.9/site-packages/xpore-2.1-py3.9.egg/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/mycomputer/miniconda3/lib/python3.9/site-packages/xpore-2.1-py3.9.egg/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
My eventalign.txt, GTF, and FASTA files all look like @erika-fukuhara's. Did you solve this problem, or do you have any suggestions?
Thank you! Jeffer
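Note that PerformanceWarning and SettingWithCopyWarning are pandas warnings, which by default do not stop a script. To separate them from real error lines in a saved log, a heuristic like the one below may help. It is a sketch only: the function name split_log and the patterns are assumptions based on the messages quoted in this thread.

```python
import re

# Matches e.g. "dataprep.py:21: PerformanceWarning:" and captures the category.
WARNING_RE = re.compile(r":\d+: (\w+Warning):")


def split_log(lines):
    """Return (warning categories, error lines) found in log lines."""
    warnings_found, errors_found = [], []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            warnings_found.append(match.group(1))
        elif line.startswith("Error") or "Traceback" in line:
            errors_found.append(line.strip())
    return warnings_found, errors_found
```

Applied to the output quoted above, this reports two warning categories and no error line, which suggests the empty files have a different cause than these warnings.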
Hey,
I'm having the same problem. I ran xpore dataprep, but data.json, data.log, and the other files are empty.
Do you know how we can fix it?