Error in dataprep step

Open huawen-poppy opened this issue 9 months ago • 7 comments

Hello, thanks for your great tool!

I have recently been trying to run xpore on my data, but I get the following error:

File "pandas/_libs/lib.pyx", line 2411, in pandas._libs.lib.maybe_convert_numeric
ValueError: Unable to parse string "114.817,115.635,109.092,117.816,123.814,102.277" at position 0

Could you please help me fix this problem? Thank you very much!

huawen-poppy avatar Sep 19 '23 14:09 huawen-poppy

Hi @huawen-poppy,

Do you mind sharing the command you used and the head of each input file (eventalign.txt, GTF, and FASTA), please?

Thanks!

Best wishes, Yuk Kei

yuukiiwa avatar Sep 21 '23 00:09 yuukiiwa

Thank you for your response! I was using the command:

xpore dataprep --eventalign eventalign.txt --gtf_or_gff CC7.gtf --transcript_fasta aip.genome_models.no_isoforms.no_duplication.mRNA.fa --out_dir ./output --n_process 32

I figured out that the error came from the eventalign file, which had an extra column containing strings such as '114.817,115.635,109.092,117.816,123.814,102.277'. I have now deleted the extra column, but another error appears:

`/home/zhonh0b/miniconda3/envs/epigenetic/lib/python3.8/site-packages/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/home/zhonh0b/miniconda3/envs/epigenetic/lib/python3.8/site-packages/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/home/zhonh0b/miniconda3/envs/epigenetic/lib/python3.8/site-packages/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/home/zhonh0b/miniconda3/envs/epigenetic/lib/python3.8/site-packages/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/home/zhonh0b/miniconda3/envs/epigenetic/lib/python3.8/site-packages/xpore/scripts/dataprep.py:72: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
chunk_split['line_length'] = np.array(lines)`

The header of the eventalign.txt file is: [image]

The header of the gtf file is: [image]

The header of the fasta file is: [image]

Could you please help me solve this problem? Thanks!

huawen-poppy avatar Sep 23 '23 11:09 huawen-poppy
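
The first error in this thread can be reproduced on toy data. The following is a minimal sketch, not xpore's own code: the table below is made up, and the column name `samples` for the extra comma-joined column is an assumption (the thread does not name it). It shows pandas refusing to parse such a string as a number, and one way to drop the column before running dataprep.

```python
import io

import pandas as pd

# Made-up, tab-separated stand-in for an eventalign file.
# The "samples" column name is an assumption, used only for illustration.
toy_eventalign = (
    "contig\tposition\tevent_level_mean\tsamples\n"
    "tx1\t10\t112.5\t114.817,115.635,109.092\n"
    "tx1\t11\t98.3\t101.2,99.8\n"
)
df = pd.read_csv(io.StringIO(toy_eventalign), sep="\t")

# A comma-joined list of floats is not a single number, so conversion fails.
try:
    pd.to_numeric(df["samples"])
    parse_error = None
except ValueError as exc:
    parse_error = str(exc)
print(parse_error)

# Drop the extra column and serialise the table back to tab-separated text.
cleaned = df.drop(columns=["samples"])
cleaned_text = cleaned.to_csv(sep="\t", index=False)
print(cleaned_text)
```

For a real, large eventalign file, reading everything into memory may not be practical; a streaming tool that cuts the column would do the same job.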

Hi @huawen-poppy,

To convert the transcript position to genome position, you will have to include the --genome flag too:

xpore dataprep --eventalign eventalign.txt --gtf_or_gff CC7.gtf --transcript_fasta aip.genome_models.no_isoforms.no_duplication.mRNA.fa --genome --out_dir ./output --n_process 32

Do you mind sharing the full error message from xpore diffmod, please?

Thanks!

Best wishes, Yuk Kei

yuukiiwa avatar Sep 25 '23 09:09 yuukiiwa

Hi. Thanks for your reply! I am still running the xpore diffmod process. So far there are no error messages. The head of the diffmod.table looks like: [image]

Do you think I should cancel the current job and rerun xpore dataprep with the --genome flag added?

huawen-poppy avatar Sep 25 '23 09:09 huawen-poppy

Hi @huawen-poppy,

If you don't need to convert the transcript coordinates to genomic coordinates, then you don't need the --genome, --gtf_or_gff, and --transcript_fasta flags.

I have just noticed that the sequences of your fasta file are not capitalized. If you need transcript-to-genomic coordinate conversion, you can try capitalizing them.

Thanks!

Best wishes, Yuk Kei

yuukiiwa avatar Oct 02 '23 03:10 yuukiiwa
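
The capitalization suggested above can be done with a few lines of Python. This is a minimal sketch, not part of xpore: `uppercase_fasta` is a hypothetical helper, the example records are made up, and the file names in the commented usage are placeholders.

```python
from typing import Iterable, Iterator


def uppercase_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines with sequence lines upper-cased and header lines untouched."""
    for line in lines:
        # Header lines start with ">" and may contain mixed-case identifiers.
        yield line if line.startswith(">") else line.upper()


# Made-up records for illustration.
example = [">tx1 some description\n", "acgtnACGT\n", ">tx2\n", "ttga\n"]
result = list(uppercase_fasta(example))
print("".join(result), end="")

# To convert a real file (placeholder names):
# with open("transcripts.fa") as src, open("transcripts.upper.fa", "w") as dst:
#     dst.writelines(uppercase_fasta(src))
```

Writing to a new file rather than overwriting the original keeps the lower-case version available in case soft-masking information is needed elsewhere.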

Hi,

I have encountered the "PerformanceWarning" as well. eventalign.index is still generated. Does the warning affect the results?

Thanks!

Andrea

AndreaYCT avatar Mar 07 '24 01:03 AndreaYCT

Hi @AndreaYCT,

This is a warning that doesn't affect the results.

Thanks!

Best wishes, Yuk Kei

yuukiiwa avatar Mar 07 '24 21:03 yuukiiwa
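
As general pandas background, "indexing past lexsort depth" refers to label lookups on a MultiIndex that is not sorted; the lookup is slower but selects the same rows. The sketch below illustrates this on made-up data and is not xpore's code: the index names and values are assumptions chosen only to resemble the `line_length` lookup in the warning.

```python
import warnings

import pandas as pd

# Made-up, deliberately unsorted and non-unique MultiIndex.
idx = pd.MultiIndex.from_tuples(
    [("t2", 1), ("t1", 1), ("t1", 1), ("t2", 2)],
    names=["transcript_id", "read_index"],
)
df = pd.DataFrame({"line_length": [10, 20, 30, 40]}, index=idx)

# Lookup on the unsorted index; silence the PerformanceWarning if pandas emits it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=pd.errors.PerformanceWarning)
    unsorted_sum = int(df.loc[("t1", 1), "line_length"].sum())

# The same lookup after sorting the index, which avoids the slow path.
sorted_sum = int(df.sort_index().loc[("t1", 1), "line_length"].sum())

print(unsorted_sum, sorted_sum)
```

Both lookups select the same two rows, so the sums agree; only the speed of the lookup differs.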