AmpliconArchitect
A problem about "ValueError: cannot convert float NaN to integer"
Hello, I ran into a problem when running AA on my data.
The AmpliconArchitect version is 1.2
my command: python /home/tang/tools/AmpliconArchitect/src/AmpliconArchitect.py --bed ../NCI-H716_AA_CNV_SEEDS.bed --bam ../NCI-H716_hg38_align_sort_rmdup.bam --downsample 0 --out NCI-H716_ecDNA --ref GRCh38
The process:
[root:INFO] #TIME 6.673 Loading libraries and reference annotations for: GRCh38
Global ref name is GRCh38
[root:INFO] #TIME 16.012 Initiating bam_to_breakpoint object for: ../NCI-H716_hg38_align_sort_rmdup.bam
[root:INFO] #TIME 16.013 Exploring interval: chr10 121465452 121845452
Traceback (most recent call last):
File "/home/tang/tools/AmpliconArchitect/src/AmpliconArchitect.py", line 214, in
What should I do to solve this problem? Thanks!
Hi,
I have not seen this error before. Could you please upload any log files produced by AA? Could you also clarify which aligner was used to generate the BAM file?
Thanks, Jens
I used bwa 0.7.17.
Here is the full log content:
INFO:root:Commandline: /home/tang/tools/AmpliconArchitect/src/AmpliconArchitect.py --bed ../NCI-H716_AA_CNV_SEEDS.bed --bam ../NCI-H716_hg38_align_sort_rmdup.bam --downsample 0 --out NCI-H716_ecDNA --ref GRCh38
INFO:root:AmpliconArchitect version 1.2
INFO:root:#TIME 7.035 Loading libraries and reference annotations for: GRCh38
INFO:root:#TIME 16.572 Initiating bam_to_breakpoint object for: ../NCI-H716_hg38_align_sort_rmdup.bam
INFO:root:#TIME 16.573 Exploring interval: chr10 121465452 121845452
DEBUG:root:#TIME 16.595 interval_hops: init chr10 121465452 121845452
DEBUG:root:#TIME 52.786 discordant edges chr10 121845452 121855452
DEBUG:root:#TIME 52.943 discordant edges: fetch discordant chr10 121845452 121855452 942 702
DEBUG:root:#TIME 52.943 discordant edges: discordant read pairs found: chr10 121845452 121855452 942 702
INFO:root:Commandline: /home/tang/tools/AmpliconArchitect/src/AmpliconArchitect.py --bed ../NCI-H716_AA_CNV_SEEDS.bed --bam ../NCI-H716_hg38_align_sort.bam --downsample 0 --out NCI-H716_ecDNA --ref GRCh38
INFO:root:AmpliconArchitect version 1.2
INFO:root:#TIME 6.915 Loading libraries and reference annotations for: GRCh38
INFO:root:#TIME 16.398 Initiating bam_to_breakpoint object for: ../NCI-H716_hg38_align_sort.bam
INFO:root:#TIME 60.490 Exploring interval: chr10 121465452 121845452
DEBUG:root:#TIME 60.511 interval_hops: init chr10 121465452 121845452
DEBUG:root:#TIME 103.041 discordant edges chr10 121845452 121855452
DEBUG:root:#TIME 103.337 discordant edges: fetch discordant chr10 121845452 121855452 1404 1084
DEBUG:root:#TIME 103.337 discordant edges: discordant read pairs found: chr10 121845452 121855452 1404 1084
Hi,
I believe the issue is that somehow NaN is assigned for the mean read_length or max_insert size. Can you please try the following:
- Always remove temporary results from the output folder before re-running AA in the same folder. It appears AA was run twice in the same folder with different BAM files.
- Please clear the contents of the file "coverage.stats" from your AA_DATA_REPO. Do not delete the file, but please clear its contents.
- Is the BAM file coordinate sorted?
Best, Jens
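A minimal Python sketch of two of the checks above, under stated assumptions: the BAM header text is obtained separately (for example with `samtools view -H`), and "coverage.stats" sits directly inside the directory that `AA_DATA_REPO` points to. The function names `sort_order_from_header` and `clear_file_contents` are hypothetical helpers for illustration and are not part of AA.

```python
from pathlib import Path


def sort_order_from_header(header_text):
    """Return the SO (sort order) value from the @HD header line, or None.

    header_text is the plain-text SAM/BAM header, e.g. the output of
    `samtools view -H file.bam`. A coordinate-sorted file reports
    "coordinate".
    """
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def clear_file_contents(path):
    """Truncate a file to zero bytes without deleting the file itself."""
    with open(path, "w"):
        pass


# Illustrative header text (not taken from the BAM file in this thread).
example_header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr10\tLN:133797422\n"
print(sort_order_from_header(example_header))

# Assumed location of the stats file; adjust to your own setup:
# clear_file_contents(Path(os.environ["AA_DATA_REPO"]) / "coverage.stats")
```

If the function returns anything other than "coordinate", the BAM file would need to be re-sorted (for example with `samtools sort`) before running AA.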
Following your advice, I checked my BAM file and found my mistake: I had used the R1.fq file twice when aligning.
I'm a newbie in this field. I have read some of the related literature, and I found that the complete sequence of ecDNA is not given in those papers. May I ask: is there any way to accurately obtain the sequence of ecDNA? If so, how can I get the complete ecDNA sequence?
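This kind of mix-up can be caught before alignment. The sketch below compares the leading reads of the two FASTQ files, assuming standard 4-line FASTQ records; the helper names (`open_fastq`, `leading_sequences`, `mates_look_identical`) and the file names in the usage comment are hypothetical.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def leading_sequences(path, n_records=1000):
    """Return the sequence lines of the first n_records FASTQ records."""
    with open_fastq(path) as handle:
        # In 4-line FASTQ, the sequence is the 2nd line of each record,
        # i.e. line indices 1, 5, 9, ...
        return [line.strip() for line in islice(handle, 1, 4 * n_records, 4)]


def mates_look_identical(r1_path, r2_path, n_records=1000):
    """True if both files start with exactly the same read sequences.

    Genuine R1/R2 mates come from opposite ends of each fragment, so a
    long run of identical sequences suggests the same file was supplied
    for both mates.
    """
    seqs1 = leading_sequences(r1_path, n_records)
    seqs2 = leading_sequences(r2_path, n_records)
    return bool(seqs1) and seqs1 == seqs2


# Example usage (hypothetical file names):
# if mates_look_identical("sample_R1.fq", "sample_R2.fq"):
#     print("Warning: R1 and R2 appear to contain the same reads.")
```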
Thank you very much! ZhengTang
Hi ZhengTang,
No problem. With NGS data alone, we can only make bioinformatic predictions of ecDNA regions given breakpoint graphs. The confidence of those predictions increases when FISH data or other cytogenetics are also present to validate them. Lastly, long-range sequencing can aid in validating the structures suggested by NGS, or in some cases can be used on its own to derive a structure. The task of identifying whether ecDNA is present and the problem of resolving its structure are actually quite different. Detecting ecDNA presence from NGS alone is an easier problem than resolving the complete sequence of the ecDNA amplicon.
If you would like to make predictions of ecDNA regions based on your NGS data, please see AmpliconClassifier.
Best, Jens
I get it. But I’m curious whether there is an experimental method to isolate ecDNA for analysis. Would that allow us to get the complete structure of the ecDNA? Thank you very much!
Hi,
Some groups have had success isolating ecDNA using the Circle-Seq technique, for example in the following publications: https://www.nature.com/articles/s41588-019-0547-z https://www.nature.com/articles/s41588-020-0678-2
Best regards, Jens