pafscaff

Open mimulusm opened this issue 4 years ago • 6 comments

Hello,

This R package helps me a lot! I am trying to plot an ideogram from the output of bgc. The species I am working on has a draft genome, and I aligned it to the chromosome level before running bgc. Is it possible to plot the output using the fasta file we mapped the reads to, or something similar, instead of the PAFScaff output file? Alternatively, perhaps I could create a dummy version of the .scaffolds.tdt file that PAFScaff produces. Can you give me details of the contents of the .scaffolds.tdt file?

Thank you very much in advance!

mimulusm avatar Jul 26 '21 01:07 mimulusm

Hi, the function would need to be modified, since it currently requires the .scaffolds.tdt file. Here's an example of what that file looks like.

Qry     Scaffold        Description     QryLen  QryStart        QryEnd  Strand  Ref     RefLen  RefStart        RefEnd  RefMap  Identity        Length  Coverage        N       Rank    Inv     Trans
query1.01       1  1 len=51.46 Mb 1.03% 1(+) 74,333:77,337,987; 0.90% 1(-); 8.98% other;      51464084        7953    51441722        +       1       77392008        74333   77337987        Anchored        485780  541756  531399.0        446     1       465116.0        4619909.0
query1.02       1  1 len=43.32 Mb 8.98% 1(+) 5,070:77,338,662; 1.21% 1(-); 12.20% other;      43319783        18524   43288930        +       1       77392008        5070    77338662        Placed  3613688 4005849 3890364.0       571     1       525171.0        5283668.0

It might be a while before we can work on that, so in the meantime, if you think you can spoof that format, that might be the fastest way.
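
For anyone trying the spoofing route, here is a minimal Python sketch of how a dummy .scaffolds.tdt could be written. The column names are copied from the example header above; everything else is an assumption. In particular, the placeholder values (each chromosome mapped onto itself, `Anchored`, zero for `N`, `Inv` and `Trans`) are guesses based on the two example rows, and which columns the ideogram function actually reads has not been confirmed in this thread. The function names and the `chrom_lengths` input are hypothetical.

```python
import csv

# Column names copied verbatim from the example .scaffolds.tdt header.
TDT_COLUMNS = [
    "Qry", "Scaffold", "Description", "QryLen", "QryStart", "QryEnd",
    "Strand", "Ref", "RefLen", "RefStart", "RefEnd", "RefMap",
    "Identity", "Length", "Coverage", "N", "Rank", "Inv", "Trans",
]


def dummy_scaffold_rows(chrom_lengths):
    """Build one placeholder row per chromosome.

    chrom_lengths: dict mapping chromosome name -> length in bp.
    ASSUMPTION: each chromosome is treated as a single scaffold that
    maps onto itself end to end; all values below are placeholders.
    """
    rows = []
    for name, length in chrom_lengths.items():
        rows.append({
            "Qry": name,
            "Scaffold": name,
            "Description": f"{name} len={length} bp (dummy entry)",
            "QryLen": length,
            "QryStart": 1,
            "QryEnd": length,
            "Strand": "+",
            "Ref": name,
            "RefLen": length,
            "RefStart": 1,
            "RefEnd": length,
            "RefMap": "Anchored",   # value seen in the example; meaning assumed
            "Identity": length,     # placeholder
            "Length": length,       # placeholder
            "Coverage": float(length),  # placeholder
            "N": 0,                 # placeholder
            "Rank": 1,              # placeholder
            "Inv": 0.0,             # placeholder
            "Trans": 0.0,           # placeholder
        })
    return rows


def write_tdt(rows, path):
    """Write the rows as a tab-delimited file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=TDT_COLUMNS, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_tdt(dummy_scaffold_rows({"chr1": 51464084}), "dummy.scaffolds.tdt")` would produce a one-row file. Whether the plotting function accepts it would still need to be checked against the real package.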

tkchafin avatar Jul 26 '21 02:07 tkchafin

Thank you very much. It looks tricky, but I'll try.

mimulusm avatar Jul 26 '21 03:07 mimulusm

Hi,

If you have a fasta file for the reference genome and another fasta file for your query scaffolds, you can run them through minimap2 to generate a PAF output file, and then use the PAF file as input to PAFScaff. PAFScaff then generates the scaffolds.tdt file that you can use with the ideograms.
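
As a rough illustration of the first step, the sketch below builds and runs a minimap2 command from Python. It only covers the minimap2 part; the PAFScaff invocation is left out because its options are not given in this thread. The file names and function names are hypothetical, and the `asm5` preset is an assumption that suits closely related assemblies, so a different preset may fit better for more divergent genomes.

```python
import shutil
import subprocess


def minimap2_command(reference_fasta, query_fasta, paf_out, preset="asm5"):
    """Return the argument list for an assembly-to-reference alignment.

    minimap2 writes PAF by default; -o sends it to a file instead of stdout.
    ASSUMPTION: the preset is a guess and should be chosen to match the
    divergence between the two genomes.
    """
    return ["minimap2", "-x", preset, "-o", paf_out, reference_fasta, query_fasta]


def run_minimap2(reference_fasta, query_fasta, paf_out, preset="asm5"):
    """Run minimap2 if it is installed and return the PAF path."""
    if shutil.which("minimap2") is None:
        raise FileNotFoundError("minimap2 was not found on PATH")
    cmd = minimap2_command(reference_fasta, query_fasta, paf_out, preset)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return paf_out
```

The resulting PAF file is what would then be passed to PAFScaff.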

Is that what you had in mind, or am I perhaps misunderstanding what you were wanting to do?

-Bradley

btmartin721 avatar Jul 26 '21 05:07 btmartin721

Thank you for your response, Bradley. Maybe I didn't make myself clear. I mapped the reads for bgc to the draft sequence of the study species, which comes with a gff file and is almost at the chromosome level, and I would like to use this information instead of the reference genome of a related species. Perhaps I should also look around for other packages to draw an ideogram.

mimulusm avatar Jul 26 '21 06:07 mimulusm

Oh ok. Now I understand.

I'm going to mark this as a requested feature that we will try to get to in the future.

In the meantime, if you don't want to use the PAFScaff files to make the ideogram, you can try writing your own script that uses the RIdeogram package (https://cran.r-project.org/web/packages/RIdeogram/vignettes/RIdeogram.html), which is what we use under the hood to make the ideogram plots. In the script you'd just have to get your data into a format that RIdeogram accepts.
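
A minimal sketch of that data-preparation step, written in Python, might look like the following. It assumes RIdeogram takes a karyotype table with `Chr`, `Start` and `End` columns and an overlay table with `Chr`, `Start`, `End` and `Value` columns, as described in the vignette linked above; please check the vignette before relying on this. The function names, the `(chrom, position, value)` input layout and the window-averaging approach are all illustrative choices and not how ClineHelpR does it.

```python
import csv
from collections import defaultdict


def karyotype_rows(chrom_lengths):
    """One row per chromosome: Chr, Start, End (column names assumed)."""
    return [{"Chr": c, "Start": 0, "End": n} for c, n in chrom_lengths.items()]


def windowed_means(loci, window, chrom_lengths):
    """Average per-locus values in fixed-size windows.

    loci: iterable of (chrom, position, value) with 1-based positions,
    e.g. a per-SNP statistic taken from bgc output (hypothetical input).
    """
    sums = defaultdict(lambda: [0.0, 0])
    for chrom, pos, value in loci:
        acc = sums[(chrom, (pos - 1) // window)]
        acc[0] += value
        acc[1] += 1
    rows = []
    for (chrom, idx), (total, count) in sorted(sums.items()):
        rows.append({
            "Chr": chrom,
            "Start": idx * window + 1,
            # Clamp the last window to the chromosome length.
            "End": min((idx + 1) * window, chrom_lengths[chrom]),
            "Value": total / count,
        })
    return rows


def write_table(rows, path):
    """Write a list of dicts as a tab-delimited file with a header."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=list(rows[0].keys()),
            delimiter="\t", lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)
```

The two tables could then be written with `write_table` and read into R as data frames for plotting.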

-Bradley

btmartin721 avatar Jul 26 '21 17:07 btmartin721

Thank you very much, Bradley. I was thinking about RIdeogram. ClineHelpR helped me a lot. Thank you again. -Makiko

mimulusm avatar Jul 27 '21 03:07 mimulusm