
update sourmash sig export to export to CSV

Open ctb opened this issue 4 years ago • 3 comments

https://github.com/dib-lab/2020-ibd/blob/master/scripts/sig_to_csv.py

perhaps do this for import too?

ctb avatar Jul 13 '20 15:07 ctb

It would be really great if a script like this could be integrated into sourmash sig, with command-line parameters to include or omit abundances. It would also be handy to figure out how to deal with multiple k-mer sizes: should ksize become a column where this information is recorded (probably best practice), or should ksize be selected with a command-line parameter, à la compare?
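To make the "ksize as a column" option concrete, here is a minimal sketch of a long-format export, with one row per (name, ksize, hash) and an optional abundance column. It does not use the sourmash API or the linked script: the `sketches_to_csv` function, its `include_abundance` switch, and the plain-dict records standing in for signatures are all hypothetical, chosen only to illustrate the layout.

```python
import csv
import io


def sketches_to_csv(sketches, out, include_abundance=True):
    """Write one CSV row per (name, ksize, hash); return the number of data rows.

    `sketches` is a list of plain dicts (a hypothetical stand-in for
    signatures), each with "name", "ksize", and "hashes" (hash -> abundance).
    """
    header = ["name", "ksize", "hash"]
    if include_abundance:
        header.append("abundance")

    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)

    n_rows = 0
    for sk in sketches:
        # Sort hashes so the output is deterministic.
        for hashval in sorted(sk["hashes"]):
            row = [sk["name"], sk["ksize"], hashval]
            if include_abundance:
                row.append(sk["hashes"][hashval])
            writer.writerow(row)
            n_rows += 1
    return n_rows


# Made-up example data: the same sample sketched at two k-mer sizes.
example = [
    {"name": "sampleA", "ksize": 21, "hashes": {11: 3, 42: 1}},
    {"name": "sampleA", "ksize": 31, "hashes": {7: 2}},
]

buf = io.StringIO()
n = sketches_to_csv(example, buf, include_abundance=True)
csv_text = buf.getvalue()
```

Because ksize is recorded per row, sketches at several k-mer sizes can share one file, and selecting a single ksize becomes a filter on the reader's side (for example in R) rather than a required export option.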

taylorreiter avatar Oct 05 '22 15:10 taylorreiter

I use CSVs all the time in R to do things like make upset plots, run random forests, and generate rarefaction curves.

upset plots: https://github.com/sourmash-bio/sourmash/issues/1234#issuecomment-1055709316
upset plots: https://github.com/Arcadia-Science/2022-prjna853785-sourmash/blob/main/notebooks/20220812-visualize-sourmash-signature-intersect.ipynb
rarefaction curves: https://github.com/Arcadia-Science/2022-mtx-not-in-mgx-pairs/blob/ter/specaccum/scripts/calc_rarecurves.R
rarefaction curve viz: https://github.com/Arcadia-Science/2022-mtx-not-in-mgx-pairs/blob/ter/specaccum/notebooks/20220929-visualize-most-converged-rarefaction-curves.ipynb

taylorreiter avatar Oct 05 '22 15:10 taylorreiter

thought: probably want to upgrade import as well, so that it can take in the same format as export!
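As a rough illustration of what a matching import could look like, the sketch below reads a long-format CSV back into per-(name, ksize) records. It is not the sourmash API: the column names (`name`, `ksize`, `hash`, `abundance`), the `csv_to_sketches` function, and the choice to default a missing abundance to 1 are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def csv_to_sketches(handle):
    """Group long-format CSV rows into one record per (name, ksize).

    Assumes hypothetical columns "name", "ksize", "hash", and an optional
    "abundance" column; a missing or empty abundance is treated as 1.
    """
    grouped = defaultdict(dict)
    for row in csv.DictReader(handle):
        key = (row["name"], int(row["ksize"]))
        abund = int(row.get("abundance") or 1)
        grouped[key][int(row["hash"])] = abund

    # One plain-dict record per (name, ksize), in first-seen order.
    return [
        {"name": name, "ksize": ksize, "hashes": hashes}
        for (name, ksize), hashes in grouped.items()
    ]


# Made-up input in the assumed layout.
csv_text = (
    "name,ksize,hash,abundance\n"
    "sampleA,21,11,3\n"
    "sampleA,21,42,1\n"
    "sampleA,31,7,2\n"
)
sketches = csv_to_sketches(io.StringIO(csv_text))
```

Keeping the import keyed on (name, ksize) means a file that mixes k-mer sizes comes back as separate records, so nothing is lost on a round trip through CSV.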

ctb avatar Oct 16 '22 21:10 ctb