sourmash
update sourmash sig export to export to CSV
https://github.com/dib-lab/2020-ibd/blob/master/scripts/sig_to_csv.py
perhaps do this for import too?
It would be really great if a script like this could be integrated into sourmash sig, with command line parameters to include or remove abundances. It would also be handy to figure out how to deal with multiple k-mer sizes: does ksize become a column where this information is recorded (probably best practice), or does ksize need to be selected with a command line param, a la compare?
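To make the "ksize as a column" option concrete, here is a minimal sketch of a long-format export, with a switch for including or dropping abundances. It deliberately works on plain Python dicts rather than sourmash objects: the `name`, `ksize`, `hashes`, and `abundances` keys, the function names, and the column layout are all assumptions made for illustration, not the sourmash API and not what sig_to_csv.py does.

```python
import csv
import io


def sketch_rows(sketches, with_abundance=True):
    """Yield one long-format row per (name, ksize, hash).

    `sketches` is a list of plain dicts (hypothetical layout):
    {"name": str, "ksize": int, "hashes": [int], "abundances": [int] or None}
    """
    for sk in sketches:
        abunds = sk.get("abundances")
        for i, h in enumerate(sk["hashes"]):
            row = {"name": sk["name"], "ksize": sk["ksize"], "hash": h}
            if with_abundance:
                # Assumption: a sketch without abundances counts each hash once.
                row["abundance"] = abunds[i] if abunds else 1
            yield row


def write_csv(sketches, out, with_abundance=True):
    """Write all sketches to `out`, recording ksize as a column."""
    fields = ["name", "ksize", "hash"]
    if with_abundance:
        fields.append("abundance")
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    writer.writerows(sketch_rows(sketches, with_abundance))


# Made-up demo data: one sample sketched at two k-mer sizes.
demo = [
    {"name": "sampleA", "ksize": 21, "hashes": [11, 42, 97], "abundances": [3, 1, 5]},
    {"name": "sampleA", "ksize": 31, "hashes": [7, 8], "abundances": None},
]

buf = io.StringIO()
write_csv(demo, buf)
csv_text = buf.getvalue()
```

With ksize stored per row, a single file can hold every k-mer size, and filtering to one ksize is a one-line subset in R or pandas. The cost is a larger file, since name and ksize repeat on every row.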
I use CSVs all the time in R to do things like make upset plots, run random forests, and generate rarefaction curves:
upset plots: https://github.com/sourmash-bio/sourmash/issues/1234#issuecomment-1055709316
upset plots: https://github.com/Arcadia-Science/2022-prjna853785-sourmash/blob/main/notebooks/20220812-visualize-sourmash-signature-intersect.ipynb
rarefaction curves: https://github.com/Arcadia-Science/2022-mtx-not-in-mgx-pairs/blob/ter/specaccum/scripts/calc_rarecurves.R
rarefaction curve viz: https://github.com/Arcadia-Science/2022-mtx-not-in-mgx-pairs/blob/ter/specaccum/notebooks/20220929-visualize-most-converged-rarefaction-curves.ipynb
thought: probably want to upgrade import as well, so that it can take in the same format as export!
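A rough sketch of what the import side might look like, assuming the export writes one row per hash with columns `name`, `ksize`, `hash`, and an optional `abundance`. That layout and the function name `read_csv_sketches` are hypothetical, chosen only for illustration. The code uses only the standard library and stops short of building sourmash signature objects.

```python
import csv
import io
from collections import defaultdict


def read_csv_sketches(handle):
    """Group long-format rows back into {(name, ksize): {hash: abundance}}.

    Assumption: a missing or empty abundance column means abundance 1.
    """
    grouped = defaultdict(dict)
    for row in csv.DictReader(handle):
        key = (row["name"], int(row["ksize"]))
        abund = row.get("abundance")
        grouped[key][int(row["hash"])] = int(abund) if abund else 1
    return dict(grouped)


# Made-up example input in the assumed layout.
example = (
    "name,ksize,hash,abundance\n"
    "sampleA,21,11,3\n"
    "sampleA,21,42,1\n"
    "sampleA,31,7,2\n"
)
sketches = read_csv_sketches(io.StringIO(example))
```

Grouping on (name, ksize) means a file that mixes k-mer sizes comes back as separate sketches, so import would not need a ksize selection flag.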