kinfin
plot_cluster_size and generate_network .py errors
Hi. I tried both plotting scripts and both gave errors:
./plot_cluster_size_distribution.py -i ../out1.kinfin_results/cluster_counts_by_taxon.txt -o clustersizedist_out1 -c tab20
[+] Start ...
[+] Plotting "loglogpowerlaw" ...
Traceback (most recent call last):
File "./plot_cluster_size_distribution.py", line 238, in
./generate_network.py -m ../out1.kinfin_results/all/all.all.cluster_metrics.txt -c ~/WorkKinfin/Kinfin1Interproscan/configkinfin.csv -o outdom1
[+] Parsing SpeciesClassification file: /home/antonella/WorkKinfin/Kinfin1Interproscan/configkinfin.csv ...
[+] Parsing ../out1.kinfin_results/all/all.all.cluster_metrics.txt ...
[-] No column header ending in '_count' found in #cluster_id,cluster_status,cluster_type,cluster_protein_count,cluster_proteome_count,TAXON_protein_count,TAXON_mean_count,non_taxon_mean_count,representation,log2_mean(TAXON/others),pvalue(TAXON vs. others),TAXON_coverage,TAXON_count,non_TAXON_count,TAXON_taxa,non_TAXON_taxa. Please use TAXON.cluster_summary.txt
./generate_network.py -m ../out1.kinfin_results/all/all.cluster_summary.txt -c ~/WorkKinfin/Kinfin1Interproscan/configkinfin.csv -o outdom1
[+] Parsing SpeciesClassification file: /home/antonella/WorkKinfin/Kinfin1Interproscan/configkinfin.csv ...
[+] Parsing ../out1.kinfin_results/all/all.cluster_summary.txt ...
[-] No column header ending in '_count' found in #cluster_id,cluster_protein_count,protein_median_count,TAXON_count,attribute,attribute_cluster_type,protein_span_mean,protein_span_sd,all_count,all_median,all_cov. Please use TAXON.cluster_summary.txt
I can see several headings ending in "_count" in these files.
The only file that worked was TAXON.cluster_summary.txt (TAXON.*.cluster_metrics.txt did not):
./generate_network.py -m ../out2.kinfin_results/TAXON/TAXON.cluster_summary.txt -c ~/WorkKinfin/Kinfin1Interproscan/configkinfin2.csv -o out2TAXON
[+] Parsing SpeciesClassification file: /home/antonella/WorkKinfin/Kinfin1Interproscan/configkinfin2.csv ...
[+] Parsing ../out2.kinfin_results/TAXON/TAXON.cluster_summary.txt ...
[+] Max edge weight is 10280, ...
[+] Building graphs
Name: Graph
Type: Graph
Number of nodes: 30
Number of edges: 435
Average degree: 29.0000
[+] Saving network out2TAXON.graph.graphml
[+] Saving network out2TAXON.graph.gexf
If you post ../out1.kinfin_results/cluster_counts_by_taxon.txt, I can take a look.
Re ./generate_network.py: from the headers it looks like your files are CSVs instead of TSVs. Could that be?
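One way to check the CSV-vs-TSV question locally is to look at the raw header line of the file. The snippet below is only a sketch, not part of kinfin: `inspect_header` is a hypothetical helper that guesses the delimiter by counting tabs and commas, then lists the columns ending in `_count`. The thread does not show the raw bytes of the file, so the example header is re-joined with tabs as an assumption.

```python
import csv
import io


def inspect_header(header_line):
    """Guess the delimiter of a header line and list its '_count' columns.

    Hypothetical helper for debugging; not part of kinfin.
    """
    header_line = header_line.rstrip("\r\n")
    # Pick whichever candidate delimiter occurs most often in the header.
    delimiter = max(["\t", ","], key=header_line.count)
    reader = csv.reader(io.StringIO(header_line), delimiter=delimiter)
    columns = [column.strip() for column in next(reader)]
    count_columns = [c for c in columns if c.endswith("_count")]
    return delimiter, columns, count_columns


if __name__ == "__main__":
    # Column names copied from the error message in this thread.
    # Joining them with tabs is an assumption about the file on disk.
    header = "\t".join([
        "#cluster_id", "cluster_protein_count", "protein_median_count",
        "TAXON_count", "attribute", "attribute_cluster_type",
        "protein_span_mean", "protein_span_sd",
        "all_count", "all_median", "all_cov",
    ])
    delimiter, columns, count_columns = inspect_header(header)
    print("delimiter:", repr(delimiter))
    print("columns ending in '_count':", count_columns)
```

To use it on a real file, pass the first line of the file to `inspect_header`. If the reported delimiter is a tab and `_count` columns are listed, the file format itself is probably not the cause of the error.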
Below is the head of cluster counts by taxon (also attached). And no, for the generate network it's not the format; I tried tsv and txt and the same error appeared.
#ID	ACYPI	ANOGA	APOLU	BELHE	BEMTA	CHRCA	CIMLE	CTEFE	CULBR	EPIBA	GERLA	GLOFU	GONAC	HARAX	ISCEL	LAOST	MACQU	MEPSP	NEZVI	PANGE	PLABI	PSAAR	RHOBR	RHOEC	RHOPR	RPCDC	STOCA	TRIIN	TRIRU	VESCR
OG0000000	208	53	14	288	71	59	16	130	2	145	205	16	30	122	104	1	226	211	1	24	152	1	1	4	1	4	210	270	3	0
OG0000001	57	67	16	130	34	16	4	6	0	42	239	1	32	41	88	2	286	54	4	12	98	0	1	1	3	3	69	77	3	1
OG0000002	7	0	9	60	2	0	1	0	0	0	97	0	160	2	0	7	11	62	15	207	54	80	147	108	111	5	1	62	5	4
OG0000003	36	55	17	18	47	80	43	39	25	41	25	33	25	54	37	21	68	16	22	22	15	22	18	20	20	18	27	20	15	103
OG0000004	2	0	1	22	0	0	0	1	0	0	5	0	11	0	0	0	0	20	8	752	28	29	5	2	29	0	0	32	4	0
OG0000005	22	11	8	141	5	1	1	35	0	1	93	4	13	11	0	6	24	70	0	79	80	20	20	32	18	16	2	139	4	42
OG0000006	109	9	0	151	10	17	0	100	0	8	47	0	7	24	2	4	70	130	2	1	32	0	0	0	0	5	9	145	0	0
OG0000007	13	2	51	57	17	7	18	30	8	4	77	4	9	4	4	51	43	56	18	38	60	31	33	41	49	27	3	56	16	4
OG0000008	58	6	1	123	1	2	2	6	2	12	74	0	24	23	5	3	7	138	2	44	60	5	9	9	6	1	4	193	3	6
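For anyone who wants to sanity-check a cluster_counts_by_taxon.txt file independently of the plotting script, here is a rough sketch of how a cluster size distribution could be tallied from it. It assumes that "cluster size" means the sum of a row's per-taxon counts, which may not match what plot_cluster_size_distribution.py actually computes. `cluster_size_distribution` is a hypothetical name, and the two sample rows are copied from the table posted in this thread.

```python
from collections import Counter


def cluster_size_distribution(lines):
    """Return {cluster_size: number_of_clusters} from count-table lines.

    Assumes cluster size is the sum of the per-taxon counts in a row.
    """
    distribution = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '#ID ...' header
        fields = line.split()  # splits on tabs or runs of spaces
        counts = [int(value) for value in fields[1:]]
        distribution[sum(counts)] += 1
    return dict(distribution)


if __name__ == "__main__":
    # Header and two rows copied verbatim from the table in this thread.
    sample = [
        "#ID ACYPI ANOGA APOLU BELHE BEMTA CHRCA CIMLE CTEFE CULBR EPIBA "
        "GERLA GLOFU GONAC HARAX ISCEL LAOST MACQU MEPSP NEZVI PANGE "
        "PLABI PSAAR RHOBR RHOEC RHOPR RPCDC STOCA TRIIN TRIRU VESCR",
        "OG0000000 208 53 14 288 71 59 16 130 2 145 205 16 30 122 104 1 "
        "226 211 1 24 152 1 1 4 1 4 210 270 3 0",
        "OG0000004 2 0 1 22 0 0 0 1 0 0 5 0 11 0 0 0 0 20 8 752 28 29 5 "
        "2 29 0 0 32 4 0",
    ]
    for size, n_clusters in sorted(cluster_size_distribution(sample).items()):
        print(size, n_clusters, sep="\t")
```

An open file handle can be passed in place of the list, since the function only iterates over lines. If this runs cleanly on the full file, the input table is probably well-formed and the traceback is more likely to come from the plotting code.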
On Tue, 9 Apr 2024 at 17:37, Dominik R Laetsch (@.***) wrote:
if you post ../out1.kinfin_results/cluster_counts_by_taxon.txt i can take a look...
re ./generate_network.py ... from the headers it looks like your files are CSVs instead of TSVs ... could that be?
I made some changes which might have fixed it, but I can't tell with only 10 lines. Let me know if the issue persists; with more lines I can do more.
The errors come from matplotlib and other libraries having changed over the years. If nobody runs the code, nobody notices that things break over time ...
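Since breakage like this depends on which library versions are installed, it can help to include them when reporting. A minimal sketch using only the standard library is shown here. `installed_versions` is a hypothetical helper, and the package list is an assumption: matplotlib is named in this thread, while networkx is a guess based on the graph output of generate_network.py.

```python
import platform
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names are assumptions; adjust to what the scripts import.
    print("python", platform.python_version())
    for name, version in installed_versions(["matplotlib", "networkx"]).items():
        print(name, version or "not installed")
```

Pasting this output next to the full traceback makes it easier to tell whether an error comes from a changed library API or from the input files.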
For the future, could you please check out this guide on how to format markdown text, so that your issues are easier to read?
cheers,
dom