type: 'cat_values' seems excluded from treatment at calc_cat_cluster_breakdown

Open nrgiroux opened this issue 6 years ago • 5 comments

Hi, I have this situation where my data contains numerical categories, giving this error:

Uncaught TypeError: Cannot read property 'length' of undefined
    at make_cat_breakdown_graph.js:132
    at arrayEach (lodash.js:532)
    at Function.forEach (lodash.js:9357)
    at make_cat_breakdown_graph (make_cat_breakdown_graph.js:108)
    at still_hovering (make_dendro_triangles.js:54)

JSON.stringify(inst_cat_info,null,'\t')

{
	"cat-0": {
		"type": "cat_strings",
		"max_abs_val": null,
		"cat_scale": null,
		"cat_hist": {
			"None": 8,
			"High": 2,
			"Low": 2
		}
	},
	"cat-1": {
		**"type": "cat_values",**
		"max_abs_val": 90,
		"cat_hist": null
	},
	"cat-2": {
		"type": "cat_strings",
		"max_abs_val": null,
		"cat_scale": null,
		"cat_hist": {
			"Female": 9,
			"Male": 3
		}
	},
	"cat-3": {
		"type": "cat_strings",
		"max_abs_val": null,
		"cat_scale": null,
		"cat_hist": {
			"White": 12
		}
	}
}

JSON.stringify(cat_breakdown,null,'\t')

[
	{
		"type_name": "drug_efficacy",
		"num_in_clust": 1,
		"bar_data": [
			[
				"cat-0",
				"drug_efficacy: None",
				{
					"num_nodes": 1
				},
				"#dbdb8d",
				1,
				null,
				0.6666666666666666
			]
		]
	},
	{
		"type_name": "race",
		"num_in_clust": 1,
		"bar_data": [
			[
				"cat-2",
				"race: Female",
				{
					"num_nodes": 1
				},
				null,
				1,
				null,
				0.75
			]
		]
	},
	{
		"num_in_clust": 1,
		"bar_data": [
			[
				"cat-3",
				**"undefined: White",**
				{
					"num_nodes": 1
				},
				null,
				1,
				null,
				1
			]
		]
	}
]

I think it is related to another symptom I am seeing on the demo page with similar data: http://amp.pharm.mssm.edu/clustergrammer/viz_sim_mats/5a81ec1b3a82d369f2be253a/cat_matrix_test.txt

Note how 'age' is skipped in the Cluster information categories and the titles are shifted: [image]

I am working on this bug on my forked version, but I'd rather have your advice.

Thanks, Richard

nrgiroux • Feb 12 '18 21:02

Hi, we've run into bugs when we have value-based categories before string-based categories. So you could first try putting age as the last category. If that doesn't work I'll have to look more closely.

cornhundred • Feb 12 '18 21:02

Hi, thanks for the great support! I tried putting the numerical category last. As you suggested, it no longer crashes, but the category is not processed, even though it is within the 'max_cats' limit. So there is no 'Cluster information' for numerical categories.

To me it's like there's a missing

else if (params.viz.cat_info[inst_rc][cat_index].type === 'cat_values')

in calc_cat_cluster_breakdown.js (just my two cents). It depends on what you intended to do with 'cat_values' type.
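
As a rough illustration of what such a branch might compute, here is a minimal sketch. The helper name `summarize_cat_values` and the shape of its input are hypothetical and are not taken from calc_cat_cluster_breakdown.js; it only assumes that the numeric category values of the nodes in a cluster are available as an array.

```javascript
// Hypothetical helper (not part of Clustergrammer): summarize a value-based
// category for one cluster. `cat_type` mirrors the "type" field shown in
// inst_cat_info above; `values` is assumed to hold the numeric category
// values of the nodes in the cluster.
function summarize_cat_values(cat_type, values) {
  if (cat_type !== 'cat_values') {
    return null; // string-based categories would be handled by other branches
  }
  // keep only real numbers so missing values do not break the summary
  const nums = values.filter((v) => typeof v === 'number' && !Number.isNaN(v));
  if (nums.length === 0) {
    return { num_nodes: 0, min: null, max: null, mean: null };
  }
  const sum = nums.reduce((acc, v) => acc + v, 0);
  return {
    num_nodes: nums.length,
    min: Math.min(...nums),
    max: Math.max(...nums),
    mean: sum / nums.length,
  };
}
```

Returning null for other types keeps the sketch from interfering with the existing 'cat_strings' handling.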

nrgiroux • Feb 13 '18 02:02

Yes, Clustergrammer does not currently show cluster information for value-based categories. My thinking is that doing this would basically amount to showing a histogram of the values of the value-based categories in a cluster, which is something that we would be interested in implementing eventually and you can definitely try and implement that if you want.

One workaround that I've used for this situation is to turn the value-based categories into normal string-based categories by adding a short prefix, e.g. prepend 'age-' to get 'age-10', and then manually set the color of the categories using set_cat_color (e.g. manually choose different grey hex codes to assign to different values). Then you will be able to get the cluster information bar plots (but you may lose the ability to correctly reorder your categories). Let us know if that helps.
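
The workaround above can be sketched roughly as follows. The helper `values_to_string_cats` is hypothetical and not part of Clustergrammer; it only builds the prefixed labels and a grey hex code per distinct value, and leaves the actual set_cat_color calls out because their exact arguments are not shown in this thread.

```javascript
// Hypothetical helper (not part of Clustergrammer): turn numeric category
// values into string labels and pick a grey shade for each distinct value.
function values_to_string_cats(prefix, values) {
  // distinct values in ascending numeric order
  const distinct = Array.from(new Set(values)).sort((a, b) => a - b);
  const colors = {};
  distinct.forEach((value, i) => {
    // spread shades from light grey (0xdd) down to dark grey (0x33)
    const frac = distinct.length > 1 ? i / (distinct.length - 1) : 0;
    const level = Math.round(0xdd - frac * (0xdd - 0x33));
    const hex = level.toString(16).padStart(2, '0');
    colors[prefix + value] = '#' + hex + hex + hex;
  });
  // one label per input value, in the original order
  const labels = values.map((value) => prefix + value);
  return { labels, colors };
}
```

Each entry of `colors` could then be assigned to its category name one at a time with set_cat_color.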

cornhundred • Feb 13 '18 14:02

Yeah, for the value-based histograms I'll try and have a look at it. I was thinking about grouping ranges but the values may not be evenly distributed so it would skew the representation.
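
One possible way around that skew, sketched here only as an assumption and not as something Clustergrammer implements, is to group by rank (equal-count bins) instead of by fixed-width ranges. The helper name `quantile_bins` is hypothetical.

```javascript
// Hypothetical sketch: equal-count (quantile-style) binning, so that skewed
// values do not pile up in a single bar the way equal-width ranges can.
function quantile_bins(values, num_bins) {
  const sorted = values.slice().sort((a, b) => a - b);
  const bins = [];
  for (let i = 0; i < num_bins; i++) {
    // slice boundaries split the sorted values into near-equal chunks
    const start = Math.floor((i * sorted.length) / num_bins);
    const end = Math.floor(((i + 1) * sorted.length) / num_bins);
    const members = sorted.slice(start, end);
    if (members.length > 0) {
      bins.push({
        min: members[0],
        max: members[members.length - 1],
        count: members.length,
      });
    }
  }
  return bins;
}
```

The trade-off is that bin widths become uneven, and repeated values can end up split across neighbouring bins, so the range of each bar would need to be shown in its label.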

nrgiroux • Feb 14 '18 02:02

Hi @nrgiroux, value-based category histograms will be implemented in Clustergrammer-GL: https://github.com/ismms-himc/clustergrammer-gl/issues/new?assignees=&labels=&template=feature_request.md&title=

cornhundred • Jul 17 '19 05:07