Allowing an extra dimension based on the style (markers) in FacetGrid or catplot.

Open Mickael01 opened this issue 2 years ago • 1 comments

Hi. First of all, I have been using seaborn almost every day for two years, and it has improved greatly over that time. Thank you for your work!

Working directly with matplotlib, I managed to produce a 4x4 grid of subplots from 5 categorical columns (x, hue, marker style, col and row) and one numerical column (the quantity of interest). To do so, I had to add two extra columns (x_pos and x_tickpos) to dodge the points manually (a jitter would also work).

Here is the resulting plot: (image: Si_new)

However, I am not able to do this with FacetGrid (mapping 2 stripplots with different markers), catplot or relplot, since I can't provide an extra dimension via the marker style. For now, FacetGrid and catplot accept the dimensions x, y, row, col and hue, i.e. 5 dimensions (dataframe columns).

Here is an example of my data:

Elements  Models  Thermos    Parameters  Si     Values    x_pos    x_tickpos
Ni        1NN     $E_{coh}$  $\xi$       $S_T$  0.873224  -0.3125  0
Ni        1NN     $E_{coh}$  $\xi$       $S_1$  0.873205  -0.3125  0
Ni        1NN     $E_{coh}$  A           $S_T$  0.126821   0.6875  1
Ni        1NN     $E_{coh}$  A           $S_1$  0.126817   0.6875  1
Ni        1NN     $E_{coh}$  q           $S_T$  0.000000   1.6875  2
...       ...     ...        ...         ...    ...        ...     ...
Au        3NNrac  $C_{44}$   A           $S_1$  0.458694   1.3125  1
Au        3NNrac  $C_{44}$   q           $S_T$  0.336715   2.3125  2
Au        3NNrac  $C_{44}$   q           $S_1$  0.310363   2.3125  2
Au        3NNrac  $C_{44}$   p           $S_T$  0.054921   3.3125  3
Au        3NNrac  $C_{44}$   p           $S_1$  0.049252   3.3125  3
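
For readers who want to reproduce the manual dodge, here is a minimal sketch of how columns like x_pos and x_tickpos could be derived with pandas. It is not the author's code: the helper name `add_dodge_columns`, the `width` value and the toy data are assumptions made for illustration. The width of 0.625 is chosen only because, with two hue levels, it gives the ±0.3125 offsets visible in the table above.

```python
import numpy as np
import pandas as pd


def add_dodge_columns(df, x="Parameters", hue="Elements", width=0.625):
    """Return a copy of df with x_tickpos (integer tick) and x_pos (dodged x).

    Hypothetical helper: the name and the default width are illustrative only.
    """
    out = df.copy()
    # Integer tick position for each x category, in order of first appearance.
    x_levels = pd.unique(out[x])
    out["x_tickpos"] = out[x].map({lvl: i for i, lvl in enumerate(x_levels)})
    # One evenly spaced offset per hue level, centered on the tick.
    hue_levels = pd.unique(out[hue])
    n = len(hue_levels)
    offsets = np.linspace(-width / 2, width / 2, n) if n > 1 else np.zeros(1)
    out["x_pos"] = out["x_tickpos"] + out[hue].map(dict(zip(hue_levels, offsets)))
    return out


# Toy data (made-up values), only to show the shape of the result.
toy = pd.DataFrame({
    "Elements": ["Ni", "Ni", "Au", "Au"],
    "Parameters": ["A", "q", "A", "q"],
    "Values": [0.13, 0.00, 0.46, 0.34],
})
dodged = add_dodge_columns(toy)
print(dodged[["Elements", "Parameters", "x_tickpos", "x_pos"]])
```

The x_pos column can then be passed to a plain matplotlib scatter call, while x_tickpos and the original category labels are used for the ticks.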

Basically, I tried catplot, but the function does not accept marker style as a dimension, so I can't assign different markers based on the "Si" column.

g = sns.catplot(data=loc_df,
                x="Parameters",
                y="Values",
                hue="Elements",
                row="Models",
                col="Thermos",
                s=20, marker="o", dodge=True, height=8, aspect=0.7)

With the resulting plot: (image: catplot)

I also tried relplot, but it has no jitter or dodge option, so the data points overlap. (Note that this is also a problem with plain matplotlib; see my solution below.)

g = sns.relplot(data=loc_df,
                x="Parameters",
                y="Values",
                hue="Elements",
                row="Models",
                style="Si",
                col="Thermos", s=200, aspect=1.2)

(image: relplot)

Do you think that controlling the marker style, in the same way as hue or size, would be a good enhancement to catplot? Alternatively, could dodge or jitter options be added to relplot?

Sincerely, Mickaël.

PS: Here is the matplotlib code I wrote to create my plot, in case it helps.

import matplotlib.pyplot as plt

# loc_df, elts, models, thermos and my_palette_elts come from earlier code (not shown).
fig, axs = plt.subplots(4, 4, figsize=(16, 20), sharex=True, sharey=True)
plt.subplots_adjust(right=0.9)
marker_si = ["o", "X"]
Sis = ["$S_1$", "$S_T$"]
for names, grp in loc_df.groupby(by=["Elements", "Models", "Thermos", "Si"]):
    # Translate the group keys into color, grid position and marker indices.
    hue = list(elts).index(names[0])
    col = list(models).index(names[1])
    row = list(thermos).index(names[2])
    mark = list(Sis).index(names[3])
    axs[row, col].set_ylim(-0.05, 1.05)
    # Panel (0, 0) carries the "Elements" (color) legend entries.
    if row == 0 and col == 0 and mark == 0:
        if hue == 0:
            # Invisible artist that reserves a legend slot for the header.
            axs[0, 0].plot([0], marker="None", linestyle="None", label="dummytophead")
        label_leg1 = f"{names[0]}"
    else:
        label_leg1 = None
    # Panel (1, 0) carries the "Sensitivity" (marker) legend entries.
    if row == 1 and col == 0 and hue == 0:
        if mark == 0:
            axs[1, 0].plot([0], marker="None", linestyle="None", label="dummyempty")
        label_leg2 = f"{names[3]}"
    else:
        label_leg2 = None
    # Only rows 0 and 1 label their points; all other rows stay unlabeled.
    label = label_leg1 if row == 0 else label_leg2 if row == 1 else None
    axs[row, col].scatter(grp["x_pos"],
                          grp["Values"],
                          s=100,
                          color=my_palette_elts[names[0]],
                          marker=marker_si[mark], label=label, facecolor="none")
    axs[row, col].set_title(f"{names[1]} for {names[2]}")
# grp is the last group of the loop above; its ticks are reused for every panel.
for ax in axs.ravel():
    ax.set_xticks(grp["x_tickpos"])
    ax.set_xticklabels(grp["Parameters"])
    ax.grid(ls="--", color="k", axis="y", lw=2)
handles1, label1 = axs[0, 0].get_legend_handles_labels()
handles2, label2 = axs[1, 0].get_legend_handles_labels()
fig.legend(handles1 + handles2, ["Elements"] + elts + ["Sensitivity"] + Sis,
           loc="center right", bbox_to_anchor=(0.05, 0.0, 1.2, 1),
           bbox_transform=plt.gcf().transFigure, frameon=True)
# Set common axis labels
fig.text(0.5, 0.00, "SMA-TB Parameters", ha="center", va="center")
fig.text(0.00, 0.5, "Sobol' indices ", ha="center", va="center", rotation="vertical")
fig.tight_layout()

Mickael01 avatar Sep 13 '21 09:09 Mickael01

It's possible that something like this would get added in the future, but it is not something that could be added right now. Track #2429 for updates.

In general I think flexibility is good but also would advise against trying to pack too much information into a single plot. Trying to represent 5 or 6 variables can work, with the right data and approach, but often the result is a plot that conveys less information than it would if it were simpler. That is just some advice.

mwaskom avatar Sep 16 '21 11:09 mwaskom

FWIW, my use-case for wanting a style capability in catplot is not so much to pack a bunch of dimensions in a single plot, but to make it much easier to differentiate among even a small handful of categories.

Using hue and style on the same variable dramatically increases both the number of categories that can be distinguished, and also the ease of distinguishing even a relatively smaller number of categories without needing to resort to a "Digital Color Meter" type app...

zpincus avatar Mar 20 '23 16:03 zpincus
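
The hue-plus-style combination described in the comment above is already available in relplot (without dodge or jitter). The following is only a sketch on synthetic data; the column names `x`, `y` and `group` are placeholders, not taken from the thread.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: six categories, ten points each (placeholder column names).
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "x": rng.normal(size=60),
    "y": rng.normal(size=60),
    "group": np.repeat(list("ABCDEF"), 10),
})

# Mapping the same column to both hue and style gives each category
# its own color and its own marker.
g = sns.relplot(data=toy, x="x", y="y", hue="group", style="group", s=80)
```

Because both semantics encode the same variable, the legend shows a single entry per category that combines color and marker.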

I'm going to close as there are no specific plans to expand the variables that can be shown by catplot and this is a general enough design decision / limitation that it doesn't need an open issue.

Note that relplot can show categorical scatter plots (though not with jitter/swarm) using up to 3 additional dimensions (hue/size/style) and that the objects interface can map all manner of properties and can apply jitter to any mark.

mwaskom avatar Aug 27 '23 20:08 mwaskom