SAMap: how can sankey_plot show more than four species?

Hi Alec,

How can I display more than four species using a sankey_plot?

Thank you for your help!

Nov 04 '23 16:11 xmChen090

If you have a SAMAP object with four species, you should be able to just pass a list of the four species IDs to the sankey function. Alternatively, you could try making a chord plot. Could you give me more context about what you're trying to do?

Nov 04 '23 17:11 atarashansky

sankey_plot(MappingTable, align_thr=0.12, species_order=["gas", "tro", "dst", "kdc"])

I passed in a list of the four species IDs, but only "gas", "tro", and "dst" are displayed in the sankey plot. Actually, I want to make a sankey plot of seven species, but sm.run(pairwise=True) failed, probably because the computer ran out of memory. The run with four species succeeded, but sankey_plot does not display all of them. How can I modify the parameters of sankey_plot? It seems to show at most three species.

Nov 04 '23 17:11 xmChen090

Can you take a screenshot of MappingTable.head() and paste it here?
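Alongside the screenshot, a quick check like the one below may help narrow things down. It is only a sketch on a made-up toy table, not real SAMap output: `count_cross_species_pairs` is a hypothetical helper, and it assumes a symmetric table whose labels look like `<species>_<celltype>`, which is how sankey_plot splits them. A species with no pair at or above `align_thr` contributes no rows to the edge table, which would be one possible reason for it to be missing from the plot.

```python
import numpy as np
import pandas as pd


def count_cross_species_pairs(M, align_thr=0.1):
    """Hypothetical helper: per species pair, count cell type pairs scoring >= align_thr."""
    # Assumption: index labels look like '<species>_<celltype>' and M is symmetric.
    species = np.array([label.split('_')[0] for label in M.index])
    scores = M.to_numpy()
    ids = sorted(set(species.tolist()))
    counts = {}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            # Sub-matrix of species a (rows) against species b (columns).
            block = scores[np.ix_(species == a, species == b)]
            counts[(a, b)] = int((block >= align_thr).sum())
    return counts


# Toy table (made-up scores) standing in for MappingTable.
labels = ['gas_a', 'tro_a', 'dst_a', 'kdc_a']
toy = pd.DataFrame(0.0, index=labels, columns=labels)
for u, v, score in [('gas_a', 'tro_a', 0.5), ('tro_a', 'dst_a', 0.3), ('dst_a', 'kdc_a', 0.05)]:
    toy.loc[u, v] = toy.loc[v, u] = score

counts = count_cross_species_pairs(toy, align_thr=0.12)
print(counts)
```

In this toy table every pair involving "kdc" has a count of zero at align_thr=0.12, so "kdc" would have no edges to draw.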

Nov 05 '23 16:11 atarashansky

mapping_scores_example.csv

I've had this issue before. Here is a minimal set of mapping table scores to reproduce it; hope it helps.

Jan 17 '24 15:01 dnjst

I had the same issue

Jul 05 '24 16:07 DiracZhu1998

Hi, did you solve this?

Jul 07 '24 13:07 DiracZhu1998

@dnjst @atarashansky I modified the sankey_plot function and it works, but with more than three species the columns no longer each represent a single species: some cell types end up mixed into another species' column.
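A possible explanation, not confirmed in this thread, is that the Sankey layout places nodes according to the link structure rather than the species prefix, so links that skip over a species can pull a node into another column. Under that assumption, one thing to try is keeping only the links between species that are neighbours in `species_order`. The sketch below does this on an edge table with the same `source`, `target`, and `Value` columns as the table `R` built in the modified function; `keep_adjacent_species_edges` is a hypothetical helper and the scores are made up.

```python
import pandas as pd


def keep_adjacent_species_edges(R, species_order):
    """Hypothetical helper: keep rows linking species that are neighbours in species_order."""
    rank = {species: i for i, species in enumerate(species_order)}
    # Map each node label '<species>_<celltype>' to its species position.
    src = R['source'].str.split('_').str[0].map(rank)
    tgt = R['target'].str.split('_').str[0].map(rank)
    # Rows whose species is not in species_order give NaN and are dropped.
    return R[(src - tgt).abs() == 1].reset_index(drop=True)


# Toy edge table with made-up scores.
R_toy = pd.DataFrame({
    'source': ['gas_a', 'gas_a', 'tro_a'],
    'target': ['tro_a', 'dst_a', 'dst_a'],
    'Value': [0.5, 0.4, 0.3],
})
kept = keep_adjacent_species_edges(R_toy, ['gas', 'tro', 'dst'])
print(kept)
```

The trade-off is that mappings between non-neighbouring species (here gas_a to dst_a) are no longer drawn, so the result depends on the chosen species order.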

As for chord plots, with several species and many cell types the graph is hard to read if the nodes are grouped by species. It would be better to group them by mapping, that is, to place homologous cell types together.
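One way to get such a grouping, sketched here under assumptions, is to treat every mapping at or above the threshold as a link and give each connected set of cell types a shared group id, which can then be used to order the nodes. `group_homologous_cell_types` is a hypothetical helper, the toy scores are made up, and the threshold is assumed to be positive.

```python
import numpy as np
import pandas as pd


def group_homologous_cell_types(M, align_thr=0.1):
    """Hypothetical helper: label connected components of the thresholded mapping graph."""
    labels = list(M.index)
    adj = M.to_numpy() >= align_thr
    adj = adj | adj.T  # treat mappings as undirected links
    group = {}
    next_id = 0
    for start in range(len(labels)):
        if labels[start] in group:
            continue
        # Depth-first traversal from an unvisited cell type.
        group[labels[start]] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(adj[node]):
                if labels[nb] not in group:
                    group[labels[nb]] = next_id
                    stack.append(nb)
        next_id += 1
    return group


# Toy table with made-up scores.
labels = ['gas_neuron', 'gas_muscle', 'tro_neuron', 'tro_muscle', 'dst_neuron']
toy = pd.DataFrame(0.0, index=labels, columns=labels)
for u, v, score in [('gas_neuron', 'tro_neuron', 0.6),
                    ('tro_neuron', 'dst_neuron', 0.4),
                    ('gas_muscle', 'tro_muscle', 0.5)]:
    toy.loc[u, v] = toy.loc[v, u] = score

groups = group_homologous_cell_types(toy, align_thr=0.12)
# Order nodes by group first, so homologous cell types sit next to each other.
ordered = sorted(labels, key=lambda label: (groups[label], label))
print(ordered)
```

Connected components can chain weakly related cell types into one large group, so a stricter threshold may be needed on real data.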

Another way is to draw a heatmap.
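For the heatmap route, a minimal sketch of preparing the data is to cut the mapping table down to the cell types of one species against those of another. `species_block` is a hypothetical helper, the toy scores are made up, and the labels are again assumed to look like `<species>_<celltype>`.

```python
import pandas as pd


def species_block(M, row_species, col_species, align_thr=0.1):
    """Hypothetical helper: scores of one species (rows) against another (columns)."""
    rows = [label for label in M.index if label.split('_')[0] == row_species]
    cols = [label for label in M.columns if label.split('_')[0] == col_species]
    block = M.loc[rows, cols]
    # Zero out scores below the threshold, mirroring what sankey_plot does.
    return block.where(block >= align_thr, 0.0)


# Toy table with made-up scores.
labels = ['gas_a', 'gas_b', 'tro_a', 'tro_b']
toy = pd.DataFrame(0.0, index=labels, columns=labels)
for u, v, score in [('gas_a', 'tro_a', 0.5), ('gas_b', 'tro_b', 0.05)]:
    toy.loc[u, v] = toy.loc[v, u] = score

block = species_block(toy, 'gas', 'tro', align_thr=0.12)
print(block)
```

The resulting DataFrame can be handed to a heatmap routine of your choice, for example `seaborn.heatmap(block)`, one species pair at a time.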

import numpy as np
import pandas as pd
import holoviews as hv

hv.extension('bokeh', logo=False)
hv.output(size=100)


def sankey_plot2(M, species_order=None, align_thr=0.1, **params):
    """Generate a sankey plot

    Parameters
    ----------
    M: pandas.DataFrame
        Mapping table output from `get_mapping_scores` (second output).

    align_thr: float, optional, default 0.1
        The alignment score threshold below which to remove cell type mappings.

    species_order: list, optional, default None
        Specify the order of species (left-to-right) in the sankey plot.
        For example, `species_order=['hu','le','ms']`.

    Keyword arguments
    -----------------
    Keyword arguments will be passed to `sankey.opts`.
    """
    if species_order is not None:
        ids = np.array(species_order)
    else:
        ids = np.unique([x.split('_')[0] for x in M.index])

    d = M.values.copy()
    d[d < align_thr] = 0
    x, y = d.nonzero()
    x, y = np.unique(np.sort(np.vstack((x, y)).T, axis=1), axis=0).T
    values = d[x, y]
    nodes = M.index.to_numpy()

    node_pairs = nodes[np.vstack((x, y)).T]
    sn1 = np.array([xi.split('_')[0] for xi in node_pairs[:, 0]])
    sn2 = np.array([xi.split('_')[0] for xi in node_pairs[:, 1]])

    filt = np.zeros_like(sn1, dtype=bool)
    for i in range(len(ids) - 1):
        for j in range(i + 1, len(ids)):
            filt = np.logical_or(filt, np.logical_or(
                np.logical_and(sn1 == ids[i], sn2 == ids[j]),
                np.logical_and(sn1 == ids[j], sn2 == ids[i])
            ))

    x, y, values = x[filt], y[filt], values[filt]

    d = dict(zip(ids, list(np.arange(len(ids)))))
    depth_map = dict(zip(nodes, [d[xi.split('_')[0]] for xi in nodes]))
    data = nodes[np.vstack((x, y))].T
    for i in range(data.shape[0]):
        if d[data[i, 0].split('_')[0]] > d[data[i, 1].split('_')[0]]:
            data[i, :] = data[i, ::-1]
    R = pd.DataFrame(data=data, columns=['source', 'target'])
    R['Value'] = values

    # Adjust the order of nodes to ensure that they are placed in columns
    node_sort_key = {species: i for i, species in enumerate(ids)}
    R['source_order'] = R['source'].apply(lambda x: node_sort_key[x.split('_')[0]])
    R['target_order'] = R['target'].apply(lambda x: node_sort_key[x.split('_')[0]])
    R = R.sort_values(by=['source_order', 'target_order'])

    def f(plot, element):
        plot.handles['plot'].sizing_mode = 'scale_width'
        plot.handles['plot'].x_range.start = -600
        plot.handles['plot'].x_range.end = 1500

    sankey1 = hv.Sankey(R, kdims=["source", "target"], vdims="Value")

    cmap = params.get('cmap', 'Colorblind')
    label_position = params.get('label_position', 'right')
    edge_line_width = params.get('edge_line_width', 0)
    show_values = params.get('show_values', False)
    node_padding = params.get('node_padding', 4)
    node_alpha = params.get('node_alpha', 1)
    node_width = params.get('node_width', 30)
    node_sort = params.get('node_sort', True)
    frame_height = params.get('frame_height', 1000)
    frame_width = params.get('frame_width', 800)
    bgcolor = params.get('bgcolor', 'snow')
    apply_ranges = params.get('apply_ranges', True)

    sankey1.opts(cmap=cmap, label_position=label_position, edge_line_width=edge_line_width, show_values=show_values,
                 node_padding=node_padding, node_cmap=depth_map, node_alpha=node_alpha, node_width=node_width,
                 node_sort=node_sort, frame_height=frame_height, frame_width=frame_width, bgcolor=bgcolor,
                 apply_ranges=apply_ranges, hooks=[f])

    return sankey1

Jul 07 '24 19:07 DiracZhu1998