scanpy adjustText for `legend_loc="on data"` legend location

[x] Additional function parameters / changed functionality / changed defaults?
[ ] New analysis tool: A simple analysis tool you have been using and are missing in sc.tools?
[ ] New plotting function: A kind of plot you would like to see in sc.pl?
[ ] External tools: Do you know an existing package that should go into sc.external.*?
[ ] Other?

It would be really cool to have adjustText for automatic placement of text labels in sc.pl.embedding.

https://github.com/Phlya/adjustText

Has anybody ever looked into it?

Nov 27 '20 15:11 giovp

I've used it a bit and have gotten nice results. I think I've mentioned it before (#938), but that was on an unrelated issue, so it's good to have a dedicated one.

The results are nice:

Example usage

import matplotlib.pyplot as plt
import numpy as np
import scanpy as sc
from adjustText import adjust_text

def gen_mpl_labels(
    adata, groupby, exclude=(), ax=None, adjust_kwargs=None, text_kwargs=None
):
    if adjust_kwargs is None:
        adjust_kwargs = {"text_from_points": False}
    if text_kwargs is None:
        text_kwargs = {}

    medians = {}

    for g, g_idx in adata.obs.groupby(groupby).groups.items():
        if g in exclude:
            continue
        medians[g] = np.median(adata[g_idx].obsm["X_umap"], axis=0)

    if ax is None:
        texts = [
            plt.text(x=x, y=y, s=k, **text_kwargs) for k, (x, y) in medians.items()
        ]
    else:
        texts = [ax.text(x=x, y=y, s=k, **text_kwargs) for k, (x, y) in medians.items()]

    adjust_text(texts, **adjust_kwargs)

with plt.rc_context({"figure.figsize": (8, 8), "figure.dpi": 300, "figure.frameon": False}):
    ax = sc.pl.umap(pbmc, color="Low-level celltypes", show=False, legend_loc=None, frameon=False)
    gen_mpl_labels(
        pbmc,
        "Low-level celltypes",
        exclude=("None",),  # This was before we had the `nan` behaviour
        ax=ax,
        adjust_kwargs=dict(arrowprops=dict(arrowstyle='-', color='black')),
        text_kwargs=dict(fontsize=14),
    )
    fig = ax.get_figure()
    fig.tight_layout()
    plt.show()

I believe you're also supposed to be able to make the text repel from points, so they don't sit on top of your data, but I had some trouble getting that working at the time.

I'm a bit antsy about having this as a required dependency since maintenance doesn't seem too active. Could be an optional dependency, used with legend_loc="adjust_text"?
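A minimal sketch of how the optional-dependency idea could look, assuming nothing about scanpy's internals. The helper names `has_adjust_text` and `resolve_legend_loc` are hypothetical, and `"adjust_text"` is only the `legend_loc` value proposed above, not an option scanpy currently accepts.

```python
import importlib.util


def has_adjust_text() -> bool:
    """Return True if the optional adjustText package is importable."""
    return importlib.util.find_spec("adjustText") is not None


def resolve_legend_loc(legend_loc):
    """Pass legend_loc through, failing early if adjustText is needed but missing.

    "adjust_text" is the hypothetical value floated in this thread,
    not an existing scanpy option.
    """
    if legend_loc == "adjust_text" and not has_adjust_text():
        raise ImportError(
            'legend_loc="adjust_text" needs the optional adjustText package; '
            "install it with `pip install adjustText`."
        )
    return legend_loc
```

Checking with `find_spec` keeps the import out of module load time, so users who never ask for adjusted labels would not need the package installed.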

Nov 28 '20 07:11 ivirshup

I have been unable to get this to look good by default. It can be made to look good by playing around with the parameters, but then we're not really saving the user much effort.

A strategy that seemed to work okay was to repel the labels from the points, followed by a second repulsion from other labels. But then I had to redraw the lines manually.

Current thoughts are to punt this down the road. Maybe there will be a better solution in the future, or maybe there's a clever parameterization fix I hadn't thought of.

May 12 '21 08:05 ivirshup

I never get adjustText to work without numerous rounds of parameter optimization, so yeah, I agree.

May 12 '21 17:05 gokceneraslan

Would love to see this work in scanpy. Some thoughts on doing it automatically: can we pretend each cluster is one huge dot, taking its center as the average of the cluster's points and its size from the volume of the cluster? Then we could place the text so that it does not overlap that huge dot.
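One rough way to sketch this idea with NumPy alone, under several assumptions: 2D embedding coordinates, the 95th percentile of distances to the center as the "size", and a fixed ring of candidate positions around each disc. The functions `cluster_discs` and `place_labels` are hypothetical helpers, not part of scanpy or adjustText, and the sketch ignores the text's own extent and label-to-label overlap.

```python
import numpy as np


def cluster_discs(coords, labels):
    """Summarise each cluster as a disc: (center, radius).

    The center is the mean of the cluster's points; the radius is the 95th
    percentile of distances to that center (an arbitrary choice that keeps
    far-away outliers from inflating the disc).
    """
    discs = {}
    for group in np.unique(labels):
        pts = coords[labels == group]
        center = pts.mean(axis=0)
        radius = np.percentile(np.linalg.norm(pts - center, axis=1), 95)
        discs[group] = (center, radius)
    return discs


def place_labels(discs, margin=0.1, n_angles=16):
    """Pick, for each disc, the rim position with the most room to other discs."""
    angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
    directions = np.column_stack([np.cos(angles), np.sin(angles)])
    positions = {}
    for group, (center, radius) in discs.items():
        # Candidate anchor points just outside this cluster's own disc.
        candidates = center + radius * (1 + margin) * directions
        others = [(c, r) for g, (c, r) in discs.items() if g != group]
        if not others:
            positions[group] = candidates[0]
            continue
        # Clearance: distance from each candidate to the nearest other disc's rim.
        clearance = np.min(
            [np.linalg.norm(candidates - c, axis=1) - r for c, r in others], axis=0
        )
        positions[group] = candidates[np.argmax(clearance)]
    return positions
```

With `coords = adata.obsm["X_umap"]` and `labels = adata.obs[groupby].to_numpy()`, the returned positions could be passed to `ax.annotate` together with each disc's center to draw the connecting lines.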

Jun 11 '21 03:06 YubinXie

Maybe this library would help? https://github.com/TutteInstitute/datamapplot

It is pretty new, but it looks promising and is maintained by @lmcinnes, the author of UMAP.

Feb 21 '24 16:02 VladimirShitov

I also modified @ivirshup's code a bit so that each label is drawn in its group's color from the scanpy plot:

import matplotlib.pyplot as plt
import numpy as np
from adjustText import adjust_text

def gen_mpl_labels(
    adata, groupby, exclude=(), ax=None, adjust_kwargs=None, text_kwargs=None, color_by_group=False
):
    if adjust_kwargs is None:
        adjust_kwargs = {"text_from_points": False}
    if text_kwargs is None:
        text_kwargs = {}

    medians = {}

    for g, g_idx in adata.obs.groupby(groupby).groups.items():
        if g in exclude:
            continue
        medians[g] = np.median(adata[g_idx].obsm["X_umap"], axis=0)

    # Fill the text colors dictionary
    text_colors = {group: None for group in adata.obs[groupby].cat.categories}

    if color_by_group and groupby + "_colors" in adata.uns:
        for i, group in enumerate(adata.obs[groupby].cat.categories):
            if group in exclude:
                continue
            text_colors[group] = adata.uns[groupby + "_colors"][i]

    if ax is None:
        texts = [
            plt.text(x=x, y=y, s=k, color=text_colors[k], **text_kwargs) for k, (x, y) in medians.items()
        ]
    else:
        texts = [ax.text(x=x, y=y, s=k, color=text_colors[k], **text_kwargs) for k, (x, y) in medians.items()]

    adjust_text(texts, **adjust_kwargs)

It looks a bit more readable when several labels are close to each other.

Feb 21 '24 16:02 VladimirShitov

@VladimirShitov can you give an example of how you use the gen_mpl_labels function? I tried it and got somewhat different results. For example it lacked the lines pointing to the cluster centers. Thanks!

Mar 08 '24 14:03 GirayEryilmaz

@GirayEryilmaz, sure! Here's how I used it:

import matplotlib.patheffects as pe  # provides withStroke, used in text_kwargs below

with plt.rc_context({"figure.figsize": (8, 8), "figure.dpi": 150, "figure.frameon": False}):
    ax = sc.pl.umap(adata, color=cell_type_key, show=False, legend_loc=None, frameon=False)
    gen_mpl_labels(
        adata,
        cell_type_key,
        exclude=("None",),  # This was before we had the `nan` behaviour
        ax=ax,
        adjust_kwargs=dict(arrowprops=dict(arrowstyle='-', color='black')),
        text_kwargs=dict(fontsize=12, path_effects=[pe.withStroke(linewidth=1, foreground="darkgray")]),
        color_by_group=True
    )
    fig = ax.get_figure()
    fig.tight_layout()
    plt.show()

Mar 11 '24 11:03 VladimirShitov

> @VladimirShitov can you give an example of how you use the gen_mpl_labels function? I tried it and got somewhat different results. For example it lacked the lines pointing to the cluster centers. Thanks!

Have you solved this problem? I still can't show the lines pointing to the cluster centers.

Apr 10 '24 13:04 nnnanchen

> > @VladimirShitov can you give an example of how you use the gen_mpl_labels function? I tried it and got somewhat different results. For example it lacked the lines pointing to the cluster centers. Thanks!
>
> Have you solved this problem? I still can't show the lines pointing to the cluster centers.

Hi @nnnanchen. I apologize, I forgot to add the adjust_text call in the very last line of the gen_mpl_labels above. I edited the previous message. Can you try again?

Apr 10 '24 18:04 VladimirShitov
