heatmaply icon indicating copy to clipboard operation
heatmaply copied to clipboard

row/column label overplotting

Open alanocallaghan opened this issue 7 years ago • 28 comments

"The overall display needs adjustment after some zooming in/out practices. The row names overlay with each other when number of rows are increased (say more than 160)."

alanocallaghan avatar Jun 12 '17 09:06 alanocallaghan

The way to deal with this is if we could make the following work with dendrograms:

library(ggplot2)
library(plotly)
library(heatmaply)

heatmaply:::ggplot_heatmap(as.matrix(iris[,-5])) %>% ggplotly %>%
 layout(yaxis = list(tickmode='auto'),
              xaxis = list(tickmode='auto'))

I've asked about something similar here: https://github.com/ropensci/plotly/issues/1011

But I'm not sure how to make this work.

talgalili avatar Jun 12 '17 09:06 talgalili

There is basically only one way to make it work; this would be to round the x co-ordinates of the segments of the dendrograms to allow them to fit on a categorical scale (ie, the same scale as the main heatmap).

This results in a non-conventional (but not wrong) dendrogram

screenshot from 2017-06-12 19-42-41

alanocallaghan avatar Jun 12 '17 18:06 alanocallaghan

mmm, interesting. Is there no change that Carson could introduce that would save us from this? (I fear it would make the users nervous).

talgalili avatar Jun 12 '17 19:06 talgalili

It comes down to mixing categorical and numeric co-ordinate systems; you need to be able to provide common co-ordinates for the two axes. Thus, we need both to be numeric, and to manually specify the tick labels for the heatmap (the current default on master), or for both to categorical, and allow automatic labelling (the current implementation for this issue, when using plot_method="plotly" on branch tickmode.

I don't think Carson can change this; I'm pretty sure it's a limitation of the javascript library, since all of these parameters are passed directly to the javascript library. It might be worth raising the issue over at plotly.js HQ, but it's a bit of an edge case so might not have that much use or support.

alanocallaghan avatar Jun 12 '17 19:06 alanocallaghan

Actually, it might be worth raising the issue re: sampling of axis labels on the plotly.js repo, though it won't be quick to be added I imagine

alanocallaghan avatar Jun 12 '17 19:06 alanocallaghan

https://github.com/plotly/plotly.js/issues/1779

alanocallaghan avatar Jun 12 '17 19:06 alanocallaghan

I think it is worth making the case for it in the plotly repo (and even asking Carson to pay attention to it). This feature would help make the package more scalable for larger matrix while still keeping the visibility of the side labels usable.

talgalili avatar Jun 12 '17 19:06 talgalili

As I said it's not a plotly R library issue; these are options which are passed directly to the underlying javascript, so it's in the hands of Etienne Pinard and co

alanocallaghan avatar Jun 12 '17 19:06 alanocallaghan

I meant the plotly.js repo :) (and asking Carson to pay attention was so that it would hopefully get more attention from Etienne).

talgalili avatar Jun 12 '17 20:06 talgalili

Try using ggplotly(dynamicTicks = TRUE)

cpsievert avatar Jun 12 '17 20:06 cpsievert

Not what I would term desired output screenshot from 2017-06-12 21-51-22

alanocallaghan avatar Jun 12 '17 20:06 alanocallaghan

I'd consider this a bug in the R pkg. Should only be a problem when there is a colorbar and categorical axes.

cpsievert avatar Jun 12 '17 21:06 cpsievert

see https://github.com/ropensci/plotly/issues/1045

cpsievert avatar Jun 12 '17 21:06 cpsievert

Hi @cpsievert , I've just updated to the latest plotly version which includes your patch, but this doesn't seem to solve the issue. I've added a new dynamicTicks argument to heatmaply so you could more easily debug this:

devtools::install_github('talgalili/heatmaply')
devtools::install_github("ropensci/plotly")

library(heatmaply)
# heatmaply(mtcars)
heatmaply(mtcars, dynamicTicks = TRUE)

Output:

image

talgalili avatar Jun 13 '17 05:06 talgalili

I don't think this is a problem I can solve from plotly. You're using supplying different data for the x/y positions (discrete for the heatmap and numeric for the dendrogram). My advice would be to transform the dendrogram x/y to the corresponding discrete values when dynamicTicks = TRUE

cpsievert avatar Jun 22 '17 15:06 cpsievert

@cpsievert, I agree that's the most sensible way to do this, and I've implemented this with plotly. However I'm still having some problems with ggplotly. A reprex is below which you can use with the develop branch on this repo if you have time.

## Working
heatmaply(iris[, -5], dynamicTicks=TRUE, plot_method="plotly")

## broken
heatmaply(iris[, -5], dynamicTicks=TRUE)

## For debugging, return the ggplot objects
h <- heatmaply(iris[, -5], dynamicTicks=TRUE, return_ppxpy=TRUE)

## Everything seems in order...
h$p
h$px
h$py

## Subplotting works!
subplot(h$py, plotly_empty(), h$p, h$px, nrows=2)

## Wrong, but not the kind of wrong we're seeing...
subplot(h$py, plotly_empty(), ggplotly(h$p, dynamicTicks=TRUE), h$px, nrows=2)

## Gotcha!
subplot(h$py, plotly_empty(), ggplotly(h$p, dynamicTicks=TRUE), h$px, nrows=2, shareX = TRUE, shareY = TRUE)

So the issue seems to be sharing axes... I think it was okay before since the axis types were different (categorical and numeric). Do I need to change the axis IDs to control which axes are changed?

@talgalili since I've now fixed this for plot_method="plotly" I'll probably call it a day for the revision unless we can make some progress on this.

alanocallaghan avatar Jun 22 '17 20:06 alanocallaghan

So the issue seems to be sharing axes

Sure, but why should subplot() do the thing you're expecting it to do when h$p has discrete x/y scales and h$py/h$px has numeric x/y scales? You're trying to share two different scales...

cpsievert avatar Jun 22 '17 21:06 cpsievert

I edited the dendrograms to use a discrete scale to match with the heatmap's discrete scale; admittedly I had left coord_cartesian()/coord_flip() calls in that might have been causing problems, but removing them hasn't helped.

pg <- ggplot_build(h$p)
pg$layout$get_scales()
# $x
# <ggproto object: Class ScaleDiscretePosition, ScaleDiscrete, Scale>
# [etc]

pxg <- ggplot_build(h$px)
pxg$layout$get_scales(1)
# $x
# <ggproto object: Class ScaleDiscretePosition, ScaleDiscrete, Scale>
# [etc]

Whereas for a numeric scale:

g <- ggplot(mtcars, aes(x=mpg, y=cyl)) + geom_point()
gb <- ggplot_build(g)
gb$layout$get_scales(1)
# $x
# <ScaleContinuousPosition>
# [etc]
# $y
# <ScaleContinuousPosition>

alanocallaghan avatar Jun 23 '17 07:06 alanocallaghan

Also it's worth noting that subplot() does do the right thing, just not when dynamicTicks is enabled:

subplot(h$py, plotly_empty(), h$p, h$px, nrows=2, shareX=TRUE, shareY=TRUE)

alanocallaghan avatar Jun 23 '17 07:06 alanocallaghan

Sorry, I meant to say the data that you're supplying to x/y are different:

unique(h$p$data$column)
#> [1] Sepal.Length Petal.Length Sepal.Width  Petal.Width 
#> Levels: Sepal.Length Petal.Length Sepal.Width Petal.Width
unique(h$py$layers[[1]]$data$x)
#> [1] 2.125 1.000 3.250 2.500 2.000 3.000 4.000

Thinking about this a bit more, I'm not actually sure a plot like this (with the auto-updating ticks) is possible in plotly.js right now. And to that point, I should mention that the dynamicTicks=TRUE bit in heatmaply(iris[, -5], dynamicTicks=TRUE, plot_method="plotly") does nothing. The reason why it 'works' is because you're using tickvals/ticktext, which is not "dynamic" in the sense that plotly.js doesn't automatically (re)generate labels from the supplied data (but you really don't want plotly.js auto-generating labels since, in your case, the dendrogram requires positions that are between categories).

It sounds like what you need is the ability to have something like nticks to be respected when tickmode='array'

cpsievert avatar Jun 23 '17 16:06 cpsievert

Thanks @cpsievert

I added "dynamicTicks=TRUE" as a place-holder to play with. Seeing the correspondence here, it seems clearer that this option will not be available also in the future, so I will later remove this argument.

As for your other remark - would making "nticks be respected when tickmode='array'" something that is on your end, or requires plotly.js modification?

talgalili avatar Jun 23 '17 16:06 talgalili

It would have to be implemented at the plotly.js level

cpsievert avatar Jun 23 '17 16:06 cpsievert

On branch develop it is working correctly, see the example below. I've mapped the heatmap and dendrograms to the same discrete scale. I've also made the hovertext on the dendrograms show x and y co-ordinates for clarity (the heatmap is already showing that). Also, when dynamicTicks=TRUE and plot_method="plotly", tickvals and ticktext are not specified on the heatmap or dendrograms.

devtools::install_github("talgalili/heatmaply", ref="develop")
library(heatmaply)
heatmaply(iris[, -5], dynamicTicks=TRUE, plot_method="plotly")

alanocallaghan avatar Jun 23 '17 16:06 alanocallaghan

I would note that the y axis labels are numbers, but are character objects.

rownames(iris) <- paste("row", rownames(iris))
heatmaply(iris[, -5], dynamicTicks=TRUE, plot_method="plotly")

alanocallaghan avatar Jun 23 '17 16:06 alanocallaghan

@Alanocallaghan - it is also worth noting that the dendrograms are not "well" mapped to the centers as they should be:

image

(also there seems to be a space between the heatmap and the dendrogram)

I think that having the nticks work for tickmode='array'" sounds to be the more reasonable solution at this point (though, sadly, it would require someone to fix it in plotly.js...)

talgalili avatar Jun 23 '17 16:06 talgalili

screenshot from 2017-06-23 17-50-11 They are mapped to the centers, unfortunately rounding makes the dendrograms hard to interpret near the leaves (not that that is easy usually, but I digress).

The space between the heatmap and dendrogram is a minor issue and could easily be resolved by changing axis limits, I just haven't addressed it as this is still not resolved for plot_method="ggplot", or for side bar plots.

alanocallaghan avatar Jun 23 '17 16:06 alanocallaghan

Hi Alan, thanks for the explanation. Could you please see if you have ideas on how to improve this issue: https://github.com/plotly/plotly.js/issues/1812 If it could be simpler for the plotly.js people, I hope it would raise the chances they could address it sooner rather than later.

talgalili avatar Jun 23 '17 16:06 talgalili

I agree that the plotly fix is more sensible, however I am convinced that something is amiss in ggplotly. I don't see why the dynamic ticks argument should scramble the x/y co-ordinates, as shown above...

I am not optimistic about plotly.js fixing this in time for the revision, so my inclination would be to implement this bodge/hack as an option, and then roll out a proper fix when the nticks issue is addressed

@cpsievert running your code on the develop branch:

unique(h$p$data$column)
#[1] Sepal.Length Petal.Length Sepal.Width  Petal.Width 
#Levels: Sepal.Length Petal.Length Sepal.Width Petal.Width
unique(h$py$layers[[1]]$data$x)
#[1] Petal.Length Sepal.Length Sepal.Width  Petal.Width 
#Levels: Sepal.Length Petal.Length Sepal.Width Petal.Width

alanocallaghan avatar Jun 23 '17 16:06 alanocallaghan