Heatmap with pre-defined dendrogram, margin ordering

Open micans opened this issue 1 year ago • 9 comments

Hello, thank you so much for this fantastic library, which I've recently started using. In one of my use cases I have a predefined hierarchical clustering on the columns. I read it in Newick format (with the DECIPHER package) and pass it as the cluster_columns value. This works, but the heatmap is reordered unless I label the nodes in the tree in alphabetical order from left to right. Example:

Data:

----> aa.txt
x1      x2      x3
1       2       12
2       3       13
3       4       14
<----

----> aa.nwk
(((x1:8,x2:11):14,x3:17):5)
<----
----> ab.nwk
(((x3:8,x2:11):14,x1:17):5)
<----

Code:

library(ComplexHeatmap)  # Heatmap(), draw()
library(DECIPHER)        # ReadDendrogram()
library(grid)            # grid.text(), gpar()

a <- as.matrix(read.table("aa.txt", header=TRUE))
d1 <- ReadDendrogram("aa.nwk")
d2 <- ReadDendrogram("ab.nwk")
# Print each cell's value inside the cell
cell_val = function(j, i, x, y, width, height, fill) { grid.text(sprintf("%d", a[i, j]), x, y, gp = gpar(fontsize = 12)) }
options(repr.plot.width = 4, repr.plot.height = 4)
ht1 <- Heatmap(a, cluster_rows = FALSE, cluster_columns=d1, cell_fun = cell_val)
ht2 <- Heatmap(a, cluster_rows = FALSE, cluster_columns=d2, cell_fun = cell_val)
draw(ht1)
draw(ht2)

yields

Screenshot 2022-07-08 at 16 09 49

Perhaps I'm missing an option or approach; for this use case the column order is already pre-determined, and matches the order in which labels are given in the Newick tree (reading left to right).

micans avatar Jul 08 '22 15:07 micans

According to the following two files:

----> aa.nwk
(((x1:8,x2:11):14,x3:17):5)
<----

----> ab.nwk
(((x3:8,x2:11):14,x1:17):5)
<----

The first one has the order x1, x2, x3 and the second one has the order x3, x2, x1, which are the same as in the heatmaps, aren't they? Or what do you expect?

jokergoo avatar Jul 27 '22 08:07 jokergoo

Using

d2 <- ReadDendrogram("ab.nwk")
options(repr.plot.width = 4, repr.plot.height = 4)
ht2 <- Heatmap(a, cluster_rows = FALSE, column_dend_reorder=FALSE, cluster_columns=d2, cell_fun = cell_val)
draw(ht2)

I get the second output drawn above, that is, with columns reordered. As an aside, I get the same output if I use this dendrogram:

(((y3:8,y2:11):14,y1:17):5)

There are no warnings about names not matching. It seems as if the dendrogram is always drawn according to the dendrogram input, but columns are reordered according to the dendrogram names. If I use

(((y2:8,y1:11):14,y3:17):5)

for the dendrogram the output becomes

Screenshot 2022-07-28 at 16 49 51

What I need is for the columns not to be reordered; I can achieve this by making the dendrogram labels alphabetically ordered, but this depends then on the apparent absence of matching of names between dendrogram and columns.

micans avatar Jul 28 '22 15:07 micans

The column names are from the matrix not from the dendrogram.

jokergoo avatar Jul 28 '22 17:07 jokergoo

I understand that. The issue I am trying to describe above is that the heatmap column ordering always seems to be derived from the alphabetic ordering of the dendrogram names; the option column_dend_reorder does not seem to have any effect. The examples above illustrate this behaviour.

micans avatar Jul 28 '22 17:07 micans

I see. I think it is related to how the order of the leaves is calculated when converting from the Newick format to a dendrogram. See the following four experiments:

> d1 = ReadDendrogram(textConnection("(((x1:8,x2:11):14,x3:17):5)"))
> d2 = ReadDendrogram(textConnection("(((x3:8,x2:11):14,x1:17):5)"))
> d3 = ReadDendrogram(textConnection("(((y3:8,y2:11):14,y1:17):5)"))
> d4 = ReadDendrogram(textConnection("(((y2:8,y1:11):14,y3:17):5)"))
>
> str(d1)
--[dendrogram w/ 1 branches and 3 members at h = 30]
  `--[dendrogram w/ 2 branches and 3 members at h = 25]
     |--[dendrogram w/ 2 branches and 2 members at h = 11]
     |  |--leaf "x1" (h= 3  )
     |  `--leaf "x2"
     `--leaf "x3" (h= 8  )
> str(d2)
--[dendrogram w/ 1 branches and 3 members at h = 30]
  `--[dendrogram w/ 2 branches and 3 members at h = 25]
     |--[dendrogram w/ 2 branches and 2 members at h = 11]
     |  |--leaf "x3" (h= 3  )
     |  `--leaf "x2"
     `--leaf "x1" (h= 8  )
> str(d3)
--[dendrogram w/ 1 branches and 3 members at h = 30]
  `--[dendrogram w/ 2 branches and 3 members at h = 25]
     |--[dendrogram w/ 2 branches and 2 members at h = 11]
     |  |--leaf "y3" (h= 3  )
     |  `--leaf "y2"
     `--leaf "y1" (h= 8  )
> str(d4)
--[dendrogram w/ 1 branches and 3 members at h = 30]
  `--[dendrogram w/ 2 branches and 3 members at h = 25]
     |--[dendrogram w/ 2 branches and 2 members at h = 11]
     |  |--leaf "y2" (h= 3  )
     |  `--leaf "y1"
     `--leaf "y3" (h= 8  )
> order.dendrogram(d1)
[1] 1 2 3
> order.dendrogram(d2)
[1] 3 2 1
> order.dendrogram(d3)
[1] 3 2 1
> order.dendrogram(d4)
[1] 2 1 3

As you can see, in the conversion the order of the dendrogram leaves is derived from the leaf names.

To make the tree correspond correctly to the heatmap columns, I think it is better to use the heatmap column names as the leaf names in the Newick tree.
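
Since no warning is raised when the names differ, a check along these lines could be run before plotting. This is only a sketch in base R: check_leaf_labels is a hypothetical helper (not part of ComplexHeatmap or DECIPHER), and the dendrogram is built with hclust() as a stand-in for one read from a Newick file.

```r
# Hypothetical helper: stop if the dendrogram leaf labels and the matrix
# column names are not the same set of names.
check_leaf_labels <- function(dend, mat) {
  leaf <- labels(dend)
  cols <- colnames(mat)
  only_in_dend <- setdiff(leaf, cols)
  only_in_mat  <- setdiff(cols, leaf)
  if (length(only_in_dend) > 0 || length(only_in_mat) > 0) {
    stop("leaf labels and column names differ: ",
         paste(c(only_in_dend, only_in_mat), collapse = ", "))
  }
  invisible(TRUE)
}

a <- matrix(c(1, 2, 3, 2, 3, 4, 12, 13, 14), nrow = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))

# Stand-in tree clustered from the columns, so its leaves are x1, x2, x3
dend <- as.dendrogram(hclust(dist(t(a))))
ok <- check_leaf_labels(dend, a)   # passes silently

# With y-names on the matrix the same call would stop with an error
b <- a
colnames(b) <- c("y1", "y2", "y3")
mismatch <- try(check_leaf_labels(dend, b), silent = TRUE)
```

The check only compares the two sets of names; it says nothing about whether the leaf indices inside the dendrogram point at the intended columns.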

jokergoo avatar Jul 28 '22 19:07 jokergoo

There is still the issue that the heatmap is ordered by the alphabetic ordering of the dendrogram names, and I've not been able to avoid this. With this matrix:

x3      x2      x1
1       2       12
2       3       13
3       4       14

dendrogram (((x3:8,x2:11):14,x1:17):5) and code

a <- as.matrix(read.table("aa.txt", header=TRUE))
d1 <- ReadDendrogram("aa.nwk")
cell_val = function(j, i, x, y, width, height, fill) {
        grid.text(sprintf("%d", a[i, j]), x, y, gp = gpar(fontsize = 12)) }
options(repr.plot.width = 4, repr.plot.height = 4)
ht1 <- Heatmap(a, cluster_rows = FALSE, column_dend_reorder=FALSE, cluster_columns=d1, cell_fun = cell_val)
draw(ht1)
a

the output is (with column ordering x1 x2 x3 in the heatmap)

Screenshot 2022-07-28 at 21 38 13

micans avatar Jul 28 '22 20:07 micans

I think I was wrong before. x1 x2 x3 in the Newick file actually has nothing to do with x1 x2 x3 in the matrix. Here is the process.

With this Newick output:

(((x3:8,x2:11):14,x1:17):5)

The order is calculated as 3, 2, 1. From this point on, the names x1, x2, x3 in the Newick tree are no longer used; they have been converted to the indices 3, 2, 1.

With the order 3, 2, 1 and with the matrix:

x3      x2      x1
1       2       12
2       3       13
3       4       14

it is reordered to the following matrix which is the same as your last heatmap:

x1      x2      x3
12       2       1
13       3       2
14       4       3

Note that the names x3 x2 x1 in the matrix have no effect on the column order.
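
The reordering step can be reproduced in base R without any plotting. This sketch only mimics the behaviour described here (positional indexing by the dendrogram order); it is not taken from the ComplexHeatmap source, and the order vector is typed in by hand from the order.dendrogram() output shown earlier.

```r
# The matrix from the example, with columns named x3, x2, x1
a <- matrix(c(1, 2, 3, 2, 3, 4, 12, 13, 14), nrow = 3,
            dimnames = list(NULL, c("x3", "x2", "x1")))

# Order reported by order.dendrogram() for (((x3:8,x2:11):14,x1:17):5)
ord <- c(3L, 2L, 1L)

# Positional indexing: the third column comes first, whatever it is called
reordered <- a[, ord]
print(reordered)
```

The column names simply travel with their columns, so the result has the columns x1, x2, x3 holding 12 13 14, 2 3 4 and 1 2 3.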

jokergoo avatar Jul 28 '22 21:07 jokergoo

Yes, that matches my experience. Hence, as far as I can see, the only way to leave the matrix unchanged by ComplexHeatmap is to have the Newick identifiers alphabetically sorted as they appear from left to right in the tree. This took me a while to figure out, so it may be worth flagging.
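
Another possible workaround, sketched here in base R and not confirmed by the maintainer, is to rewrite the leaf indices stored in the dendrogram so that each one points at the matrix column carrying the same name. align_dend_to_columns is a hypothetical helper, and the tree is again an hclust() stand-in for one read from Newick.

```r
# Hypothetical helper: replace each leaf's stored index with the position of
# its label among the matrix column names, keeping all other attributes.
align_dend_to_columns <- function(dend, col_names) {
  dendrapply(dend, function(node) {
    if (is.leaf(node)) {
      idx <- match(attr(node, "label"), col_names)
      if (is.na(idx)) {
        stop("leaf not found among column names: ", attr(node, "label"))
      }
      attrs <- attributes(node)
      node <- idx
      attributes(node) <- attrs
    }
    node
  })
}

# Matrix with columns in the order x3, x2, x1
a <- matrix(c(1, 2, 3, 2, 3, 4, 12, 13, 14), nrow = 3,
            dimnames = list(NULL, c("x3", "x2", "x1")))

# Stand-in tree whose leaf indices refer to a different ordering (x1, x2, x3)
y <- c(x1 = 5, x2 = 1, x3 = 0)
dend <- as.dendrogram(hclust(dist(y)))

before <- colnames(a)[order.dendrogram(dend)]   # does not match labels(dend)
fixed  <- align_dend_to_columns(dend, colnames(a))
after  <- colnames(a)[order.dendrogram(fixed)]  # matches labels(fixed)
```

Assuming the order is applied positionally as described above, passing the aligned tree as cluster_columns should place each column under the leaf with its own name. The tree shape and the left-to-right leaf order are unchanged.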

micans avatar Jul 28 '22 21:07 micans

That is something I will improve in the future. I will keep this ticket open. Once I have done something, I will reply here.

jokergoo avatar Jul 28 '22 21:07 jokergoo