ggcorrplot2

Reimplement the various visualizing methods by creating geom layers

Open caijun opened this issue 5 years ago • 3 comments

For example, geom_ellipse(), geom_circle(), geom_square(), etc. for the upper triangle of the matrix. The full type can be generated by overlaying an upper type on a lower type.

  • [x] https://bookdown.org/rdpeng/RProgDA/building-new-graphical-elements.html
  • [ ] https://cran.r-project.org/web/packages/ggplot2/vignettes/extending-ggplot2.html

caijun avatar Jan 02 '20 04:01 caijun

The idea is good. I would love to help with this if possible. Some thoughts below.

One way to implement this change is to keep the original ggcorrplot function largely as it is, but when type = NULL is set, only the base plot layer is drawn, so that you can write something like this:

geom_corrplot(corr, type = NULL) + 
geom_corrplot_ellipses(plot.type = "upper") + 
geom_corrplot_numbers(plot.type = "lower")

Is that what you had in mind?

The complexity of course is that there are many steps and layers, with some of them being conditional:

  • creating a data frame from the given correlation matrix (and p value matrix); delete unnecessary data.
  • creating a base plot layer
  • customizing legend
  • adding upper/lower/full part of the desired visualization
  • adding p value labels/information
  • adding and customizing labels.

I think that, for this idea to work, we need to split the ggcorrplot function into smaller functions for the different steps, so that the elements can be reused where needed: one for creating the data, one for adding labels, one for customizing the legend, and so on. An added advantage is that unit tests are easier to write for small functions.
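As a rough illustration of the first step in the list above, the data-preparation part could live in its own function. This is only a sketch: the name corr_to_long and the column names rid, cid, row_name and col_name are hypothetical and not taken from the package.

```r
# Hypothetical helper: turn a correlation matrix into a long data frame,
# keeping only the requested part ("full", "upper" or "lower").
corr_to_long <- function(corr, type = c("full", "upper", "lower")) {
  type <- match.arg(type)
  stopifnot(is.matrix(corr), nrow(corr) == ncol(corr))
  n <- nrow(corr)
  vars <- colnames(corr)
  if (is.null(vars)) vars <- paste0("V", seq_len(n))

  long <- data.frame(
    rid = as.vector(row(corr)),  # row index of each cell
    cid = as.vector(col(corr)),  # column index of each cell
    r   = as.vector(corr)        # correlation value, column-major order
  )
  long$row_name <- vars[long$rid]
  long$col_name <- vars[long$cid]

  # Drop the cells that the requested type does not need.
  keep <- switch(type,
    full  = rep(TRUE, nrow(long)),
    upper = long$rid < long$cid,
    lower = long$rid > long$cid
  )
  long[keep, , drop = FALSE]
}

corr <- cor(mtcars[, 1:4])
upper <- corr_to_long(corr, "upper")
nrow(upper)  # 6 cells above the diagonal of a 4 x 4 matrix
```

Because the function only uses base R and returns a plain data frame, it can be tested without drawing anything.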

Furthermore, to create the different geom functions, the ellipse visualization would be the hardest to implement since you calculate the ellipses manually. It would help if we don't calculate the ellipses ourselves, but instead use the function ggforce::geom_ellipse while setting the ellipse parameters a and b. That way you don't have to calculate the ellipses and none of the corrplot geoms require a new stat, which makes the separate geom_corrplot parts easier to handle.
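To make the ggforce idea concrete, here is a minimal sketch. It assumes one particular mapping from a correlation r to the ellipse parameters: a 45-degree ellipse with semi-axes proportional to sqrt(1 + r) and sqrt(1 - r). The helper name ellipse_params and the scale value are hypothetical; the package may compute its ellipses differently.

```r
# Hypothetical helper: map correlations to ggforce::geom_ellipse parameters.
# The ellipse is a circle at r = 0 and narrows to a line as |r| approaches 1.
ellipse_params <- function(r, scale = 0.45) {
  stopifnot(all(abs(r) <= 1))
  data.frame(
    a = scale * sqrt(1 + r),  # semi-axis along the rising diagonal
    b = scale * sqrt(1 - r),  # semi-axis along the falling diagonal
    angle = pi / 4            # rotation in radians
  )
}

# Draw one row of cells if the plotting packages are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggforce", quietly = TRUE)) {
  cells <- data.frame(x0 = 1:5, y0 = 1, r = c(-0.9, -0.4, 0, 0.4, 0.9))
  cells <- cbind(cells, ellipse_params(cells$r))
  p <- ggplot2::ggplot(cells) +
    ggforce::geom_ellipse(
      ggplot2::aes(x0 = x0, y0 = y0, a = a, b = b, angle = angle, fill = r)
    ) +
    ggplot2::coord_fixed()
  print(p)
}
```

With this mapping the horizontal and vertical half-extent of every ellipse equals scale, so a value below 0.5 keeps each ellipse inside its unit cell.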

LDSamson avatar May 11 '22 15:05 LDSamson

See here; this version has quite a few changes, since I split the main functions into several smaller ones. Some of the changes (I can make a more detailed log later if you want):

  • There are separate functions for adding labels, adding the guide+color scheme, and adding the different layers; therefore the function ggcorrplot.mixed is not needed anymore.
  • I tweaked the function that transforms the correlation matrix into a data frame a bit, so that the row names/column names correspond to the row id/column id, respectively. We can then use these names for the labels, which also fixes this Stack Overflow issue.
  • I also included unit tests for some of the functions with the testthat and vdiffr packages, so that it is easy to automatically check whether something has changed in an unwanted manner.

Let me know what you think. It can definitely still be improved, but I think it is a good start in making the package more flexible. I can either create a pull request now or first make some changes to tweak it further. EDIT: I see now that this version still has the legend.position option that was mentioned in ; we can remove that if you don't like it.

LDSamson avatar May 18 '22 02:05 LDSamson

@LDSamson Thanks very much for your efforts in improving the ggcorrplot2 package. I would like to redesign this package so that it follows the grammar of the ggplot2 package more closely. I would expect the following syntax:

ggplot(corr) + 
geom_circle(type = "upper") + 
geom_number(type = "lower")

We can also add layers such as geom_circle(), geom_square(), geom_ellipse(), and geom_number(). Each of these layers can be plotted as type upper, lower, or full.
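One way to approximate this layered syntax without writing new Geom objects is to wrap existing ggplot2 geoms and pass a function as the layer's data argument, so that each layer filters the plot data to its own triangle. The sketch below is only an illustration under that assumption: the names keep_triangle, geom_corr_circle and geom_corr_number are hypothetical, and the correlation matrix is converted to a long data frame before it is passed to ggplot().

```r
# Hypothetical helper: keep the cells of one triangle of a long-format
# correlation data frame with integer columns `rid` and `cid`.
keep_triangle <- function(d, type = c("full", "upper", "lower")) {
  type <- match.arg(type)
  switch(type,
    full  = d,
    upper = d[d$rid < d$cid, , drop = FALSE],
    lower = d[d$rid > d$cid, , drop = FALSE]
  )
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Hypothetical layer constructors built on existing ggplot2 geoms. Passing a
  # function as `data` makes the layer filter the plot's default data.
  geom_corr_circle <- function(type = "full", ...) {
    geom_point(aes(size = abs(r), colour = r),
               data = function(d) keep_triangle(d, type), ...)
  }
  geom_corr_number <- function(type = "full", ...) {
    geom_text(aes(label = sprintf("%.2f", r)),
              data = function(d) keep_triangle(d, type), ...)
  }

  corr <- cor(mtcars[, 1:4])
  long <- data.frame(rid = as.vector(row(corr)),
                     cid = as.vector(col(corr)),
                     r   = as.vector(corr))

  p <- ggplot(long, aes(x = cid, y = rid)) +
    geom_corr_circle(type = "upper") +
    geom_corr_number(type = "lower") +
    scale_y_reverse() +  # put row 1 at the top, as in a matrix
    coord_fixed()
  print(p)
}
```

The prefixed names are used here because ggforce already exports geom_circle() and geom_ellipse(), so unprefixed names would clash if both packages were attached.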

caijun avatar Aug 16 '22 13:08 caijun