ggwordcloud
An issue of too much space between words
Dear creator,
I have a problem while using geom_text_wordcloud: the word spacing is too large. This happens even when I use exactly the same code as yours. I found two questions online regarding the same issue, but no solution. Can you let me know why this happens? Thank you!
This is the desired result, but it was produced with the wordcloud() command.
This is the undesired result, with too much spacing, produced with geom_text_wordcloud and ggplot():
I have this problem, too!
Could you send me the code, as well as the platform you are using? This looks like a mismatch between the font used to compute the boxes and the one finally used for rendering...
On Wed, Dec 22, 2021 at 13:55, Mathias Gerl @.***> wrote:
I have this problem, too!
Hi,
I created a regex. See below!
Also, I noticed that ggwordcloud (v0.5.0.9000) seems to work, while ggwordcloud_0.5.0 does produce the output below.
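A quick way to confirm which of the two builds is in use is to compare version numbers. The following is only a sketch using base R: the two version strings are taken from the comment above, and the variable names are illustrative.

```r
# Version strings taken from the comment above.
cran_release <- package_version("0.5.0")
dev_build <- package_version("0.5.0.9000")

# By R convention, a .9000 suffix marks a development build; it sorts after
# the release it is based on.
is_dev_newer <- dev_build > cran_release
print(is_dev_newer)

# Report the locally installed version, but only if the package is installed.
if (requireNamespace("ggwordcloud", quietly = TRUE)) {
  installed <- utils::packageVersion("ggwordcloud")
  message("Installed ggwordcloud: ", as.character(installed))
}
```

Including the installed version in a report makes it easier to tell whether a given plot came from the release or from the development build.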
I do not select a specific font, but I have problems with the non-Latin characters in your examples, so I only use Latin characters in my example.
I hope this helps.
cheers
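library(tidyverse)
library(ggwordcloud)

# Keep only words made of Latin letters, then take the first 30 rows
normal_font_love <- love_words %>%
  filter(grepl("^[a-z]*$", word, ignore.case = TRUE)) %>%
  slice_head(n = 30)

set.seed(42)  # word placement is randomized
p <- ggplot(normal_font_love, aes(label = word, size = speakers)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 30) +
  theme_minimal()

# Pass the plot explicitly instead of relying on last_plot()
ggsave("love_words_small_R.png", plot = p, height = 5, width = 10)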
sessionInfo()
#> R version 4.1.2 (2021-11-01)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Monterey 12.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggwordcloud_0.5.0 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
#> [5] purrr_0.3.4 readr_2.0.2 tidyr_1.1.4 tibble_3.1.6
#> [9] ggplot2_3.3.3 tidyverse_1.3.1
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.1.1 xfun_0.28 haven_2.4.3 colorspace_2.0-2
#> [5] vctrs_0.3.8 generics_0.1.1 htmltools_0.5.2 yaml_2.2.1
#> [9] utf8_1.2.2 rlang_0.4.12 pillar_1.6.4 glue_1.5.0
#> [13] withr_2.4.2 DBI_1.1.1 dbplyr_2.1.1 modelr_0.1.8
#> [17] readxl_1.3.1 lifecycle_1.0.1 cellranger_1.1.0 munsell_0.5.0
#> [21] gtable_0.3.0 rvest_1.0.2 evaluate_0.14 labeling_0.4.2
#> [25] knitr_1.36 tzdb_0.2.0 fastmap_1.1.0 fansi_0.5.0
#> [29] highr_0.9 Rcpp_1.0.7 broom_0.7.10 backports_1.3.0
#> [33] scales_1.1.1 jsonlite_1.7.2 farver_2.1.0 fs_1.5.0
#> [37] png_0.1-7 hms_1.1.1 digest_0.6.28 stringi_1.7.5
#> [41] grid_4.1.2 cli_3.1.0 tools_4.1.2 magrittr_2.0.1
#> [45] crayon_1.4.2 pkgconfig_2.0.3 ellipsis_0.3.2 xml2_1.3.2
#> [49] reprex_2.0.1 lubridate_1.8.0 assertthat_0.2.1 rmarkdown_2.11
#> [53] httr_1.4.2 rstudioapi_0.13 R6_2.5.1 compiler_4.1.2
Created on 2021-12-23 by the reprex package (v2.0.1)
With the following output:
Thank you. I know more or less what's going on: it seems that nothing works as planned when computing the text masks, and very crude rectangular bounding boxes are used instead. I do not have access to a Mac, but I will see if I can find a workaround. I do not plan to do it during the holidays, but I will try to do it as soon as possible.
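To illustrate why falling back to rectangular bounding boxes would spread the words apart, here is a toy sketch in base R. It is not ggwordcloud's code: the 6x5 raster and the "L"-shaped glyph are made up for illustration. It only shows how much of a bounding box a glyph can leave empty, space that a mask-based placement could reuse for neighbouring words.

```r
# Toy 6x5 raster of an "L"-shaped glyph: TRUE marks inked cells.
glyph <- matrix(FALSE, nrow = 6, ncol = 5)
glyph[, 1] <- TRUE  # vertical stroke
glyph[6, ] <- TRUE  # horizontal stroke

# Cells blocked when collisions are tested against the glyph mask.
mask_area <- sum(glyph)

# Cells blocked when collisions are tested against the bounding box.
box_area <- nrow(glyph) * ncol(glyph)

# Cells that box-based placement reserves although nothing is drawn there.
free_cells <- box_area - mask_area
fill_ratio <- mask_area / box_area

cat(sprintf("inked: %d, box: %d, fill ratio: %.2f\n",
            mask_area, box_area, fill_ratio))
```

In this toy case, two thirds of the box is empty, which is the kind of gap that shows up as excess spacing when every word is treated as a solid rectangle.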
On Thu, Dec 23, 2021 at 00:18, Mathias Gerl @.***> wrote:
Hi, I created a regex. See below! Also I noticed that ggwordcloud(v.0.5.0.9000) seems to work, while ggwordcloud_0.5.0 does produce the output below.
I do not select a specific font, but I have problems with the non-Latin characters in your examples, so I only use Latin characters in my example.
I hope this helps.
cheers
With the following output: [image: love_words_small_R] https://user-images.githubusercontent.com/24799198/147165339-f37bc827-7d32-4c29-9ad0-768f8f832c12.png
Hi,
Can you try with the latest version available on GitHub?
Yours
On Fri, Sep 9, 2022 at 19:18, Carlos López de la Cerda < @.***> wrote:
Hi @lepennec https://github.com/lepennec!
I have the same problem as @dernesa https://github.com/dernesa, did you find any work around?
Hi, I wonder if there is a solution to this problem? I'm having the same problem: I have installed the development version, but there is still too much space between words. Could you help me fix this? Thank you!
Are you also using OS X?
On Mon, Oct 17, 2022 at 11:18, Meng Liu @.***> wrote:
Hi I wonder if there is a solution to this problem? I'm having the same problem and have installed the developmental version but still having too much space between words. Could you help me fix this? Thank you!
Yes! This is my session_info in case it's helpful:
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggwordcloud_0.6.0 ggthemes_4.2.4 rio_0.5.29 forcats_0.5.1
[6] stringr_1.4.0 dplyr_1.0.9 purrr_0.3.5 readr_2.1.2 tidyr_1.2.0
[11] tibble_3.1.7 ggplot2_3.3.6 tidyverse_1.3.2 pacman_0.5.1
Thank you. I need access to a macOS machine to understand exactly what is going on...
Hi everyone. I managed to install the development version [0.6.0], and it fixed the spacing issue.
I resolved the non-Latin font issue by following this great explanation on Stack Overflow: https://stackoverflow.com/questions/74415534/why-do-characters-from-foreign-alphabets-not-show-in-my-wordcloud-on-r
One problem I do see now is the word sizes: 愛, with a speakers value of 1200, looks almost four times as big as "Love", with a speakers value of 800. FYI @lepennec. If you want me to open a new issue, let me know.
Below is the script and the session info.
library(tidyverse) #> Loading required package: ggplot2
library(ggwordcloud)
library(showtext)
# Find a usable font
(where <- font_files()[which(str_detect(font_files()$family, "Arial Unicode MS")), ])

# Add the font to the workspace
font_add(family = where[1, ]$family, regular = where[1, ]$file)
showtext_auto()

# Load data
data("love_words_small")
set.seed(42)

# Word cloud with the size aesthetic mapped to speakers
ggplot(data = love_words_small, aes(label = word, size = speakers)) +
  geom_text_wordcloud_area(
    # family name of the font
    family = where[1, ]$family
  ) +
  scale_size_area(max_size = 24) +
  theme_minimal()
sessionInfo()
# R version 4.3.0 (2023-04-21)
# Platform: aarch64-apple-darwin20 (64-bit)
# Running under: macOS Ventura 13.4.1
# Matrix products: default
# BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# time zone: Europe/London
# tzcode source: internal
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
# other attached packages:
# [1] showtext_0.9-6 showtextdb_3.0 sysfonts_0.8.8 ggwordcloud_0.6.0 lubridate_1.9.2
# [6] forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2 purrr_1.0.1 readr_2.1.4
# [11] tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.2 tidyverse_2.0.0
# loaded via a namespace (and not attached):
# [1] gtable_0.3.3 compiler_4.3.0 tidyselect_1.2.0 Rcpp_1.0.11 xml2_1.3.5 scales_1.2.1
# [7] png_0.1-8 R6_2.5.1 labeling_0.4.2 commonmark_1.9.0 generics_0.1.3 munsell_0.5.0
# [13] pillar_1.9.0 tzdb_0.4.0 rlang_1.1.1 utf8_1.2.3 stringi_1.7.12 xfun_0.39
# [19] timechange_0.2.0 cli_3.6.1 withr_2.5.0 magrittr_2.0.3 grid_4.3.0 gridtext_0.1.5
# [25] rstudioapi_0.14 markdown_1.7 hms_1.1.3 lifecycle_1.0.3 vctrs_0.6.3 glue_1.6.2
# [31] farver_2.1.1 fansi_1.0.4 colorspace_2.1-0 tools_4.3.0 pkgconfig_2.0.3
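On the size difference reported in this comment, one possible reading (an assumption, not something confirmed in this thread) is that geom_text_wordcloud_area() adjusts the font size so that the area of each text, rather than its font size alone, tracks the value. Under a deliberately crude model where text area is roughly the number of characters times the squared font size, a single glyph needs a much larger font than a four-letter word to cover a comparable area. The helper below is hypothetical and uses base R only.

```r
# Hypothetical helper: relative font size needed so that text area is
# proportional to `value`, ASSUMING area ~ n_chars * font_size^2.
# This is a crude model for illustration, not ggwordcloud's implementation.
area_matched_size <- function(value, n_chars) sqrt(value / n_chars)

words <- data.frame(
  label = c("single CJK glyph", "Love"),
  speakers = c(1200, 800),  # values quoted in the comment above
  n_chars = c(1, 4)
)
words$rel_size <- area_matched_size(words$speakers, words$n_chars)

# Font-size ratio with the area correction under this model.
size_ratio <- words$rel_size[1] / words$rel_size[2]

# Font-size ratio if only the square root of the value were used.
plain_ratio <- sqrt(words$speakers[1] / words$speakers[2])

print(round(c(area_corrected = size_ratio, sqrt_only = plain_ratio), 2))
```

If this reading is right, the single glyph being drawn far larger than "Love" would be a consequence of the area correction rather than of the spacing issue; comparing against plain geom_text_wordcloud() would be one way to check.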
Let me see what I can do...
On Mon, Jul 24, 2023 at 11:22 AM EdwardL08 @.***> wrote:
Hi everyone. I managed to use the developer version [0.6.0] and it fixed the spacing issue.
I had resolved the non-latin font issue following this great explanation on stackoverflow https://stackoverflow.com/questions/74415534/why-do-characters-from-foreign-alphabets-not-show-in-my-wordcloud-on-r .
One problem I do see now is word sizes. 愛, with speakers number of 1200, looks almost four times as big as the "Love", with speakers number of 800. FYI @lepennec https://github.com/lepennec. If you want me to open a new issue let me know.
[image: love_words_small_wordsize] https://user-images.githubusercontent.com/65240598/255545775-552897df-4f1e-4bbd-babf-390abbef589b.png