Skater crashing R

Open eestefaniasalazar opened this issue 1 year ago • 11 comments

I updated to the latest version (0.10.4) because the "maxp_greedy" function was crashing R (see issue #39), but it seems that the bug wasn't corrected for the "skater" function.

eestefaniasalazar avatar Jul 08 '23 09:07 eestefaniasalazar

Hey @lixun910, is there an ETA on this?

ashirwad avatar Dec 08 '23 21:12 ashirwad

Which OS are you using? SKATER seems to work fine on my macOS machine. Thanks!

lixun910 avatar Dec 08 '23 22:12 lixun910

I am using RStudio Server on Ubuntu 20.04. Interestingly, things work as expected when I reduce the number of rows in the data! Is there a limit on how much data the rgeoda::skater function can handle? I have ~1800 observations in total.

ashirwad avatar Dec 08 '23 23:12 ashirwad

There is no limit on the data size. I think something else may be causing the crash, such as invalid values or the connectivity structure. Could you share your data and steps so that I can replicate it? Thanks!

lixun910 avatar Dec 09 '23 00:12 lixun910

@lixun910, here's a reprex:

# tigris version 2.0.1
# rgeoda version 0.0.10.4
# dplyr version 1.1.1
# sf version 1.0.12
set.seed(100)
ca_zctas <- tigris::zctas(year = 2010, state = "CA") |>
  dplyr::mutate(value = rexp(dplyr::n()))
ca_queen_w <- rgeoda::queen_weights(ca_zctas)
ca_zcta_clusters <- rgeoda::skater(
  5, ca_queen_w, dplyr::select(ca_zctas, value)
)
ca_zcta_clusters

Running the above code block causes R to crash! Also, here's the session info:

─ Session info ───────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.3 (2023-03-15)
 os       Ubuntu 20.04.5 LTS
 system   x86_64, linux-gnu
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Chicago
 date     2023-12-11
 rstudio  2023.03.0+386 Cherry Blossom (server)
 pandoc   3.1.2 @ /usr/bin/ (via rmarkdown)

ashirwad avatar Dec 11 '23 18:12 ashirwad

@lixun910, when do you anticipate this will get fixed? Just curious!

ashirwad avatar Dec 13 '23 18:12 ashirwad

Thanks for checking @ashirwad! I looked at your data and noticed that the connectivity of the queen weights is incomplete because there are many islands in this dataset. We should give a warning instead of a hard crash. In the meantime, you can try using, e.g., KNN weights with SKATER. I will fix this hard crash in the next release and keep you updated.
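
A minimal sketch of that workaround, under a few assumptions: weights_without_isolates is a hypothetical helper (not part of rgeoda), k = 6 is an arbitrary illustrative value, and the demo uses the Guerry sample data shipped with rgeoda rather than the California ZCTAs. Note that has_isolates only reports observations with no neighbors at all, so passing this check does not by itself prove that the whole graph is connected, and the thread does not confirm that it avoids the crash in every case.

```r
# Hypothetical helper (not part of rgeoda): use queen contiguity when every
# observation has at least one neighbor, otherwise fall back to KNN weights.
weights_without_isolates <- function(sf_obj, k = 6) {
  w <- rgeoda::queen_weights(sf_obj)
  if (rgeoda::has_isolates(w)) {
    message("Queen weights contain isolates; using KNN weights with k = ", k)
    w <- rgeoda::knn_weights(sf_obj, k)
  }
  w
}

# Demo on the Guerry sample data; for the reprex, pass ca_zctas instead.
if (requireNamespace("rgeoda", quietly = TRUE) &&
    requireNamespace("sf", quietly = TRUE)) {
  guerry <- sf::st_read(
    system.file("extdata", "Guerry.shp", package = "rgeoda"),
    quiet = TRUE
  )
  w <- weights_without_isolates(guerry)
  clusters <- rgeoda::skater(5, w, guerry["Crm_prs"])
  print(table(clusters$Clusters))
}
```

For the reprex, the equivalent call would be weights_without_isolates(ca_zctas), with the result passed to rgeoda::skater in place of ca_queen_w.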

lixun910 avatar Dec 13 '23 21:12 lixun910

Thanks, @lixun910, for the advice! I will try using KNN weights.

ashirwad avatar Dec 13 '23 21:12 ashirwad

@lixun910, is there a rule of thumb for selecting the value for k in KNN weights, or is it arbitrary?

ashirwad avatar Dec 14 '23 00:12 ashirwad

The value of k really depends on your data and on how the weights will be used. You can use the GeoDa desktop software to explore the connectivity map/graph for different k values, see https://geodacenter.github.io/workbook/4a_contig_weights/lab4a.html#fig:contigmapselect. For spatial clustering, different weights can lead to different connectivity graphs and therefore different results. At a minimum, though, we need weights that generate a complete connectivity graph. Hope this info helps. Thanks!
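
As a scripted alternative to inspecting the graph in GeoDa, the "complete connectivity graph" requirement can be checked by counting connected components for a few candidate k values. This is only a sketch: count_components is a hypothetical base-R helper (not part of rgeoda), it treats links as undirected because KNN neighbor lists can be asymmetric, and the range 2:6 and the Guerry sample data are arbitrary choices for illustration.

```r
# Hypothetical helper (base R only): count connected components in a
# neighbor list, where nbrs[[i]] holds the 1-based indices of i's neighbors.
count_components <- function(nbrs) {
  n <- length(nbrs)
  # Symmetrize: if j is listed under i, also list i under j.
  from <- rep(seq_len(n), lengths(nbrs))
  to <- as.integer(unlist(nbrs))
  adj <- split(c(to, from), factor(c(from, to), levels = seq_len(n)))

  seen <- logical(n)
  components <- 0L
  for (start in seq_len(n)) {
    if (seen[start]) next
    components <- components + 1L
    queue <- start
    seen[start] <- TRUE
    # Breadth-first search from the first unseen observation.
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      unseen <- unique(adj[[node]][!seen[adj[[node]]]])
      seen[unseen] <- TRUE
      queue <- c(queue, unseen)
    }
  }
  components
}

# Toy check: observations 1-2-3 are linked, observation 4 is an island.
toy <- list(2L, c(1L, 3L), 2L, integer(0))
toy_components <- count_components(toy)

# With rgeoda and sf installed, scan a few candidate k values.
if (requireNamespace("rgeoda", quietly = TRUE) &&
    requireNamespace("sf", quietly = TRUE)) {
  guerry <- sf::st_read(
    system.file("extdata", "Guerry.shp", package = "rgeoda"),
    quiet = TRUE
  )
  for (k in 2:6) {
    w <- rgeoda::knn_weights(guerry, k)
    nbrs <- lapply(seq_len(nrow(guerry)),
                   function(i) rgeoda::get_neighbors(w, i))
    message("k = ", k, ": ", count_components(nbrs), " component(s)")
  }
}
```

A k for which the count is 1 yields a single connected graph; which of those values is appropriate still depends on the data and the purpose of the analysis.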

lixun910 avatar Dec 14 '23 01:12 lixun910

@lixun910, thanks for the ideas!

ashirwad avatar Dec 16 '23 08:12 ashirwad