RMySQL icon indicating copy to clipboard operation
RMySQL copied to clipboard

Very slow SELECT queries

Open imcullan opened this issue 8 years ago • 2 comments

Hello, Below is my sessioninfo():

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] highcharter_0.4.0.9999 RJDBC_0.2-5            rJava_0.9-8            RMySQL_0.10.9         
 [5] DBI_0.4-1              gganimate_0.1          sp_1.2-3               data.table_1.9.6      
 [9] dplyr_0.5.0            png_0.1-7              jpeg_0.1-8             httr_1.2.1            
[13] plotly_3.6.0           ggplot2_2.2.0          RCurl_1.95-4.8         bitops_1.0-6          
[17] jsonlite_1.0          

loaded via a namespace (and not attached):
 [1] zoo_1.7-13        reshape2_1.4.2    purrr_0.2.2       lattice_0.20-33   colorspace_1.2-6 
 [6] htmltools_0.3.5   viridisLite_0.1.3 yaml_2.1.13       base64enc_0.1-3   chron_2.3-47     
[11] TTR_0.23-1        plyr_1.8.4        quantmod_0.4-5    stringr_1.1.0     munsell_0.4.3    
[16] gtable_0.2.0      htmlwidgets_0.7   psych_1.6.6       labeling_0.3      parallel_3.3.0   
[21] curl_1.1          broom_0.4.1       xts_0.9-7         Rcpp_0.12.8       scales_0.4.1     
[26] gridExtra_2.2.1   mnormt_1.5-4      digest_0.6.10     stringi_1.1.2     rlist_0.4.6.1    
[31] tools_3.3.0       magrittr_1.5      lazyeval_0.2.0    tibble_1.2        tidyr_0.5.1      
[36] lubridate_1.5.6   assertthat_0.1    viridis_0.3.4     R6_2.1.2          igraph_1.0.1     
[41] nlme_3.1-128    

I have approximately 20 million rows in my database and I have the Play table indexed on game_id and play_id. I am running the following query in R:

system.time(sample_play <- dbGetQuery(nbacon, 
paste0("SELECT p.player_id, p.lastname, 
              p.firstname, p.jersey, p.x_loc, p.y_loc, p.game_clock, p.shot_clock, p.home_description, 
              p.visitor_description, t.teamname, t.teamabbrev
              FROM Play p
              INNER JOIN Team t ON
              p.team_id = t.team_id
              WHERE game_id = '0021500290' AND play_id = 218)))

This returns 7150 rows of 12 variables. The total time it takes is:

user system elapsed 0.023 0.014 8.489

It seems awfully long to fetch 7150 rows in 8.5 seconds.

That being said, I am running the exact same query on MySQLWorkbench and it takes:

Duration / Fetch Time 0.734 sec / 0.0039 sec

Any ideas on how to speed this up? All help is greatly appreciated!

imcullan avatar Jan 18 '17 21:01 imcullan

I am running into this same issue. Are there any updates on this issue?

maryskalicky avatar Dec 29 '17 14:12 maryskalicky

I am also experiencing this issue.

dalekube avatar Mar 19 '19 02:03 dalekube