RestRserve icon indicating copy to clipboard operation
RestRserve copied to clipboard

Add DB connection pool helper

Open dselivanov opened this issue 7 years ago • 8 comments

Needs IPC communication between Rserve child processes. Related to https://github.com/s-u/Rserve/issues/105

dselivanov avatar Oct 26 '18 11:10 dselivanov

Simply add note to the docs and close it :)

artemklevtsov avatar Aug 06 '19 06:08 artemklevtsov

There are 2 options:

  1. open and close connection for each request
  2. have a pool of connections initialized at start and re-use them

This issue is about 2

dselivanov avatar Aug 07 '19 06:08 dselivanov

Hi @dselivanov, I share my R script in the issue #25 comment in which I manage a pool connection before I create a new app. It is important to create a pool connection before creating a new app, if not it could create multiple pools (pb encountered in my case, perhaps my code is not good). Could you share us how one must manage the pool connection in RestRserve API method? Thank you.

long-do avatar Aug 20 '20 12:08 long-do

@long-do I'm think pool pkg won't work correctly as pool object is created in the parent R session. Child R sessions can use connections which are initialized in the parent, however they don't share pool object as it is a copy of the parent pool object on each child process. Which means they can't collaboratively use connections and it is highly probable that several processes might use then same connection. Which will lead to the corruption of the connection.

The easiest way to tackle this is to use RestRserve behind proxy (such as HAproxy) and let proxy manage persistent connections to the RestRserve. This way child session will hold the state till the time there is connection between proxy and RestRserve. In order to re-use DB connections you might need to structure your code in a way that it tries to perform DB query and if it fails you request handler re-initialise it.

library(RestRserve)

con = NULL

reconnect = function() {
  message("reconnecting")
  con <<- create_connection()
  invisible(TRUE)
}

app = Application$new()
app$add_get("/hello", function(req, res) {
  tryCatch(do_something_with_db(req), 
           error = function() {
             # re-init connections
             reconnect()
             # re-do your staff
             do_something_with_db(req)
           })
})

But keep in mind that if your app will create new connection for each request (this is by default for all new clients) then db connection will be open for each requests and it might cause significant overhead.

dselivanov avatar Aug 23 '20 07:08 dselivanov

Hi @dselivanov, Thank you for your reply. However, it is working very well in my case because in my API invoked function I declared response$body <- foo::bar(input = request$body, pool = pool_conn) and in my package R foo I use fetched_conn <- pool::poolCheckout(pool) so the connection pool is well managed and steady (time to fetch a new connection from the pool is about 0.001 - 0.002s). Note that I authorize 10 connections as min size of the pool. Of course in case of connection pooling lost I handle a new connection by tryCatch.

long-do avatar Aug 26 '20 09:08 long-do

Just a friendly request. Please add a vignette for working with Databases. Almost all of the R user work with databases.

I used RestRserve for a project and was not able to use pool because it leaked every few seconds. I had to create a new connection every request which had a huge overhead.

with pool -- 15 - 30 ms with conn -- 50 - 70 ms

This overhead and the fact that I had to use a HaProxy to keep all the connection open throughout allowed my IT team to reject the entire architecture because plumber was performing better and was more scalable without proxy server.

There are many people like me who will not be able to make it work in a production environment. Please write a page for best practises working with RestRserve as well.

I know it is time consuming to write documentation but it's very essential for new users and advanced alike. Even if you don't have a solution for it. It's good to document it. In future when you have one you can update the document.

Just wanted to share my experience with the project. Hope you will understand my point of view.

It's a nice project. Keep up the good work. 👍

vikram-rawat avatar Dec 23 '20 05:12 vikram-rawat

I'm doing my research. Has this issue of database connections across threads been solved in RestRserve?

I'm looking at using RestRserve but would need it to allow for a way to do db connections across threads/worker processes without too much overhead.

I like the fact that RestRserve handles each request on a separate fork automatically.

I was looking at using plumber, but I would have to use future or promises to allow for a non-blocking REST API - AND this doesn't allow for sharing database connections across processes.

Also, I don't like the comment style syntax plumber uses, and appreciate RestRserve's more classic look; looks like node express to me :)

I came across valve and faucet (seem to do the same thing; they both build on plumber) - which claim to run plumber multithreaded and solve the database connection issue across threads.

However, if RestRserve allows for pooled connections or a way to do it I would be interested in using RestRserve.

@vikram-rawat and @long-do I think you both seem to know what you're talking about any input is appreciated :)

dereckmezquita avatar Oct 28 '23 22:10 dereckmezquita

@dereckmezquita It's fairly complex to make database connection pool work with Rserve's fork mechanism. It's recommended to open and close connection each time you process request. Yes, there is an overhead, but again I the effort to even develop "workaround" solution (not covering all edge cases) is not trivial.

dselivanov avatar Oct 29 '23 07:10 dselivanov