
Plumber is not releasing memory

Open apalacio9502 opened this issue 1 year ago • 11 comments

Hello,

We have detected that plumber is not releasing memory when a request ends. As we understand it, there are two possible memory problems:

  • The first is when R has released the memory but the process footprint doesn't go down, because the allocator doesn't return unused pages to the OS (illustrated in the sketch after this list).
  • The second is when R's memory usage keeps growing because some objects are never released in a specific case.
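As a minimal sketch of the first problem (the memuse package is assumed here only to read the process footprint): even after rm() and a full gc(), the OS-level footprint can stay elevated because the allocator keeps the freed pages.

library(memuse)

print(memuse::Sys.procmem()$size) # footprint before allocating
x <- data.frame(A = rep("A", 1e7))
rm(x)
invisible(gc(full = TRUE))        # R-level memory is released here
print(memuse::Sys.procmem()$size) # footprint often remains elevated anyway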

We solved the first problem by implementing the solution proposed in this comment: https://github.com/rstudio/plumber/issues/496#issuecomment-541402503.

We have tried to solve the second problem by adding a postserialize hook that runs gc(). This made a significant difference because it frees the objects used by the API. However, we found that the memory consumed by the file being delivered to the user is not released, since postserialize runs before delivery to the user has completed.

We noticed this problem because, in our case, we always have an active process serving the API requests. If we did not keep a process permanently active, this would not be a concern.

Below is the code we have used to reproduce the problem, including the postserialize workaround.

library(plumber)
library(readr)

#* @apiTitle Plumber Test Memory Leak

#* Download data
#* @get /download_data
#* @serializer csv
function() {
  data <- data.frame(
    COLUMN_1 = rep("COLUMN_1", 1e+8),
    COLUMN_2 = rep("COLUMN_2", 1e+8),
    COLUMN_3 = rep("COLUMN_3", 1e+8),
    COLUMN_4 = rep("COLUMN_4", 1e+8),
    COLUMN_5 = rep("COLUMN_5", 1e+8)
  )
  return(data)
}

#* @plumber
function(pr) {
  pr %>%
    pr_hook("postserialize", function(req) {
      # Skip the docs endpoints; for everything else, schedule a full gc()
      # on the event loop so it runs right after the hook returns.
      if (!req$PATH_INFO %in% c("/", "/openapi.json")) {
        later::later(function() {
          gc(reset = TRUE, full = TRUE)
        }, delay = 0)
      }
    })
}

I think this topic becomes relevant for APIs running in a high-demand production environment (hosted on Posit Connect, for example).

Regards,

apalacio9502 avatar Dec 06 '23 04:12 apalacio9502

I cannot definitively assign fault to R, plumber, readr, or any other package involved in execution (httpuv, R6, etc.).

Following https://memtools.r-lib.org/articles/memtools.html#comparing-snapshots-to-detect-leaked-objects, I did not find any evidence that any objects were leaked.

I was not able to follow https://memtools.r-lib.org/articles/memtools.html#snapshotting-the-precious-list, as it required more lldb skill than I had time to develop.

Using your simple reprex app, if you are able to point me in a definitive direction, I am happy to pursue code changes in any of the R packages in the execution stack.

schloerke avatar Dec 11 '23 22:12 schloerke

Hello @schloerke,

Thanks for your answer.

I was trying to run the tests with memtools, and when I added mem_snapshot inside the plumber function it gave me the following error: Internal error in r_vec_deref_const(): Unimplemented type expression. Surely I'm doing something wrong. Do you have an example of how memtools can be used with plumber, so I can run tests and try to find the cause of the problem I'm describing?

Regards,

apalacio9502 avatar Dec 12 '23 22:12 apalacio9502

Hello @schloerke and @apalacio9502,

I am having the same issue here. I suspect it is related to this issue: RestRserve

GreenGrassBlueOcean avatar Jan 09 '24 21:01 GreenGrassBlueOcean

Our team has observed similar patterns recently with a Docker-based deployment. The screenshot below from Grafana summarizes recent memory usage for a plumber API that serves predictions for 2 different models.

[Grafana screenshot: memory usage of the plumber API over time]

After deployment, the service ran at ~330 MiB. We then submitted a large request to one of the models, which caused memory to spike as expected. We waited a few hours before submitting a single request to the other model endpoint. Memory dropped as expected. However, something is still being held in memory, because the service is now using ~464 MiB. We have already implemented the change to jemalloc per the recommendation in issue #496. We have also added gc() at the start and just before the return() statement for each model endpoint, as in the sketch below.
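A hypothetical sketch of that pattern (the route, the model_a object, and the input handling are illustrative assumptions, not our actual code):

#* Predict with model A
#* @post /predict_a
function(req) {
  gc() # reclaim memory left over from earlier requests
  input <- jsonlite::fromJSON(req$postBody)
  preds <- predict(model_a, newdata = input) # model_a assumed loaded at startup
  gc() # reclaim temporaries just before returning
  return(preds)
}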

Our questions, therefore, are the following:

  • Is this pattern expected? Or should memory drop to essentially what it was at deployment? We assume the latter, but we wanted to confirm. We were somewhat surprised to see memory stabilize at ~464 MiB in our testing, given the gc() statements we added and the change to jemalloc.
  • We are currently using memuse::Sys.procmem()$size and lobstr::obj_size() to capture total memory usage and individual object sizes, respectively (see the sketch below). We are also using memuse::Sys.cachesize(). Are there alternative libraries we should be using to profile memory consumption?
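For reference, a minimal sketch of those profiling calls (assuming the memuse and lobstr packages are installed):

library(memuse)
library(lobstr)

x <- data.frame(a = rnorm(1e6))

memuse::Sys.procmem()$size # resident memory of the whole R process
lobstr::obj_size(x)        # size of an individual object
memuse::Sys.cachesize()    # CPU cache sizes, for context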

mitch-tanney-coterie avatar Jan 18 '24 15:01 mitch-tanney-coterie

I have been studying this a little further, and I noticed that the memory accumulation decreases when using the future.callr package. My application performed significantly better and no longer crashed. Maybe you can check this and let me know if you also observe it, @mitch-tanney-coterie?

Please find below a minimal example. Each endpoint should use a similar future/promise construction.

#
# This is a Plumber API. You can run the API by clicking
# the 'Run API' button above.
#
# Find out more about building APIs with Plumber here:
#
#    https://www.rplumber.io/
#
library(plumber)
library(future)
library(promises)
library(future.callr)


# Each callr future runs in a fresh R session; seed handling is passed
# per-future below via future_promise(seed = TRUE).
future::plan(future.callr::callr())
print("planned session future.callr::callr")

#* Echo back the input
#* @param msg The message to echo
#* @get /echo
function(msg = "") {
  promises::future_promise({
   message(paste0("The message is: '", msg, "'"))
   msg = paste0("The message is: '", msg, "'")
   }, seed = TRUE)
}

GreenGrassBlueOcean avatar Jan 18 '24 15:01 GreenGrassBlueOcean

@GreenGrassBlueOcean, thank you for the recommendation, and I apologize for the delayed reply. Since my last post, we have continued to monitor our API in a test environment, where memory seems to be stable at ~465 MiB. Before we make any additional updates, I have several follow-up questions for you (no immediate rush whatsoever).

  • First, did you implement future.callr with or without the change to jemalloc? Our Dockerfile currently includes the commands as recommended here.

  • Also, I'm not very familiar with future.callr, but why do you think the change worked for your use case? Were you experiencing patterns similar to those I described above? Any additional detail you can share about the use case and how future.callr solved the problem would be greatly appreciated.

mitch-tanney-coterie avatar Jan 26 '24 21:01 mitch-tanney-coterie

Hi @GreenGrassBlueOcean and @mitch-tanney-coterie,

I haven't tried the solution proposed by @GreenGrassBlueOcean of implementing future.callr. However, it makes sense to me that it could solve the problem: with callr futures, each future is resolved in a fresh background R session that ends as soon as the future's value has been collected. Nevertheless, I believe this solution might not be the most appropriate, as creating a background process for every API request could cause unnecessary delays in the API.

I believe the initial solution, implementing jemalloc and a postserialize hook that performs garbage collection, is more appropriate. However, the issue of the memory consumed by the data delivered to users not being freed still remains (one untested variation is sketched below).
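For illustration only, an untested variation on the hook above (the 5-second delay and the skipped paths are arbitrary choices, not something verified in this thread): delaying the gc() so it fires after the response has likely finished streaming to the client.

#* @plumber
function(pr) {
  pr %>%
    pr_hook("postserialize", function(req) {
      if (!req$PATH_INFO %in% c("/", "/openapi.json")) {
        # delay > 0 so gc() runs after delivery, not merely after serialization
        later::later(function() gc(full = TRUE), delay = 5)
      }
    })
}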

Regards,

apalacio9502 avatar Jan 29 '24 21:01 apalacio9502

Hi @apalacio9502,

I just noticed your response. At first I had the very same concern, but please try it out with the echo example above: I did not notice any meaningful difference in response time.

What is an issue, however, is that, just as with parallel R sessions, you have to be very specific about which options, variables, and packages are available to the session. So you have to make sure your code follows the recommendations of the future package author in the vignette A Future for R: Common Issues with Solutions; the sketch below shows the kind of explicitness involved.
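As a minimal sketch of that explicitness (promises and future.callr assumed installed; the threshold variable is purely illustrative): each callr future runs in a fresh R session, so globals and packages have to reach the worker.

library(promises)
library(future.callr)

future::plan(future.callr::callr())

threshold <- 0.5

p <- promises::future_promise({
  stats::runif(1) > threshold # 'threshold' must be exported to the worker
}, globals = list(threshold = threshold), # pass globals explicitly if detection misses them
   packages = "stats",                    # attach needed packages in the worker
   seed = TRUE)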

GreenGrassBlueOcean avatar Jan 30 '24 09:01 GreenGrassBlueOcean

Hi @mitch-tanney-coterie,

I just noticed your post as well. Unfortunately, I cannot share much about the use case, but it is a simple REST application that receives a command, performs a calculation in R, and returns the response.

I noticed that plumber was not able to keep up with multiple calls at the same time. Furthermore, during its 24-hour run (I always let the application run for at most 24 hours) it stopped working. So for 20 hours it was doing what it should (as long as it was not called concurrently), and then it suddenly stopped, most likely due to a memory issue, as it began showing errors about R temp files not being available.

I then switched over to future.callr in combination with promises, and my issues disappeared like magic. The application has now been running for 3 weeks without a single issue.

I noticed you run a Docker image. For this application I don't run it in Docker (because of the nature of this application) but on native Windows.

However for docker I would recommend:

  1. Test it in an R session on your local PC and confirm that your application works as planned (options, variables, and packages as described in the vignette from my previous post to @apalacio9502).
  2. Create your own Docker image from R. I use this one: https://github.com/openanalytics/r-base. You can easily update it to the current R version.
  3. Build a test Docker image and try to run your application. If you get a build error, try changing packages and dependencies, and iterate a few times; eventually it will build and work.
  4. Monitor the building of your application with something like Jenkins, Apache Airflow, or GitHub Actions.

In my experience, this is generally the best way to build these Docker images.

GreenGrassBlueOcean avatar Jan 30 '24 10:01 GreenGrassBlueOcean

This has been a problem for me, and I'm happy to say it's resolved thanks to this thread. I'll share my experience below.

Added the following to the plumber Dockerfile:

# Install jemalloc alongside the other system deps, and preload it so freed
# pages are returned to the OS
RUN apt-get update && apt-get install -y --no-install-recommends libssl-dev libjemalloc-dev
ENV LD_PRELOAD /usr/lib/x86_64-linux-gnu/libjemalloc.so

Added the following to plumber.R:

plan(future.callr::callr()) # <-- this used to be plan(multisession)

With these two changes, I finally see the memory released back to the node:

[screenshot: memory released back to the node after the changes]

tylerlittlefield avatar Feb 23 '24 23:02 tylerlittlefield

PR Request:

Add an FAQ entry (or small vignette) suggesting that Linux users use:

  • libjemalloc-dev (comment)
  • a {future.callr} plan when executing work with {future} (not plan(multisession)).

The instructions/graphic from https://github.com/rstudio/plumber/issues/938#issuecomment-1962131908 could almost be taken verbatim. (Thank you, @tylerlittlefield!)

schloerke avatar Feb 26 '24 15:02 schloerke