Memory leak in rails apps using Puppeteer.connect(browser_ws_endpoint: '...') do |browser| .. end
Step To Reproduce / Observed behavior
Using puppeteer in a Rails 7.2.1 application, connecting to an external browserless Chrome container with Puppeteer.connect(..) do |browser| .. end. Memory usage slowly creeps up. When built into a Docker image, any hard memory limit will eventually be hit despite the Ruby VM's garbage collection. I am "ensure"-ing a browser.close and browser.disconnect within the block. Here's the exact block:
Puppeteer.connect(browser_ws_endpoint: ENV['WEBSOCKET_CHROME_URL']) do |browser|
  Rails.logger.debug "Attempting to capture screenshot of: #{uri}"
  begin
    page = browser.new_page
    page.viewport = Puppeteer::Viewport.new(width: 1280, height: 1280)
    page.goto(uri.to_s, timeout: 5000) # , wait_until: 'domcontentloaded')
    self.http_screenshot = page.screenshot
  rescue StandardError => e
    # Errors can be thrown due to a number of things: DNS, timeout, etc.
    Rails.logger.debug 'Failed to capture screenshot.'
    Rails.logger.debug e
  ensure
    Rails.logger.debug 'Closing browser.'
    browser.close
    browser.disconnect
  end
end
Expected behavior
Memory to remain fairly stable.
Environment
macOS with rvm
ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [arm64-darwin23]
Also note that the method running this is within an ActiveRecord model class. I don't think that should matter.. unless it does. :)
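To turn "slowly creeps up" into a number, it can help to log the process RSS around each capture. Here's a minimal sketch in plain Ruby (no Puppeteer required); reading `/proc/self/status` works inside Linux containers, with a `ps` fallback for macOS:

```ruby
# Report the current process's resident set size in kilobytes.
# /proc/self/status is Linux-only (e.g. inside the Docker container);
# the `ps` fallback covers macOS.
def rss_kb
  if File.readable?('/proc/self/status')
    File.read('/proc/self/status')[/VmRSS:\s+(\d+)/, 1].to_i
  else
    `ps -o rss= -p #{Process.pid}`.strip.to_i
  end
end

before = rss_kb
# ... run one Puppeteer.connect capture here ...
after = rss_kb
puts "RSS delta: #{after - before} KB"
```

Logging the delta per capture makes it easy to see whether memory grows per iteration or plateaus after warmup.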
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is still a big issue.
I have the same issue with a very similar setup except using Puppeteer.launch
Puppeteer.launch(headless: headless, args: args) do |browser|
  page = browser.pages.first || browser.new_page
  # Puppeteer logic
rescue => e
  # timeout issues, etc.
ensure
  browser.pages.each { |pg| pg.close unless pg.closed? }
  # Other cleanup handled by the Puppeteer.launch ensure block
end
I'm using ruby 3.3.5 (2024-09-03 revision ef084cc8f4) [aarch64-linux]
I'm having the same issue
I was able to work around this by running my Puppeteer code in a child process (not ideal, but it works). My Puppeteer code runs in a Delayed Job worker inside a Docker container; forking isolates the leaking memory in a short-lived process, so the OS reclaims it fully when that process exits.
def using_puppeteer(headless: true, args: DEFAULT_PUPPET_ARGS)
  file = Tempfile.new(SecureRandom.hex(10), Rails.root.join('tmp'))
  pid = Process.fork do
    Puppeteer.launch(headless: headless, args: args) do |browser|
      page = browser.pages.first || browser.new_page
      # Do whatever you need with puppeteer using a block
      puppet_result = yield(browser, page)
      file.write(puppet_result.to_json) if puppet_result.present? && puppet_result.respond_to?(:to_json)
    rescue => e
      # Do whatever on exception
      # Store exception so it can be given to the worker process
      file.write("Exception: #{e.message}")
    ensure
      file.flush
      # Do any cleanup operations on the page
      browser.close
      file.close unless file.closed?
      exit(0)
    end
  end
  Process.wait(pid)
  # Use the tempfile in the main process to handle whatever was returned by the puppeteer process
  file.rewind
  # Get whatever the puppet process returned, if anything
  result_from_puppet = file.read
  return if result_from_puppet.blank?
  raise(result_from_puppet.split('Exception: ').last) if result_from_puppet.include?('Exception: ')
  JSON.parse(result_from_puppet)
end
Then it can be used like
using_puppeteer(headless: headless) do |_browser, page|
  # Do whatever you need with puppeteer. Memory will be cleaned up after use
end
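The fork-and-tempfile round trip can be exercised without Puppeteer at all, which is handy for verifying the IPC and exception plumbing in isolation. A stripped-down sketch (the names here are mine, not from the snippet above); note that `exit!` skips `at_exit` hooks and finalizers in the child, so the child cannot unlink the tempfile the parent still needs:

```ruby
require 'tempfile'
require 'json'

# Run a block in a forked child; marshal its JSON-able result (or
# exception message) back to the parent through a shared tempfile.
# Requires a platform with Process.fork (Linux/macOS MRI).
def in_child_process
  file = Tempfile.new('puppet_result')
  pid = Process.fork do
    begin
      result = yield
      file.write(JSON.generate(result))
    rescue => e
      file.write("Exception: #{e.message}")
    ensure
      file.flush
      # exit! skips at_exit hooks and finalizers, so the child
      # cannot unlink the tempfile out from under the parent.
      exit!(0)
    end
  end
  Process.wait(pid)
  file.rewind
  raw = file.read
  return nil if raw.empty?
  raise raw.sub('Exception: ', '') if raw.start_with?('Exception: ')
  JSON.parse(raw)
ensure
  file&.close
  file&.unlink
end
```

Exceptions raised in the child come back as plain message strings and are re-raised in the parent, matching the `Exception:`-prefix convention used above.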