node-export-server
[Puppeteer] AWS Lambda EROFS in .cache
Currently, when I try to run the export server on AWS Lambda using the new enhancement/pupeteer branch, chart generation fails because it tries to create a folder in /var/task/node_modules/highcharts-export-server/.cache.
The AWS Lambda function runs on the node18.x runtime with the latest Highcharts version. The package.json dependency has been pinned as follows:
"highcharts-export-server": "github:highcharts/node-export-server#2a5d8df3d1e2f0b11a7984c55501507fd70d78a6", which was suggested since the latest PR simplifies using it as a Node module.
Expected behaviour
A very nice graph through AWS Lambda
Actual behaviour
Error message:
{
"errorType":"Error",
"errorMessage":"EROFS: read-only file system, mkdir '/var/task/node_modules/highcharts-export-server/.cache'",
"trace":[
"Error: EROFS: read-only file system, mkdir '/var/task/node_modules/highcharts-export-server/.cache'",
" at mkdirSync (node:fs:1398:3)",
" at checkCache (file:///var/task/node_modules/highcharts-export-server/lib/cache.js:219:29)",
" at Object.initPool (file:///var/task/node_modules/highcharts-export-server/lib/index.js:57:11)",
" at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
" at async createGraph (file:///var/task/functions/coingecko/chartGraph.mjs:13568:3)",
" at async Runtime.getSparkline [as handler] (file:///var/task/functions/coingecko/chartGraph.mjs:13650:17)"
]
}
Thanks for reporting!
We will dive deeper into this issue after all the prioritized issues regarding server health are taken care of.
In the meantime, is there anything you could change in your AWS config for this to work? Is there any workaround you can take advantage of? Isn't it possible to give write permissions to certain directories?
As far as I understand, /var/task is where the Lambda code is deployed and it is strictly read-only. The only writable location is the /tmp folder, which is considered scratch space and allows up to 512 MB of data to be written by default. It could work if the export server's .cache location could be configured manually to a different path so that cache data is written into /tmp.
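For illustration, a minimal handler probing both locations shows the behaviour (nothing export-server specific here, just the Lambda filesystem):

```js
// Minimal sketch: /tmp is writable scratch space, while /var/task (where
// the deployed code and node_modules live) is read-only.
import { mkdirSync, writeFileSync } from 'node:fs';

export const handler = async () => {
  // Succeeds: /tmp is the only writable location in the Lambda runtime.
  mkdirSync('/tmp/.cache', { recursive: true });
  writeFileSync('/tmp/.cache/probe.txt', 'ok');

  // Throws EROFS, exactly like the export server's checkCache() call above.
  try {
    mkdirSync('/var/task/node_modules/highcharts-export-server/.cache');
  } catch (error) {
    console.log(error.code); // 'EROFS'
  }
};
```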
Okay, I understand. We should allow users to modify the writeable paths (.cache, tmp) with additional options.
For now, you can try modifying this code: https://github.com/highcharts/node-export-server/blob/2a5d8df3d1e2f0b11a7984c55501507fd70d78a6/lib/cache.js#L29-L32
and change the cachePath manually in the code.
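As a rough sketch (assuming the surrounding cache.js code stays as it is), the override could simply hardcode the writable scratch space:

```js
// lib/cache.js — sketch of a manual override, not the original code:
// point the cache at Lambda's writable /tmp instead of the package
// directory under the read-only /var/task.
const cachePath = '/tmp/.cache';
```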
Thanks for the tip!
After changing that, and after hitting the same error again because the default file logging destination is set to log/, I was able to get one step further. However, just running Puppeteer like that doesn't seem to work, as you get the following error:
Could not find Chrome (ver. 115.0.5790.98). This can occur if either
1. you did not perform an installation before running the script (e.g. `npm install`) or
2. your cache path is incorrectly configured (which is: /home/sbx_user1051/.cache/puppeteer).
For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.
at ChromeLauncher.resolveExecutablePath (file:///var/task/node_modules/puppeteer-core/lib/esm/puppeteer/node/ProductLauncher.js:274:27)
at ChromeLauncher.executablePath (file:///var/task/node_modules/puppeteer-core/lib/esm/puppeteer/node/ChromeLauncher.js:174:25)
at ChromeLauncher.computeLaunchArguments (file:///var/task/node_modules/puppeteer-core/lib/esm/puppeteer/node/ChromeLauncher.js:91:37)
at ChromeLauncher.launch (file:///var/task/node_modules/puppeteer-core/lib/esm/puppeteer/node/ProductLauncher.js:53:39)
at ChromeLauncher.launch (file:///var/task/node_modules/puppeteer-core/lib/esm/puppeteer/node/ChromeLauncher.js:45:22)
at PuppeteerNode.launch (file:///var/task/node_modules/puppeteer-core/lib/esm/puppeteer/node/PuppeteerNode.js:139:31)
at open (file:///var/task/node_modules/highcharts-export-server/lib/browser.js:113:35)
at open (file:///var/task/node_modules/highcharts-export-server/lib/browser.js:123:17)
at async open (file:///var/task/node_modules/highcharts-export-server/lib/browser.js:123:11)
at async open (file:///var/task/node_modules/highcharts-export-server/lib/browser.js:123:11)
It looks like it can't find Chrome at all. I don't know if it's something with the configured location; I'm a bit stumped here. There are packages like @sparticuz/chromium that help with running a Chromium instance in serverless deployments, but I'm not sure how best to wire this up with the highcharts-export-server configuration.
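For reference, the usual standalone pairing of @sparticuz/chromium with puppeteer-core on Lambda looks roughly like this (just a sketch; the open question is how to hand such a browser over to the export server's pool):

```js
// Typical serverless setup: @sparticuz/chromium provides the launch flags
// and a Chromium binary that gets unpacked into /tmp, while puppeteer-core
// does the launching.
import chromium from '@sparticuz/chromium';
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.launch({
  args: chromium.args,                              // serverless-friendly flags
  defaultViewport: chromium.defaultViewport,
  executablePath: await chromium.executablePath(),  // binary location in /tmp
  headless: chromium.headless,
});

const page = await browser.newPage();
// ...render something with the page here...
await browser.close();
```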
Thanks for sharing.
If changing the directories manually does not work, then we'll have to dive deeper into this problem (but we'll do that after the prioritized server health issues are resolved).
Apologies for the inconvenience; I'll keep you posted here.
Success! After a lot of trial and error, I've gotten to a point where I actually get a graph generated from a Lambda. Unfortunately it's quite messy, so I'm detailing the changes I made to achieve this. We (or your team) will probably need to discuss a good way to make this properly configurable.
I'll detail the changes below, but you can also compare the changed files with GitHub's compare tool.
- Change the cache directory
AWS Lambda allows you to use /tmp as scratch space. Not only does the cachePath need to change, the scriptTag location also has to point there for the sources.js file. Change the following code to use /tmp instead:
lib/cache.js:31
const cachePath = '/tmp/.cache';
lib/browser.js:75
await p.addScriptTag({ path: '/tmp/.cache/sources.js' });
- Pass over a chromium instance from @sparticuz/chromium
With this library, it's easy to either ship a Chromium binary as a dependency, or set it as a dev dependency and leverage Lambda layers to use the binary from an S3 bucket instead. To make it a bit easier on myself, I changed the following places to pass chromium in via the initPool function (a consolidated sketch of the flow follows these fragments):
lib/browser.js:98
export const create = async (puppeteerArgs, chromium) => {
lib/index.js:34
initPool: async (options = {}, chromium) => {
lib/pool.js:130
export const init = async (config, chromium) => {
lib/pool.js:136
await createBrowser(puppeteerArgs, chromium);
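To make the flow clearer, here is a self-contained sketch of the threading (names follow the fragments above; the bodies are illustrative only, not the real module code):

```js
// Sketch of how the extra `chromium` handle is forwarded down the chain
// (initPool -> pool init -> browser creation).
const createBrowser = async (puppeteerArgs, chromium) => {
  // In lib/browser.js this is where puppeteer.launch() eventually runs
  // with chromium.args and chromium.executablePath() (see the next step).
  console.log('would launch with', puppeteerArgs.length, 'args, chromium:', !!chromium);
};

export const initPool = async (options = {}, chromium) => {
  // lib/index.js and lib/pool.js merge the options into a config and
  // simply pass the chromium handle through untouched.
  const puppeteerArgs = options.puppeteer?.args || [];
  await createBrowser(puppeteerArgs, chromium);
};
```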
- Use the passed Chromium in `puppeteer.launch` and remove the user data folder. This allows Puppeteer to use the passed Chromium instance instead. As for the user data folder, I'm not exactly sure what it does, but it doesn't seem to complain when it's removed in the default configuration. Technically, we could repoint it to `/tmp` as well.
lib/browser.js:113
browser = await puppeteer.launch({
headless: 'new',
args: allArgs,
defaultViewport: chromium.defaultViewport,
executablePath: await chromium.executablePath(),
});
You can pass the serverless args of the chromium instance directly with the pool's puppeteer arguments:
await exporter.initPool(
{
puppeteer: { args: chromium.args },
pool: { maxWorkers: 2, logging: undefined },
},
chromium
);
- Make the rasterization timeout longer. For some reason, rendering a PNG takes more than 1.5 s, which makes it fail with the current implementation. I have temporarily bumped it to a very high number just to be sure:
lib/export.js:65
setTimeout(() => reject(new Error('Rasterization timeout')), 10000)
- Disable logging so that it doesn't use the toFile feature. The default toFile destination points to a location where writing isn't allowed, which makes it fail. Either set toFile to undefined, point it to a different location like /tmp, or outright remove it from lib/logger.js (a hypothetical sketch follows).
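For illustration, a hypothetical version of the defaults could look like this (field names mirror the description above and may not match the actual lib/logger.js in your version):

```js
// Hypothetical sketch only: repoint the default log settings so nothing
// is written under the read-only /var/task tree.
const logging = {
  toFile: false,     // disable file logging entirely, or...
  dest: '/tmp/log',  // ...keep it but write into Lambda's scratch space
};
```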
With all these changes in place, a screenshot of the graph is created and returned as a base64 string. I haven't tried PDF or SVG exporting yet, but I assume they should work fine as well.
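Putting the pieces together, a handler based on the steps above could look roughly like the sketch below. Note the assumptions: chartOptions is a placeholder, and I'm assuming the module exposes an export call next to initPool (called startExport here, with a callback shape you should double-check against the commit you're on).

```js
// Sketch of a Lambda handler combining the patched fork with
// @sparticuz/chromium. `startExport` and its callback signature are
// assumptions; verify them against the version/commit you actually use.
import chromium from '@sparticuz/chromium';
import exporter from 'highcharts-export-server';

let poolReady;

export const handler = async () => {
  // Initialise the pool once per warm container, forwarding the serverless
  // Chromium flags plus the chromium handle added in the patched fork.
  poolReady ??= exporter.initPool(
    {
      puppeteer: { args: chromium.args },
      pool: { maxWorkers: 2, logging: undefined },
    },
    chromium
  );
  await poolReady;

  // Placeholder chart configuration.
  const chartOptions = {
    title: { text: 'Sparkline from Lambda' },
    series: [{ data: [1, 3, 2, 4] }],
  };

  // Assumed export entry point; the PNG comes back base64 encoded.
  const base64 = await new Promise((resolve, reject) => {
    exporter.startExport({ type: 'png', options: chartOptions }, (info, error) =>
      error ? reject(error) : resolve(info.data)
    );
  });

  return {
    statusCode: 200,
    headers: { 'Content-Type': 'image/png' },
    body: base64,
    isBase64Encoded: true,
  };
};
```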
Wow, that's great! Thanks for sharing.
Hi!
By any chance, have you run into an error like this?
`- [browser] failed: Error: Failed to launc at Process.ChildProcess._handle.onexit (node:internal/child_process:293:12)lib/esm/launch.js:262:24) /run/dbus/system_bus_socket: No such file or directory`
I'm trying to package it into a Lambda, following your instructions and the git diff, but this one doesn't seem to go away.
Hey! I haven't looked into how it works on the latest commit because I haven't had the time to update my fork with these changes. The last commit I'm using is 2a5d8df3d1e2f0b11a7984c55501507fd70d78a6, so it's possible a later change breaks this.
Hello! I'm trying to use the highcharts-export-server package version 3.1.1 in an AWS Lambda, but I'm running into the EROFS: read-only file system, mkdir '/var/task/node_modules/highcharts-export-server/.cache' error.
Note that I don't see this error when using version 2.1.0 in an AWS Lambda. I was very excited to try upgrading, since my understanding is that the deprecated PhantomJS dependency has finally been replaced (🎉), but I currently can't upgrade because of this error.
For now, I'm not keen on maintaining my own fork of the package just to override the cache directory path.
Is there any known workaround for this error yet? Thanks!
Good evening! I'm trying to use the highcharts-export-server package version 4.0.2 in an AWS Lambda, but I'm also seeing ERROR Error loading highcharts-export-server: Error: EROFS: read-only file system, mkdir '/var/task/node_modules/highcharts-export-server/.cache'.
Are there any updates on this?