actor-templates

Consider short run optimizations

Open metalwarrior665 opened this issue 1 month ago • 14 comments

I finalized my recommendations to improve the speed of short runs in Notion.

Each optimization adds some complexity. Some may still be fine to implement directly in the templates, while the rest should go into a dedicated guide.

Some potential candidates:

  1. npm start -> node dist/main.js
  2. crawlee -> @crawlee/cheerio
  3. Actor.createProxyConfiguration({ checkAccess: false })

metalwarrior665 avatar Nov 13 '25 10:11 metalwarrior665

> Actor.createProxyConfiguration({ checkAccess: false })

Do you think this is safe? I recall you said you don't need the check for apify actors, but this will affect all users, so maybe it's fine to keep it?

B4nan avatar Nov 18 '25 12:11 B4nan

> Do you think this is safe? I recall you said you don't need the check for apify actors, but this will affect all users, so maybe it's fine to keep it?

I think it is reasonably safe but if it fails, the UX will be a bit worse. Instead of crashing at that call, you will run through the 3 retries with an uglier error.

  1. If it is your own Actor, you are not that likely to use a proxy config that doesn't work. The biggest issue is if you don't have access to the proxy itself, which is the case for free users that run outside of Apify.
  2. If it is a public Actor, you usually don't let users configure the proxy. If you do, the editor will only show usable groups, but the API doesn't check this.

A compromise is to keep the old code but add a comment above it.

metalwarrior665 avatar Nov 18 '25 14:11 metalwarrior665

> npm start -> node dist/main.js

What do you mean by this? I think that we advise people to use apify run

patrikbraborec avatar Nov 19 '25 15:11 patrikbraborec

The docker images always use npm start, which adds overhead over running the actual script with node. We don't use the CLI for production deployments (and they don't really need apify run locally either nowadays).

B4nan avatar Nov 19 '25 15:11 B4nan

> What do you mean by this? I think that we advise people to use apify run

The best option is apify run --entrypoint=path. This way the CLI still injects all env vars, but you are not bound to npm start.

metalwarrior665 avatar Nov 19 '25 15:11 metalwarrior665

> and they don't really need apify run locally either nowadays

I do not understand. So what is the way to start Actors locally?

patrikbraborec avatar Nov 20 '25 15:11 patrikbraborec

Actors are regular programs, and you run them as such. For Node.js, you run them via node (or a task runner, e.g. npm scripts, so npm start); for Python, you would use python directly (or a task runner like uv). apify run is just a wrapper over npm start in Node.js (and over the python binary in Python).

Back in the day, apify run did a lot of things, but we've come a long way: we moved a lot to the SDK or crawlee (like inferring input defaults from the input schema) and even changed default behaviour (auto-purging of default storages), so you don't really need apify run for much nowadays. I honestly don't know what you would need it for. I guess some env vars are set by it, but nothing I have personally had to use over the years; I only use the CLI to create projects from a template and push them to the platform.

B4nan avatar Nov 20 '25 15:11 B4nan

It is mostly for proxy password env var injection

metalwarrior665 avatar Nov 20 '25 16:11 metalwarrior665

You don't need apify run for that, the SDK reads APIFY_PROXY_PASSWORD and uses it automatically:

https://github.com/apify/apify-sdk-js/blob/master/packages/apify/src/configuration.ts#L159
https://github.com/apify/apify-sdk-js/blob/master/packages/apify/src/proxy_configuration.ts#L214
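
A minimal sketch of that fallback, not the SDK's actual code: an explicit password wins, otherwise APIFY_PROXY_PASSWORD is read from the environment. The helper names resolveProxyPassword and buildProxyUrl are hypothetical, and the URL shape assumes Apify Proxy's `http://<username>:<password>@proxy.apify.com:8000` connection format.

```javascript
// Hypothetical helper: explicit option first, then the env var.
function resolveProxyPassword(options = {}, env = process.env) {
    const password = options.password ?? env.APIFY_PROXY_PASSWORD;
    if (!password) {
        throw new Error('No proxy password: set APIFY_PROXY_PASSWORD or pass one explicitly.');
    }
    return password;
}

// Hypothetical helper: the username is "auto" unless proxy groups are requested.
function buildProxyUrl({ groups = [], password }) {
    const username = groups.length > 0 ? `groups-${groups.join('+')}` : 'auto';
    return `http://${username}:${password}@proxy.apify.com:8000`;
}

// A plain object stands in for process.env here so the example has no side effects.
const password = resolveProxyPassword({}, { APIFY_PROXY_PASSWORD: 'secret' });
console.log(buildProxyUrl({ groups: ['RESIDENTIAL'], password }));
```

Nothing in this path depends on how the variable got into the environment, which is why exporting it manually works just as well as having a tool set it.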

B4nan avatar Nov 20 '25 16:11 B4nan

But the SDK is not going to get it from your encrypted file system

metalwarrior665 avatar Nov 20 '25 16:11 metalwarrior665

I have APIFY_PROXY_PASSWORD in my env, and the SDK simply reads that, I've been using that for years. It's an env var, not a file, right? What am I missing?

B4nan avatar Nov 20 '25 16:11 B4nan

So you probably created that env var manually, right? But apify run automatically injects the token and proxy env vars from the .apify folder in your home directory (it is not encrypted as I thought, though). It is created on apify login.
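
A minimal sketch of that injection step, not the CLI's actual code. It assumes the auth file written by apify login parses to an object with a `token` field and a `proxy.password` field (that shape is an assumption here), and that the target variables are APIFY_TOKEN and APIFY_PROXY_PASSWORD. envFromAuth is a hypothetical helper.

```javascript
// Hypothetical helper: merge credentials from a parsed auth file into an env object.
// In this sketch, variables that are already set win, so a manually exported
// value is not overwritten.
function envFromAuth(auth, baseEnv = {}) {
    const injected = {};
    if (auth?.token) injected.APIFY_TOKEN = auth.token;
    if (auth?.proxy?.password) injected.APIFY_PROXY_PASSWORD = auth.proxy.password;
    return { ...injected, ...baseEnv };
}

// An in-memory object stands in for the parsed auth file.
const auth = { token: 'my-token', proxy: { password: 'my-proxy-password' } };
const env = envFromAuth(auth, { APIFY_PROXY_PASSWORD: 'exported-by-hand' });
console.log(env);
```

The resulting object is what a wrapper would pass as the child process environment before starting the script.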

metalwarrior665 avatar Nov 21 '25 07:11 metalwarrior665

I was asking more from the user's perspective. If a user runs apify create, etc., and so is basically using the Apify CLI, I think the convenient way is still to use apify run to run Actors locally, or not?

patrikbraborec avatar Nov 21 '25 08:11 patrikbraborec

The suggestion here is about production Dockerfile usage, not about how users do anything; it is just a perf optimization for production deployments.

B4nan avatar Nov 21 '25 08:11 B4nan

Regarding checkAccess, I ended up adding it to the templates, explicitly set to true, with a comment:

// For short runs, you might want to disable the `checkAccess` flag, which ensures the proxy credentials are valid.
const proxyConfiguration = await Actor.createProxyConfiguration({ checkAccess: true });
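
One way to make that choice per run rather than hard-coding it, as a sketch: derive the flag from an environment variable. SKIP_PROXY_CHECK and proxyOptionsFromEnv are hypothetical names, not part of the SDK or the templates; only the checkAccess option comes from the snippet above.

```javascript
// Hypothetical helper: keep the safe default (check enabled) unless explicitly opted out.
function proxyOptionsFromEnv(env = process.env) {
    return { checkAccess: env.SKIP_PROXY_CHECK !== '1' };
}

console.log(proxyOptionsFromEnv({}));                        // { checkAccess: true }
console.log(proxyOptionsFromEnv({ SKIP_PROXY_CHECK: '1' })); // { checkAccess: false }

// In an Actor, the result would be passed straight through:
// const proxyConfiguration = await Actor.createProxyConfiguration(proxyOptionsFromEnv());
```

Defaulting to true keeps the early, clearer failure for users with broken proxy access, while short-run Actors can opt out without a code change.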

B4nan avatar Dec 09 '25 13:12 B4nan