Consider short run optimizations
I finalized my recommendations to improve the speed of short runs in Notion.
Each optimization adds some complexity. Some might still be OK to implement in the templates, while for the rest we should create a dedicated guide.
Some potential candidates:
- npm start -> node dist/main.js
- crawlee -> @crawlee/cheerio
- Actor.createProxyConfiguration({ checkAccess: false })
Actor.createProxyConfiguration({ checkAccess: false })
Do you think this is safe? I recall you said you don't need the check for Apify Actors, but this will affect all users, so maybe it's fine to keep it?
I think it is reasonably safe but if it fails, the UX will be a bit worse. Instead of crashing at that call, you will run through the 3 retries with an uglier error.
- If it is your own Actor, you are not that likely to use a proxy config that doesn't work. The biggest issue is if you don't have access to the proxy itself which is the case for free users that run outside of Apify.
- If it is a public Actor, you usually don't let users configure the proxy. If you do, the editor will only show usable groups, but the API doesn't check this.
A compromise is to keep the old code but add a comment above it.
npm start -> node dist/main.js
What do you mean by this? I think that we advise people to use apify run
The docker images always use npm start, which adds overhead over running the actual script with node. We don't use the CLI for production deployments (and they don't really need apify run locally either nowadays).
The best option is to use apify run --entrypoint=path. This way the CLI still injects all env vars, but you are not bound to npm start.
and they don't really need apify run locally either nowadays
I do not understand. So how do you start Actors locally?
Actors are regular programs, so you run them as such. For Node.js, you run them via node (or a task runner, e.g. npm scripts, so npm start); for Python, you would use python directly (or a task runner like uv). apify run is just a wrapper over npm start in Node.js (and over the python binary in Python). Back in the day, apify run did a lot of things, but we've come a long way: we moved a lot into the SDK or crawlee (like inferring input defaults from the input schema) and even changed default behaviour (auto-purging of default storages), so you don't really need apify run for much nowadays. I honestly don't know what you would need it for. I guess it sets some env vars, but nothing I have personally had to use over the years. I only use the CLI to create projects from a template and push them to the platform.
It is mostly for proxy password env var injection
You don't need apify run for that, the SDK reads APIFY_PROXY_PASSWORD and uses it automatically:
https://github.com/apify/apify-sdk-js/blob/master/packages/apify/src/configuration.ts#L159 https://github.com/apify/apify-sdk-js/blob/master/packages/apify/src/proxy_configuration.ts#L214
But the SDK is not going to get it from your encrypted file system
I have APIFY_PROXY_PASSWORD in my env, and the SDK simply reads that, I've been using that for years. It's an env var, not a file, right? What am I missing?
So you probably created that env var manually, right? But apify run automatically injects the token and proxy env vars from the .apify folder in your home directory (it is not encrypted as I thought, though). This folder is created on apify login.
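To make the injection step concrete, here is a minimal sketch of the idea, not the CLI's actual code. The StoredAuth shape and the injectAuthEnv helper are hypothetical; only the APIFY_TOKEN and APIFY_PROXY_PASSWORD variable names are real. Letting values already present in the environment win over the stored ones is an assumption of this sketch.

```typescript
// Hypothetical shape of the data stored by `apify login`; the real file may differ.
interface StoredAuth {
    token?: string;
    proxy?: { password?: string };
}

type Env = Record<string, string | undefined>;

// Returns a copy of `env` with credentials filled in from the stored login data.
// Variables that are already set in `env` are left untouched (assumption of this sketch).
function injectAuthEnv(auth: StoredAuth, env: Env): Env {
    const result: Env = { ...env };
    const candidates: Env = {
        APIFY_TOKEN: auth.token,
        APIFY_PROXY_PASSWORD: auth.proxy?.password,
    };
    for (const name of Object.keys(candidates)) {
        const value = candidates[name];
        if (value !== undefined && result[name] === undefined) {
            result[name] = value;
        }
    }
    return result;
}

// Example: the token comes from the stored login, the proxy password from the shell.
const exampleEnv = injectAuthEnv(
    { token: 'token-from-login', proxy: { password: 'password-from-login' } },
    { APIFY_PROXY_PASSWORD: 'password-from-shell' },
);
```

With a manually exported APIFY_PROXY_PASSWORD, as described above, the SDK would see the same variable either way, which is why both setups work.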
I was asking more from the user's perspective. If a user runs apify create, etc. (basically, they use the Apify CLI), I think the convenient way is still to use apify run to run Actors locally, or not?
The suggestion here is about the production Dockerfile, not about how users do anything. It is just a perf optimization for production deployment.
Regarding checkAccess, I ended up adding it to the templates, explicitly set to true, with a comment:
// For short runs, you might want to disable the `checkAccess` flag, which ensures the proxy credentials are valid.
const proxyConfiguration = await Actor.createProxyConfiguration({ checkAccess: true });
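If someone wants to make this switchable instead of hardcoding it, one possible sketch follows. The shortRun input field and the proxyOptionsFor helper are hypothetical and not part of any template; only the checkAccess option comes from the discussion above.

```typescript
// Hypothetical Actor input: `shortRun` is an illustrative field, not an existing template option.
interface Input {
    shortRun?: boolean;
}

// Builds the options for Actor.createProxyConfiguration().
// The up-front credentials check is skipped only when the caller opts in to a short run.
function proxyOptionsFor(input: Input): { checkAccess: boolean } {
    return { checkAccess: !input.shortRun };
}

// In an Actor, the result would be passed on like this:
// const proxyConfiguration = await Actor.createProxyConfiguration(proxyOptionsFor(input));
const defaultOptions = proxyOptionsFor({});
const shortRunOptions = proxyOptionsFor({ shortRun: true });
```

This keeps the safe default (the check stays on) and makes the trade-off explicit: with the check off, invalid proxy credentials only surface later as failed requests.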