
docs(contributing): add setup instructions

Open SalvadorN323 opened this issue 4 months ago • 3 comments

Summary
This PR adds detailed setup and build instructions to help contributors initialize the Crawlee project locally. It documents required dependencies, Yarn installation via Corepack, and guidance on using yarn build successfully.

Key Changes
Documentation Additions:

  • Added Crawlee Project Pre-requisites section listing required Node.js and Yarn versions.
  • Included a Crawlee Installation and Building guide with Corepack instructions and yarn commands.

Important Notes:

  • These updates aim to streamline the developer onboarding and build process.

Contributors:

  1. Salvador Nunez: @SalvadorN323
  2. Alexander Manalad: @axmanalad
  3. Bao Truong: @baotruong04

SalvadorN323 avatar Jul 31 '25 21:07 SalvadorN323

This sounds like something to fix rather than document. Can you describe what exactly happened? rimraf dist should work just fine even if the folder is not present; our CI would fail otherwise.

Hi @B4nan,

Thanks for reviewing! Each of us ran into a similar build issue when rebuilding the project after any change: an error from gen-esm-wrapper saying it could not find index.js in the dist folder. For instance, if I only insert a console.log("Hello World"); line into a TS file in the core package (like enqueue_links.ts), running yarn build in the project root misses the core build cache, and the dist folder is either never built or left incomplete.

I also expected rimraf ./dist to work, since it does delete the folder in the frontend. I suspect the problem has to do with the path: perhaps yarn compile points to the deleted dist folder? It could also mean that rimraf ./dist does not fully delete the dist folder at compile time.

However, another fix I found that works, although it always takes more steps, is the following:

  1. When you get the error, change into the package directory where it occurs (e.g. packages/core).
  2. Run yarn build
  3. Change the directory back to the root project.
  4. Run yarn clean and yarn build.
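The four steps above can be sketched as a small shell function. This is only an illustration of the manual workaround described in this comment, not a fix endorsed by the maintainers; the function name rebuild_failing_package is hypothetical, and packages/core is used as the default only because it is the example given above.

```shell
# Sketch of the manual workaround; run from the repository root.
# Assumption: the failing package lives in packages/core unless a
# different directory is passed as the first argument.
rebuild_failing_package() {
    pkg_dir="${1:-packages/core}"
    # Steps 1-3: build the failing package on its own. The subshell keeps
    # the caller's working directory unchanged.
    (cd "$pkg_dir" && yarn build) || return 1
    # Step 4: back at the root, clean and rebuild everything.
    yarn clean && yarn build
}
```

Calling rebuild_failing_package with no argument from the project root mirrors the steps as written; passing another package directory covers the case where the error occurs elsewhere.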

I can open a new issue with the error log included if you are interested. When we found the rimraf fix, we did not know whether to propose it as a change to the codebase.

axmanalad avatar Aug 04 '25 21:08 axmanalad

Thanks for reviewing! Each of us ran into a similar build issue when rebuilding the project after any change: an error from gen-esm-wrapper saying it could not find index.js in the dist folder. For instance, if I only insert a console.log("Hello World"); line into a TS file in the core package (like enqueue_links.ts), running yarn build in the project root misses the core build cache, and the dist folder is either never built or left incomplete.

This feels like something weird happened on your end, and you are trying to randomly find the culprit (so which one is it, gen-esm-wrapper, tsc build, build not working at all, or being incomplete?). I kinda doubt there is an issue like this (if there is, it would have to be in one of the libraries like tsc or turbo).

I'd need to see a complete reproduction - exact steps, not "either that or that happened, or maybe that". Right now, I am not convinced we need to update the contributing guide. Your changes there could likely confuse people rather than help them.

Reading this again and again, I actually think I know what is happening to you. It sounds like the tsc build cache, which wasn't properly ignored some time ago (and we managed to include one tsbuildInfo file in git). We fixed that already via #3035; maybe you just faced that issue because you cloned the project earlier.

B4nan avatar Aug 05 '25 07:08 B4nan

I updated the documentation to not include the fix.

This feels like something weird happened on your end, and you are trying to randomly find the culprit (so which one is it, gen-esm-wrapper, tsc build, build not working at all, or being incomplete?). I kinda doubt there is an issue like this (if there is, it would have to be in one of the libraries like tsc or turbo).

Reading this again and again, I actually think I know what is happening to you. It sounds like the tsc build cache, which wasn't properly ignored some time ago (and we managed to include one tsbuildInfo file in git). We fixed that already via #3035; maybe you just faced that issue because you cloned the project earlier.

The project was cloned and tested after the fix you mentioned. Even the normal steps result in the same error. Starting fresh, the normal steps are:

  1. Run corepack enable
  2. Run yarn install
  3. Run yarn build (success)
  4. Add console.log("Hello World"); in line 520 of enqueue_links.ts in the core package.
  5. Run yarn build with the new code change (fails)
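The reproduction steps above can be written down as a script so that they are repeated identically each time. This is a sketch under assumptions: the function name repro_incremental_build is hypothetical, and the path to enqueue_links.ts is taken as an argument because the thread does not give it in full.

```shell
# Sketch of the reproduction; run from a fresh clone's root directory.
# $1: path to enqueue_links.ts inside the core package (not spelled out
#     in the thread, so it must be supplied by the caller).
repro_incremental_build() {
    src_file="$1"
    corepack enable || return 1
    yarn install || return 1
    yarn build || return 1    # step 3: reported to succeed
    # Step 4: insert the statement so that it becomes line 520 of the file.
    awk -v n=520 -v stmt='console.log("Hello World");' \
        'NR == n { print stmt } { print }' "$src_file" > "$src_file.tmp" \
        && mv "$src_file.tmp" "$src_file" || return 1
    yarn build                # step 5: reported to fail in the affected setup
}
```

The exit status of the function is that of the second yarn build, so a non-zero status after a successful first build matches the behavior described in this comment.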

You are correct, however, that it is tied to a local issue on my end. I tried the following checks:

  • Uninstalled my global versions of TypeScript and Turbo.
  • Verified that every version, including Node.js and Yarn, is correct.
  • Ran yarn clean to clear the cache.
  • Deleted node_modules and reinstalled with yarn install.
  • Deleted the generated tsconfig.build.tsbuildinfo manually (somehow this works).

Today I tried the normal rebuild steps once more. You are also correct that it has something to do with TypeScript's incremental build cache in my local environment: tsconfig.build.tsbuildinfo seems to go out of sync or become corrupted. In other words, tsconfig.build.tsbuildinfo is, oddly, not updated when I build after a code change, which forces me to delete it manually. Unfortunately, I am not sure what causes the "out of sync" problem, so I have no automatic local fix. If you would like to look at the log, feel free to check the attached document: log.txt
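The manual deletion described here could be scripted roughly as follows. This is a local workaround sketch, not a root-cause fix; the function name clear_tsbuildinfo is hypothetical, and skipping node_modules is an assumption made so that files shipped inside dependencies are left alone.

```shell
# Delete every tsconfig.build.tsbuildinfo under the given root (default:
# the current directory), skipping anything inside node_modules.
# Prints the path of each file it removes.
clear_tsbuildinfo() {
    root="${1:-.}"
    find "$root" -name node_modules -prune -o \
        -type f -name 'tsconfig.build.tsbuildinfo' -print -exec rm -f {} +
}
```

Running clear_tsbuildinfo from the project root before yarn build forces tsc to rebuild without its incremental cache, which is the effect of the manual deletion above.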

axmanalad avatar Aug 05 '25 19:08 axmanalad