Add project upload endpoint for running custom Scrapy spiders

Open Copilot opened this issue 5 months ago • 0 comments

Enables users to upload Scrapy project archives and run custom spiders through the WebUI, eliminating the need for manual spider configuration.

Changes

Backend:

  • ProjectManager class handles project extraction (zip/tar.gz), spider package discovery via scrapy.cfg parsing, and lifecycle management
  • /project/upload endpoint validates project names (alphanumeric characters plus _ and -), extracts archives, and registers spider packages
  • /project/list endpoint returns uploaded projects
  • DomainCrawlers modified to dynamically load spider packages from uploaded projects on startup and on-demand
  • Configuration adds uploaded_projects_dir option (opt-in)
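The validation, extraction, and discovery steps listed above could look roughly like the sketch below. It is not the ProjectManager code from this PR: the function names are hypothetical, the extraction check covers zip archives only, and it assumes the standard scrapy.cfg layout in which the [settings] section's default key names the project's settings module.

```python
import configparser
import re
import zipfile
from pathlib import Path
from typing import Optional

# Assumption: names are limited to ASCII letters, digits, "_" and "-".
_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if the whole name consists of allowed characters."""
    return _NAME_RE.fullmatch(name) is not None


def safe_extract_zip(archive: zipfile.ZipFile, dest: Path) -> None:
    """Extract the archive, rejecting members that would land outside dest."""
    dest = dest.resolve()
    for member in archive.namelist():
        target = (dest / member).resolve()
        if target != dest and dest not in target.parents:
            raise ValueError(f"unsafe path in archive: {member}")
    archive.extractall(dest)


def settings_module_from_cfg(cfg_text: str) -> Optional[str]:
    """Read the settings module named in the [settings] section of scrapy.cfg."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("settings", "default", fallback=None)
```

Rejecting suspicious member paths before extraction, rather than silently rewriting them, makes a malformed upload fail loudly instead of landing files in unexpected places.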

Frontend:

  • ProjectUpload React component provides collapsible upload form with validation and status feedback
  • Integrated into main page between crawl form and jobs panel

Documentation:

  • API endpoints documented in docs/http-api.rst
  • User guide added to README.rst and help page

Usage

Upload via WebUI or API:

curl -X POST \
  -F "project_name=myproject" \
  -F "[email protected]" \
  http://localhost:8888/project/upload

Run uploaded spiders by entering spider://spider_name in the crawl form. Projects persist across restarts.
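As an illustration of the spider://spider_name form, a crawl target could be told apart from an ordinary URL as in the sketch below. The helper name is hypothetical, and this is not necessarily how arachnado parses the value internally.

```python
from typing import Optional
from urllib.parse import urlsplit


def spider_name_from_target(target: str) -> Optional[str]:
    """Return the spider name for a spider:// target, or None for anything else."""
    parts = urlsplit(target)
    if parts.scheme != "spider":
        return None
    # The part after "spider://" is parsed as the network location.
    return parts.netloc or None
```

A caller would fall back to treating the value as a regular start URL whenever the helper returns None.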

Tests

Six unit tests cover extraction, discovery, listing, and error handling. CodeQL analysis reports no vulnerabilities.

Original prompt

This section details the original issue you should resolve.

<issue_title>Need to add an option to upload a Scrapy project and run the spider with arachnado</issue_title> <issue_description>Hi all: It may be difficult for users to configure spiders they have written themselves. I think that if we add an option so that they only need to upload the spider they have written and run it from the WebUI, it will become a very useful tool.</issue_description>

Comments on the Issue (you are @copilot in this section)

  • Fixes TeamHG-Memex/arachnado#13


Copilot, Nov 18 '25 01:11