Add documentation and examples for custom spider usage

Open • Copilot opened this issue 5 months ago • 0 comments

The issue "How to run custom spider?" pointed at a documentation gap: Arachnado supports custom Scrapy spiders through the spider_packages config option and the spider:// URL format, but neither was documented.

Changes

Documentation (docs/custom-spiders.rst)

  • Complete guide on creating and configuring custom spiders
  • Usage via spider://spidername URL format
  • Configuration through spider_packages setting
  • ArachnadoSpider inheritance for accessing domain, crawl_id, motor_job_id
  • Docker setup and troubleshooting
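As a rough illustration of why the domain, crawl_id and motor_job_id attributes are useful, the sketch below copies them onto a scraped item. The helper `tag_item` and the stand-in spider object are hypothetical and not part of Arachnado; only the three attribute names come from the list above.

```python
from types import SimpleNamespace

# Attribute names taken from the documentation summary above.
ARACHNADO_ATTRS = ("domain", "crawl_id", "motor_job_id")


def tag_item(spider, item):
    """Return a copy of `item` with the spider's job metadata attached.

    Hypothetical helper: missing attributes are recorded as None, so it also
    works with a plain Scrapy spider that does not define them.
    """
    tagged = dict(item)
    for attr in ARACHNADO_ATTRS:
        tagged[attr] = getattr(spider, attr, None)
    return tagged


# Stand-in for a running spider instance (assumed values, for the demo only).
fake_spider = SimpleNamespace(domain="example.com", crawl_id="abc123")
item = tag_item(fake_spider, {"url": "http://example.com/"})
```

Inside a real spider, the same call could be made from `parse` with `self` as the first argument, assuming the spider inherits from ArachnadoSpider.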

Examples (examples/custom_spiders/)

  • ExampleCustomSpider: Full-featured spider with custom arguments and data extraction
  • SimpleSpider: Minimal implementation without ArachnadoSpider inheritance
  • Setup instructions and usage guide

UI Enhancement (arachnado/templates/help.html)

  • In-app instructions with code examples
  • Configuration snippets
  • Link to full documentation

Other

  • Updated README.rst with custom spider section
  • Added examples to MANIFEST.in
  • Documentation reference in defaults.conf

Usage

Create a spider:

from arachnado.spider import ArachnadoSpider

class MySpider(ArachnadoSpider):
    name = 'mycustom'
    
    def parse(self, response):
        yield {'url': response.url}

Configure ~/.arachnado.conf:

[arachnado.scrapy]
spider_packages = myspiders.spiders
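To check that a config file has the expected shape, the fragment above can be read with Python's standard configparser. This is only a sketch: the function name `read_spider_packages` is hypothetical, and treating the value as a whitespace-separated list of packages is an assumption rather than something stated here.

```python
import configparser

# Same content as the config fragment above.
CONF_TEXT = """
[arachnado.scrapy]
spider_packages = myspiders.spiders
"""


def read_spider_packages(text):
    """Return the configured spider packages as a list of module paths."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("arachnado.scrapy", "spider_packages", fallback="")
    # Assumption: several packages would be separated by whitespace.
    return raw.split()


packages = read_spider_packages(CONF_TEXT)
```

Each listed package has to be importable from the environment in which Arachnado runs, otherwise the spider cannot be found by name.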

Trigger via UI or API:

spider://mycustom
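
The name after spider:// is expected to match the spider's name attribute. A minimal sketch of how such a URL can be split into its parts with the standard library follows; `spider_name_from_url` is a hypothetical helper and does not claim to mirror Arachnado's own lookup code.

```python
from urllib.parse import urlparse


def spider_name_from_url(url):
    """Return the spider name from a spider:// URL, or None for other URLs."""
    parts = urlparse(url)
    if parts.scheme != "spider" or not parts.netloc:
        return None
    # In spider://mycustom the name sits where a host name normally would.
    return parts.netloc


name = spider_name_from_url("spider://mycustom")
other = spider_name_from_url("http://example.com")
```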

Original prompt

This section details the original issue you should resolve

<issue_title>How to run custom spider?</issue_title>
<issue_description>How to run custom spider?</issue_description>

Comments on the Issue (you are @copilot in this section)

  • Fixes TeamHG-Memex/arachnado#15

Copilot • Nov 18 '25 01:11