arachnado
Add documentation and examples for custom spider usage
The issue "How to run custom spider?" asked about a feature that had no documentation. Arachnado already supports custom Scrapy spiders through the spider_packages config setting and the spider:// URL format, but neither was documented.
Changes
Documentation (docs/custom-spiders.rst)
- Complete guide on creating and configuring custom spiders
- Usage via `spider://spidername` URL format
- Configuration through `spider_packages` setting
- `ArachnadoSpider` inheritance for accessing `domain`, `crawl_id`, `motor_job_id`
- Docker setup and troubleshooting
Examples (examples/custom_spiders/)
- `ExampleCustomSpider`: Full-featured spider with custom arguments and data extraction
- `SimpleSpider`: Minimal implementation without `ArachnadoSpider` inheritance
- Setup instructions and usage guide
UI Enhancement (arachnado/templates/help.html)
- In-app instructions with code examples
- Configuration snippets
- Link to full documentation
Other
- Updated `README.rst` with custom spider section
- Added examples to `MANIFEST.in`
- Documentation reference in `defaults.conf`
Usage
Create a spider:
from arachnado.spider import ArachnadoSpider


class MySpider(ArachnadoSpider):
    name = 'mycustom'

    def parse(self, response):
        yield {'url': response.url}
Configure ~/.arachnado.conf:
[arachnado.scrapy]
spider_packages = myspiders.spiders
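Before starting Arachnado, it can help to confirm that the configured package is actually importable, since a wrong module path is an easy mistake to make. The following is a minimal standard-library sketch, not part of Arachnado: `read_spider_packages` and `is_importable` are hypothetical helpers, and the code assumes that `spider_packages` holds whitespace-separated dotted module paths.

```python
import configparser
import importlib.util


def read_spider_packages(config_text):
    """Return the module paths listed under spider_packages.

    Assumption: multiple packages are separated by whitespace.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw = parser.get("arachnado.scrapy", "spider_packages", fallback="")
    return raw.split()


def is_importable(module_path):
    """Check whether a dotted module path can be found on sys.path."""
    try:
        return importlib.util.find_spec(module_path) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed.
        return False


EXAMPLE_CONFIG = """
[arachnado.scrapy]
spider_packages = myspiders.spiders
"""

if __name__ == "__main__":
    for package in read_spider_packages(EXAMPLE_CONFIG):
        status = "found" if is_importable(package) else "NOT importable"
        print(f"{package}: {status}")
```

If a package is reported as not importable, the usual cause is that its parent directory is missing from `PYTHONPATH` or lacks an `__init__.py`.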
Trigger via UI or API:
spider://mycustom
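The part after `spider://` is the spider's `name` attribute (`mycustom` in the example above). As an illustration of the URL shape only, the sketch below extracts that name with the standard library; `spider_name_from_url` is a hypothetical helper, and Arachnado's own parsing may differ.

```python
from urllib.parse import urlparse


def spider_name_from_url(url):
    """Return the spider name from a spider:// URL, or None otherwise."""
    parsed = urlparse(url)
    # For spider://mycustom the name lands in the network-location part.
    if parsed.scheme != "spider" or not parsed.netloc:
        return None
    return parsed.netloc


if __name__ == "__main__":
    print(spider_name_from_url("spider://mycustom"))
```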
Original prompt
This section details on the original issue you should resolve
<issue_title>How to run custom spider?</issue_title> <issue_description>How to run custom spider? </issue_description>
Comments on the Issue (you are @copilot in this section)
- Fixes TeamHG-Memex/arachnado#15