
Advice on how to run Scrapy template without using __main__.py

Open honzajavorek opened this issue 1 year ago • 2 comments

I'd like my project to have several CLI commands, e.g. using click. It's not entirely clear to me how to properly move the __main__.py logic somewhere else, so that I can wrap running the actor/spider into one command and have other commands doing other stuff.

Vaguely related to https://github.com/apify/apify-sdk-python/issues/176

honzajavorek, Jan 16 '24 12:01

I made some changes to my implementation to make this possible:

  • Introduced a CLI where only the logging setup is hoisted before everything else: https://github.com/juniorguru/plucker/blob/6fe0c31097b00339cbc05b2ab40fc1dae23160bd/juniorguru_plucker/cli.py
  • Moved all logging setup to a separate file: https://github.com/juniorguru/plucker/blob/main/juniorguru_plucker/loggers.py
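
For readers who do not want to open the links, here is a rough sketch of the general shape: one entry point, logging configured first, and the spider run wrapped as one command among several. It is not the code from the linked files. It uses the standard library's argparse instead of click so that it runs without extra dependencies, and the names `configure_logging`, `crawl`, `check`, and `myproject` are made up for illustration. The `crawl` body is a placeholder for whatever the template's `__main__.py` does in a given project.

```python
import argparse
import logging


def configure_logging(debug: bool = False) -> None:
    """Set up logging once, before any command does its work."""
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        format="[%(name)s] %(levelname)s: %(message)s",
        force=True,  # replace handlers installed by earlier calls
    )


def crawl(spider_name: str) -> str:
    # Hypothetical placeholder: a real project would start the
    # Actor/spider here instead of only logging a message.
    logging.getLogger("cli").info("Would run spider %r", spider_name)
    return f"crawl:{spider_name}"


def check() -> str:
    # Hypothetical placeholder for a command unrelated to scraping.
    logging.getLogger("cli").info("Running checks")
    return "check"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="myproject")
    parser.add_argument("--debug", action="store_true")
    subparsers = parser.add_subparsers(dest="command", required=True)
    crawl_parser = subparsers.add_parser("crawl", help="run one spider")
    crawl_parser.add_argument("spider_name")
    subparsers.add_parser("check", help="do something else")
    return parser


def main(argv=None) -> str:
    args = build_parser().parse_args(argv)
    # Logging is configured first, then the chosen command is dispatched.
    configure_logging(args.debug)
    if args.command == "crawl":
        return crawl(args.spider_name)
    return check()
```

Calling `main(["crawl", "jobs"])` dispatches to the placeholder `crawl` command, and a console-script entry point could call `main()` with no arguments to read `sys.argv`. With click, the same layout would be a group with one command per function.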

Feel free to grab inspiration from what I did, or even chunks of code (MIT licensed, just mention my name). I guess I've solved this for myself now. If there are updates to the Scrapy template, I hope I'll be able to somehow keep up with it and backport changes to my highly customized project.

honzajavorek, Jan 17 '24 10:01

Hi Honza, thank you for opening this. Moving as much code as we can from the template to the SDK is definitely a good way to go. Unfortunately, adding new features to our Scrapy-Apify integration is not a priority for this quarter, so I cannot promise I'll have time to take a look at this in the near future.

vdusek, Jan 19 '24 07:01

I believe this issue has been resolved in apify-sdk-python#390 and actor-templates#311, where all applicable code was moved to the SDK. Feel free to reopen if you have any further suggestions.

vdusek, Mar 20 '25 10:03