shopify-app-store-scraper
Crawler behind the Shopify App Marketplace dataset
Shopify App Store scraper
About
This repository contains the code that scrapes and saves data from the Shopify App Store.
The scraper is used to build the Shopify app store dataset on Kaggle, which includes these files:
apps, apps_categories, categories, key_benefits, pricing_plan_features, pricing_plans, reviews
While the dataset published on Kaggle is regularly updated, this repository lets you keep a local copy up to date independently of the released version.
Detailed dataset description can be found here.
How to use it
Docker (recommended)
Authenticate to GitHub Container Registry (if not already authenticated)
docker login ghcr.io -u USERNAME -p TOKEN
Pull container
docker pull ghcr.io/usernam3/shopify-app-store-scraper
Run container
docker run -v `pwd`/output/:/app/output/ ghcr.io/usernam3/shopify-app-store-scraper
After the container has finished, check the output folder (in the current directory)
ls -la output/
Python
Install requirements
pip install -r requirements.txt
Run scraper
scrapy crawl app_store
After the scraper has finished, check the output folder (in the current directory)
ls -la output/
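As a quick sanity check beyond listing the folder, the sketch below counts the data rows in each output file. It assumes the scraper writes CSV files with a header row into output/; the helper name summarize_output is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def summarize_output(output_dir="output"):
    """Return {file name: number of data rows} for every CSV file in output_dir.

    Assumption: the scraper output consists of CSV files with a header row.
    """
    summary = {}
    for path in sorted(Path(output_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row (assumed to be present)
            summary[path.name] = sum(1 for _ in reader)
    return summary


if __name__ == "__main__":
    # Print one line per CSV file found in ./output
    for name, rows in summarize_output().items():
        print(f"{name}: {rows} rows")
```

An empty result means no CSV files were found, which may indicate that the crawl did not complete or that the output was written elsewhere.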
Please don't hesitate to open issues or PRs at any time if you need help with anything.