LinkedIn jobs scraper built with Node.js and Express.js, using MongoDB for storage

inkscraper

Installation | Documentation | License

inkscraper is a jobs scraper for LinkedIn that comes with a RESTful API and full-text search.

Scraping LinkedIn jobs may violate LinkedIn's Terms of Service (TOS); use this tool carefully.

inkscraper currently supports:

  • Scrape the listing page (job listings; by default this scrapes https://www.linkedin.com/jobs/view-all)
  • Scrape the details page (job details)
  • RESTful API for the jobs scraped from LinkedIn
  • Full-text search backed by MongoDB's built-in text search, accessed through Mongoose
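As a rough illustration of how MongoDB full-text search is usually driven from Mongoose, the sketch below builds the arguments for a `$text` query sorted by relevance. The helper name `buildTextSearch`, the field names `title` and `description`, and the model name `Job` are assumptions made for this example and are not taken from the inkscraper source.

```javascript
// Sketch only: buildTextSearch, the indexed fields and the Job model are
// hypothetical names, not taken from the inkscraper source.

// A text index would typically be declared on the Mongoose schema, e.g.:
//   jobSchema.index({ title: 'text', description: 'text' });

/**
 * Build the filter, projection and sort for a MongoDB full-text query.
 * An empty search term produces a filter that matches every document.
 */
function buildTextSearch(term, limit = 20) {
  const cleaned = String(term || '').trim();
  if (cleaned === '') {
    return { filter: {}, projection: {}, sort: {}, limit };
  }
  return {
    // $text searches every field covered by the collection's text index.
    filter: { $text: { $search: cleaned } },
    // Expose the relevance score and sort by it, best match first.
    projection: { score: { $meta: 'textScore' } },
    sort: { score: { $meta: 'textScore' } },
    limit,
  };
}

// Possible usage inside an Express route (assumed shape):
//   const { filter, projection, sort, limit } = buildTextSearch(req.query.q);
//   const jobs = await Job.find(filter, projection).sort(sort).limit(limit);
```

Keeping the query construction in a plain function makes it easy to check without a running database.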

Installation

  • git clone https://github.com/AdhityaRamadhanus/Linkedin-Scraper.git
  • cd Linkedin-Scraper
  • npm install
  • npm run start-apiserver
  • npm run start-scraper
  • create a .env file (the project uses dotenv; see https://www.npmjs.com/package/dotenv for documentation)
  • Example .env:
NODE_ENV=development

MONGOLAB_URI='mongodb://localhost:27017/linkedin-scraper'
APIDOC=true
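The variables above might be read in application code roughly as follows. This is a minimal sketch: the `loadConfig` helper, its return shape, and the fallback to `development` are assumptions for illustration, not the project's actual code. dotenv strips the surrounding quotes from values such as `MONGOLAB_URI`.

```javascript
// Sketch only: loadConfig and its return shape are hypothetical.
// In a real app the .env file would first be loaded with:
//   require('dotenv').config();

function loadConfig(env = process.env) {
  const mongoUri = env.MONGOLAB_URI;
  if (!mongoUri) {
    // Fail early with a clear message instead of a later connection error.
    throw new Error('MONGOLAB_URI is not set; check your .env file');
  }
  return {
    // Assumed default when NODE_ENV is missing.
    nodeEnv: env.NODE_ENV || 'development',
    mongoUri,
    // Environment variables are always strings, so compare explicitly.
    serveApiDoc: env.APIDOC === 'true',
  };
}
```

Passing the environment in as an argument keeps the function testable without touching `process.env`.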

Documentation

  • npm install -g apidoc
  • cd Linkedin-Scraper
  • npm run gen-doc
  • add APIDOC=true to .env
  • the generated documentation is then available at "/apidoc"
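apidoc generates its pages from structured comments placed above route handlers. The sketch below shows what such an annotation generally looks like; the `/jobs` route, its `q` parameter, and the `listJobs` handler are hypothetical and may not match the project's real endpoints.

```javascript
// Sketch only: the /jobs route, its parameter and this handler are
// hypothetical; the project's real endpoints may differ.

/**
 * @api {get} /jobs List scraped jobs
 * @apiName GetJobs
 * @apiGroup Jobs
 *
 * @apiParam {String} [q] Optional full-text search term.
 *
 * @apiSuccess {Object[]} jobs Matching job documents.
 */
function listJobs(req, res) {
  // A real handler would query MongoDB; this stub returns an empty list.
  res.json({ jobs: [] });
}

// With Express, the handler would be mounted along the lines of:
//   app.get('/jobs', listJobs);
```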

Known Problems

As noted above, scraping LinkedIn jobs may violate LinkedIn's TOS, so you may sometimes get a 999 status code, even when running this from your local computer.
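One common way to cope with an occasional 999 response is to retry with an increasing delay rather than hammering the site. The following is a minimal sketch of that idea, not something inkscraper is known to do: `fetchWithBackoff` and its options are hypothetical names, it assumes Node 18+ for the global `fetch`, and backing off does not guarantee that later requests will succeed.

```javascript
// Sketch only: fetchWithBackoff and its options are hypothetical and are
// not part of inkscraper. The default fetchImpl needs Node 18+.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithBackoff(url, options = {}) {
  const {
    retries = 3,          // extra attempts after the first request
    baseDelayMs = 2000,   // delay before the first retry
    fetchImpl = globalThis.fetch,
  } = options;

  for (let attempt = 0; attempt <= retries; attempt += 1) {
    const response = await fetchImpl(url);
    // 999 is the status code this README reports when a request is rejected.
    if (response.status !== 999) {
      return response;
    }
    if (attempt < retries) {
      // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw new Error(`Still blocked (status 999) after ${retries + 1} attempts: ${url}`);
}
```

The `fetchImpl` option exists only so the retry logic can be exercised with a fake in tests; in normal use it can be omitted.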

License

MIT © Adhitya Ramadhanus