
Job Crawler/Scraper/Parser

Open austinoboyle opened this issue 6 years ago • 2 comments

Scrape jobs by various filters:

  • Location
  • Company
  • Etc

First Use Case: Scrape all jobs in Kingston

Relevant URL: https://www.linkedin.com/jobs/search/?keywords=&location=Kingston%2C%20Ontario%2C%20Canada&sortBy=DD
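The search URL for this use case can be assembled from its filters instead of being hard-coded. A minimal sketch, assuming only the three query parameters visible in that URL (`keywords`, `location`, `sortBy`). The helper name `build_search_url` is hypothetical, and the parameter name for any other filter (such as company) would have to be confirmed against the site first:

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.linkedin.com/jobs/search/"


def build_search_url(location, keywords="", sort_by="DD"):
    """Build a job-search URL (hypothetical helper, not part of the library)."""
    params = {"keywords": keywords, "location": location, "sortBy": sort_by}
    # quote_via=quote encodes spaces as %20 rather than '+',
    # which matches the encoding used in the URL from this issue.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_search_url("Kingston, Ontario, Canada")
print(url)
```

Keeping the filters as keyword arguments would let the same helper serve later use cases by changing only the arguments.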

Process:

  1. Scrape Basic Info for All Jobs
  2. Based on Basic Scrape (job_id), run parallel scrape to get detailed info on all jobs
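The two-step process could be sketched roughly as follows. This is only an outline under stated assumptions: `scrape_basic` and `scrape_details` are hypothetical stubs that return made-up placeholder records, standing in for real page loading and parsing, and a thread pool is just one possible way to run the second step in parallel:

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_basic(search_url):
    """Step 1 stub: return basic info for every job on the search page."""
    # Placeholder records only; a real version would load and parse search_url.
    return [
        {"job_id": "111", "title": "Example A"},
        {"job_id": "222", "title": "Example B"},
    ]


def scrape_details(job_id):
    """Step 2 stub: return detailed info for a single job."""
    # Placeholder only; a real version would load '/jobs/view/<job_id>'.
    return {"job_id": job_id, "job_description": None}


def scrape_jobs(search_url, max_workers=4):
    basic = scrape_basic(search_url)
    job_ids = [job["job_id"] for job in basic]
    # Step 2 runs in parallel; pool.map() returns results in input order,
    # so details[i] belongs to basic[i].
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        details = list(pool.map(scrape_details, job_ids))
    # Merge the basic and detailed fields of each job into one record.
    return [{**b, **d} for b, d in zip(basic, details)]


jobs = scrape_jobs("https://www.linkedin.com/jobs/search/")
```

With a browser-driven scraper, each worker would likely need its own driver instance, so `max_workers` should stay small.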

Basic Fields

  • title
  • job_id (links are '/jobs/view/ID')
  • location
  • company_name
  • company_id (links are '/company/ID')
  • company_image_link
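Since the IDs come from link paths of the form '/jobs/view/ID' and '/company/ID', they could be pulled out of each `href` with a regular expression. A minimal sketch: the `BasicJob` container and the `extract_id` helper are hypothetical names, and the patterns assume the ID is the path segment immediately following the prefix:

```python
import re
from dataclasses import dataclass
from typing import Optional

# Capture the path segment right after the prefix, stopping at '/', '?' or '#'.
JOB_ID_RE = re.compile(r"/jobs/view/([^/?#]+)")
COMPANY_ID_RE = re.compile(r"/company/([^/?#]+)")


@dataclass
class BasicJob:
    """Container for the basic fields listed above (hypothetical)."""
    title: str
    job_id: str
    location: str
    company_name: str
    company_id: Optional[str] = None
    company_image_link: Optional[str] = None


def extract_id(pattern, href):
    """Return the ID captured by pattern, or None if href does not match."""
    match = pattern.search(href)
    return match.group(1) if match else None
```

For example, `extract_id(JOB_ID_RE, "/jobs/view/12345/?refId=abc")` returns `"12345"`, and a link without the expected prefix returns `None`.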

Detailed Info

  • job_description
  • seniority_level
  • industries
  • employment_type
  • job_functions

austinoboyle • Apr 26 '18 20:04

@austinoboyle I am working on a similar issue for my project. I have mostly figured out which classes I should parse to extract the info, but I got stuck when I tried to download the page source. This is what I ran:

`
import requests
from bs4 import BeautifulSoup

# Fetch the jobs page and keep the raw response body
r = requests.get('https://linkedin.com/jobs/')
html_content = r.content
print(html_content)
print()
# Parse the response body and print the parsed document
soup = BeautifulSoup(html_content, 'html.parser')
print(soup)
`

to which I got the output: `

`

Could you or anyone else help me get the exact page source? That would be helpful for this issue as well.

I know this is an older issue, but I did not see a reason to create a new one when a similar issue already exists. If needed, I can open a new one.

simarpreetsingh-019 • Sep 22 '20 21:09

I opened a pull request that adds Jobs and People to CompanyScraper. If possible, please test it on a temporary LinkedIn account.

anilabhadatta • Aug 24 '21 22:08