scrape-linkedin-selenium
Job Crawler/Scraper/Parser
Scrape jobs by various filters:
- Location
- Company
- Etc
First Use Case: Scrape all jobs in Kingston
Relevant URL: https://www.linkedin.com/jobs/search/?keywords=&location=Kingston%2C%20Ontario%2C%20Canada&sortBy=DD
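The query string in that URL can be rebuilt from its parts, which would make it easy to swap in other locations later. A minimal sketch, assuming the three parameters visible in the URL (`keywords`, `location`, `sortBy`) are all that is needed; `build_search_url` is a hypothetical helper name, not something from this repo:

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.linkedin.com/jobs/search/"

def build_search_url(location, keywords="", sort_by="DD"):
    """Build a job-search URL from the parameters seen in the URL above."""
    params = {"keywords": keywords, "location": location, "sortBy": sort_by}
    # quote_via=quote encodes spaces as %20 (urlencode's default would use '+').
    return BASE_URL + "?" + urlencode(params, quote_via=quote)

url = build_search_url("Kingston, Ontario, Canada")
print(url)
```

With the Kingston location this reproduces the URL listed above character for character.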
Process:
- Scrape Basic Info for All Jobs
- Based on Basic Scrape (job_id), run parallel scrape to get detailed info on all jobs
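The two-step process could be sketched roughly as follows. This is only an outline under assumptions: `scrape_basic_info` and `scrape_job_details` are hypothetical placeholders that return sample data instead of driving a browser, and a thread pool is just one possible way to run the detailed scrape in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def scrape_basic_info():
    """Step 1 (placeholder): return one dict per job from the search results."""
    # Hypothetical sample data; a real version would parse the results page.
    return [
        {"job_id": "111", "title": "Developer"},
        {"job_id": "222", "title": "Analyst"},
    ]

def scrape_job_details(job_id):
    """Step 2 (placeholder): return the detailed fields for one job."""
    # A real version would load '/jobs/view/<job_id>' and parse it.
    return {"job_id": job_id, "job_description": f"description for {job_id}"}

def scrape_all(max_workers=4):
    basic = scrape_basic_info()
    job_ids = [job["job_id"] for job in basic]
    # Threads suit this I/O-bound work; map() yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        details = list(pool.map(scrape_job_details, job_ids))
    # Merge the basic and detailed records for each job.
    return [{**b, **d} for b, d in zip(basic, details)]

jobs = scrape_all()
```

Because `pool.map` preserves input order, each detailed record lines up with the basic record it was derived from, so the merge does not need a lookup by `job_id`.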
Basic Fields
- title
- job_id (links are '/jobs/view/ID')
- location
- company_name
- company_id (links are '/company/ID')
- company_image_link
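Since `job_id` and `company_id` come from link paths, they could be pulled out of an `href` with a small regular expression. A sketch, assuming the ID is the path segment right after `/jobs/view/` or `/company/`; the sample links below are made up for illustration and `extract_id` is a hypothetical helper:

```python
import re

# The ID is assumed to run until the next '/', '?' or '#'.
JOB_ID_RE = re.compile(r"/jobs/view/([^/?#]+)")
COMPANY_ID_RE = re.compile(r"/company/([^/?#]+)")

def extract_id(pattern, href):
    """Return the ID captured by pattern, or None if href does not match."""
    match = pattern.search(href or "")
    return match.group(1) if match else None

# Hypothetical example links.
job_id = extract_id(JOB_ID_RE, "/jobs/view/12345/?refId=abc")
company_id = extract_id(COMPANY_ID_RE, "https://www.linkedin.com/company/acme?trk=x")
```

Returning `None` for a missing or non-matching link lets the caller skip cards that have no company link instead of raising.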
Detailed Info
- job_description
- seniority_level
- industries
- employment_type
- job_functions
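One way to hold both field groups is a single record that the basic scrape creates and the detailed scrape fills in later. A sketch only: the `Job` class and `add_details` method are hypothetical, and treating `industries` and `job_functions` as lists is an assumption based on their plural names.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional

@dataclass
class Job:
    # Basic fields, filled in by the first pass.
    job_id: str
    title: str
    location: str
    company_name: str
    company_id: Optional[str] = None
    company_image_link: Optional[str] = None
    # Detailed fields, filled in by the second pass.
    job_description: Optional[str] = None
    seniority_level: Optional[str] = None
    industries: List[str] = field(default_factory=list)
    employment_type: Optional[str] = None
    job_functions: List[str] = field(default_factory=list)

    def add_details(self, **details):
        """Set detailed fields, rejecting names that are not declared above."""
        for name, value in details.items():
            if name not in self.__dataclass_fields__:
                raise AttributeError(f"unknown field: {name}")
            setattr(self, name, value)

# Hypothetical sample values.
job = Job(job_id="111", title="Developer",
          location="Kingston, Ontario, Canada", company_name="Example Co")
job.add_details(seniority_level="Entry level", industries=["Software"])
record = asdict(job)
```

`asdict` turns the record into a plain dictionary, which is convenient for writing the results out as JSON or CSV.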
@austinoboyle I am working on a similar problem for my project. I have mostly found which classes I should parse to extract the info, but I got stuck when I tried to download the source code of the page. This is the code I ran:
`
import requests
from bs4 import BeautifulSoup

r = requests.get('https://linkedin.com/jobs/')
html_content = r.content
print(html_content)
print()
soup = BeautifulSoup(html_content, 'html.parser')
print(soup)
`
to which I got this output: `
`
Can you or anyone else help me with how to get the exact source code? It would be helpful for this issue as well.
I know this is an older issue, but I did not see a reason to create a new one when a similar issue already exists. If needed, I can open a new one.
I opened a pull request that adds Jobs and People to CompanyScraper. If possible, please test it on a temporary LinkedIn account.