linkedin_scraper
company.py: update class name to artdeco-card.p5.mb4
The current version of company.py fails to parse some of the information on LinkedIn company pages because it cannot find an element with the classname artdeco-card.p4.mb3.
The code that looks for artdeco-card.p4.mb3 is from PR #112. In PR #112, @Alex-Bujorianu says:
Change the assignment of grid to find the element by the necessary classname (artdeco-card.p4.mb3) instead of the section number. This fixes the issue where a lot of fields were empty/null. But this isn’t very resilient to any changes LinkedIn might make to their classnames. I am not sure if it is better to use this approach or the numbering approach.
Since then, LinkedIn company pages have replaced the classname artdeco-card.p4.mb3 with artdeco-card.p5.mb4.
In this PR, I update the classname to artdeco-card.p5.mb4. After making this change, company.py works properly again.
Here is an example showing that the LinkedIn company page has the updated classname:
I am actually new to this. But I guess we aren't using IDs as they are autogenerated?
@forresti
The class was updated one more time:
grid = driver.find_element(By.CLASS_NAME,
"artdeco-card.org-page-details-module__card-spacing.artdeco-card.org-about-module__margin-bottom")
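Since the classname has now changed twice in this thread, one possible way to soften the resilience concern raised in PR #112 is to try several known classnames in order rather than hard-coding one. The following is only a minimal sketch, not the code in company.py: `find_first` and `GRID_CLASS_CANDIDATES` are hypothetical names introduced for illustration, and it is an assumption that falling back to older classnames is useful at all.

```python
from typing import Any, Callable, Iterable, Type

# Classname combinations mentioned in this thread, newest first.
# Hypothetical constant; company.py does not define it.
GRID_CLASS_CANDIDATES = (
    "artdeco-card.org-page-details-module__card-spacing.artdeco-card.org-about-module__margin-bottom",
    "artdeco-card.p5.mb4",
    "artdeco-card.p4.mb3",
)


def find_first(
    find: Callable[[str], Any],
    candidates: Iterable[str],
    not_found: Type[BaseException] = Exception,
) -> Any:
    """Return find(name) for the first candidate that does not raise not_found."""
    names = tuple(candidates)
    last_error = None
    for name in names:
        try:
            return find(name)
        except not_found as err:
            # Remember the failure and move on to the next candidate.
            last_error = err
    raise LookupError(f"no candidate classname matched: {names}") from last_error
```

With Selenium, this could be called as `find_first(lambda name: driver.find_element(By.CLASS_NAME, name), GRID_CLASS_CANDIDATES, NoSuchElementException)`, where `NoSuchElementException` is imported from `selenium.common.exceptions`. The list still has to be updated by hand whenever LinkedIn renames the classes again, so this narrows the problem rather than solving it.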