Description_duplicate

Open ebouse13 opened this issue 4 years ago • 6 comments

Hi, this is great. I'm trying to convert the results to a dataframe and I am getting a strange bug.

The date, link, and title of each job all update fine, but the description is duplicated in the first and second rows. Judging by the len(description) output, it seems to happen while the queries are run. Any ideas as to why this is happening?
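For anyone converting results to a dataframe, here is a minimal sketch of one way to keep the columns aligned. It assumes each job arrives as a single set of fields; `collect_job` and `records` are hypothetical names for illustration, not part of the linkedin-jobs-scraper API. Storing one dict per job, rather than appending to a separate list per column, means a description cannot drift away from its company or title.

```python
from typing import Dict, List

# One entry per job; every field of a job lives in the same dict.
records: List[Dict[str, str]] = []


def collect_job(title: str, company: str, date: str, link: str, description: str) -> None:
    """Hypothetical helper: store all fields of one job as a single record."""
    records.append(
        {
            "title": title,
            "company": company,
            "date": date,
            "link": link,
            "description": description,
            # Keeping the length next to the text makes later checks easy.
            "description_length": str(len(description)),
        }
    )


# Illustrative data only, not real scraper output.
collect_job("Data Analyst", "Acme", "2021-02-01", "https://example.com/1", "Acme is hiring.")
collect_job("Engineer", "Globex", "2021-02-02", "https://example.com/2", "Globex seeks an engineer.")
```

A list of dicts like `records` can then be handed to a dataframe constructor in one step, so each row is built from exactly one job.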

ebouse13 avatar Feb 04 '21 22:02 ebouse13

Hi there! What do you mean by the description duplicates in the first and second rows? Can you provide an example?

spinlud avatar Feb 05 '21 19:02 spinlud

file1.xlsx

Hi, can you see the attached? Basically all the data is perfect until you get to the description in the second row, which is the same as the first. There is then a knock-on effect on the other descriptions: each one matches the company/job from the previous row.

Let me know if you need further clarification.

Thank you :)

ebouse13 avatar Feb 05 '21 20:02 ebouse13

There are jobs on LinkedIn that are posted several times with the same description. Have you checked whether that could be the case?

spinlud avatar Feb 12 '21 17:02 spinlud

Hi, yes, I considered that too. However, you can see that some of the descriptions contain the company name, and the description and company name are out of sync by one row for every row after the duplication happens. I'm not sure why, given that the data in every other column (company name, date posted, location) is correct and only the description has this issue. I must look into the code more to see where the loop happens.
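The check described above can be sketched roughly as follows. This is only an illustration under assumptions: rows are plain dicts with `company` and `description` keys, and the function names and sample data are made up for the example rather than taken from the attached file. The first function finds where a description repeats the previous row's; the second flags rows whose description mentions the previous row's company but not their own, which is the off-by-one pattern.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]


def first_repeated_description(rows: List[Row]) -> Optional[int]:
    """Return the index of the first row whose description equals the previous row's."""
    for i in range(1, len(rows)):
        if rows[i]["description"] == rows[i - 1]["description"]:
            return i
    return None


def shifted_rows(rows: List[Row]) -> List[int]:
    """Return indices of rows whose description names the previous row's company, not their own."""
    shifted = []
    for i in range(1, len(rows)):
        description = rows[i]["description"].lower()
        own_company = rows[i]["company"].lower()
        previous_company = rows[i - 1]["company"].lower()
        if previous_company in description and own_company not in description:
            shifted.append(i)
    return shifted


# Illustrative rows only: descriptions lag one row behind from index 1 onward.
rows = [
    {"company": "Acme", "description": "Acme is hiring a data analyst."},
    {"company": "Globex", "description": "Acme is hiring a data analyst."},
    {"company": "Initech", "description": "Globex seeks a backend engineer."},
]

repeat_index = first_repeated_description(rows)
shifted = shifted_rows(rows)
```

If the shifted indices start at the row where the repeat occurs, that points to descriptions lagging one job behind rather than to genuinely reposted jobs.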

ebouse13 avatar Feb 12 '21 18:02 ebouse13

Does it happen for any query or only for a particular one? Can you share the code of just the query you are doing?

spinlud avatar Feb 12 '21 22:02 spinlud

LinkedInQy.zip

I'm actually not sure - the attached is what I have been running.

ebouse13 avatar Feb 17 '21 07:02 ebouse13