linkedin-api

get_conversation & get_conversations

Open Tabish-Invo opened this issue 2 years ago • 8 comments

get_conversations and get_conversation both return only the first 20 results.

Tabish-Invo avatar Apr 24 '22 16:04 Tabish-Invo

Hi. Could you help me with how you are using these methods to extract those 20 results? I would appreciate it.

jfmiraucn avatar May 06 '22 11:05 jfmiraucn

I have the same problem. Are you using version 1.0.0 or 2.0.0?

manuelrech avatar Dec 03 '22 12:12 manuelrech

@manuelrech I have tried this with both 1.0.0 and 2.0.0 and I can't figure it out. I also added

linkedin.py, line 928
res = self._fetch(f"/messaging/conversations?start=100", params=params)

which then returns this paging in the response:

'paging': {'count': 20, 'start': 100, 'links': []}}

Unfortunately, it still contains the first 20 elements even though it claims to start at 100...

kenuxi avatar Dec 27 '22 10:12 kenuxi

I know, through the API I also only managed to get the last 20. However, I thought about using Selenium to collect the conversation URNs by interacting with the LinkedIn web page. I use a piece of JavaScript to scroll down the sidebar, and so that the process does not run forever, I use the month as a stopping criterion.

from selenium import webdriver
from linkedin_api import Linkedin
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.support import expected_conditions as EC


def linkedin_login(username, password):
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    ##### LOGIN SESSION #####
    driver.get("https://www.linkedin.com")
    username_space = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'session_key')))
    password_space = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, 'session_password')))
    username_space.send_keys(username)
    password_space.send_keys(password)
    accedi_button = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CLASS_NAME, 'sign-in-form__submit-button')))
    accedi_button.click()

    return driver

def getting_conversation_urns(driver, stopping_month = 'Nov'):
    # To find which values stopping_month can take, open the message list and check the date label shown on each thread

    driver.get('https://www.linkedin.com/messaging/?')
    ##### RETRIEVING CONVERSATION URNS #####
    conversation_urns = []
    time_not_too_far = True
    while time_not_too_far:
        conversations = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, '/html/body/div[5]/div[3]/div[2]/div/div/main/div/section[1]/div[2]/ul/li/div/a')))
        for conversation in conversations:
            conversation_urn = conversation.get_attribute("href").split('/')[-2]
            if conversation_urn not in conversation_urns:
                conversation_urns.append(conversation_urn)
            
            time = WebDriverWait(conversation, 10).until(EC.visibility_of_element_located((By.XPATH, 'div[2]/div/div[1]/time'))).text
            if stopping_month in time: ##### conditioning on time 
                time_not_too_far = False
        driver.execute_script("return arguments[0].scrollIntoView();", conversations[-1])
    print('stopped when date was: ' + time)       

    return driver, conversation_urns  


# YOUR_EMAIL and YOUR_PASSWORD are placeholders for your own credentials
driver = linkedin_login(YOUR_EMAIL, YOUR_PASSWORD)
driver, conversation_urns = getting_conversation_urns(driver, 'Nov')

Now you can use the get_conversation(conversation_urn) method to get the conversation.
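As a rough sketch of that last step, the collected URNs could be passed to get_conversation one at a time. The helper name fetch_conversations and the delay between requests are assumptions for illustration, not part of linkedin-api; api is expected to be a logged-in Linkedin instance.

```python
import time


def fetch_conversations(api, conversation_urns, delay_seconds=1.0):
    """Return a dict mapping each URN to the result of api.get_conversation(urn).

    `api` is assumed to be a linkedin_api.Linkedin instance (or any object
    exposing a get_conversation(urn) method).
    """
    results = {}
    for urn in conversation_urns:
        results[urn] = api.get_conversation(urn)
        time.sleep(delay_seconds)  # short pause between requests
    return results
```

With the code above this would be called as fetch_conversations(Linkedin(YOUR_EMAIL, YOUR_PASSWORD), conversation_urns).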

For me this trick works, but you need to know a little Selenium. I hope this helps!

manuelrech avatar Dec 28 '22 19:12 manuelrech

In the end I found a solution that works for me in this issue: https://github.com/tomquirk/linkedin-api/issues/46

You can pass a Unix timestamp as the parameter created_before into the function. I was actually wondering whether to create a PR to include this. I am guessing that the same logic applies to a couple of other endpoints, too.
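For illustration, such a timestamp could be built from a date as below. This assumes the endpoint expects epoch milliseconds, which is how the createdAt values in the responses appear to be expressed; to_epoch_ms is a hypothetical helper, not part of linkedin-api.

```python
from datetime import datetime, timezone


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp() * 1000)


# Example: fetch only conversations from before 1 December 2022 (UTC)
cutoff = to_epoch_ms(datetime(2022, 12, 1, tzinfo=timezone.utc))
```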

kenuxi avatar Dec 29 '22 13:12 kenuxi

I would really appreciate that. Alternatively, could you post how you modified the function definition to include the parameter?

manuelrech avatar Dec 31 '22 11:12 manuelrech

I will have time to make a PR tomorrow. Did you manage to implement it locally?

kenuxi avatar Jan 12 '23 10:01 kenuxi

Yes, I did it as in issue #46: getting batches of 20 and starting the next one from createdBefore = conversations['elements'][19]['events'][0]['createdAt'] of the previous batch, as @AchatY suggested in the issue. Thank you.
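A minimal sketch of that batching loop, under some assumptions: fetch_page is a hypothetical callable that you supply (for example a locally patched get_conversations that forwards createdBefore to the request), and each response has the 'elements' / 'events' / 'createdAt' layout quoted above. The function name iter_all_conversations is also made up for this example.

```python
def iter_all_conversations(fetch_page, batch_size=20, max_batches=1000):
    """Yield conversations batch by batch, using createdAt as a cursor.

    fetch_page(created_before) must return a dict with an 'elements' list;
    created_before is None for the first batch.
    """
    created_before = None
    for _ in range(max_batches):
        elements = fetch_page(created_before).get("elements", [])
        if not elements:
            break
        yield from elements
        if len(elements) < batch_size:
            break  # a short batch means there is nothing older left
        # The last element of a full batch is the oldest one in that batch
        next_cursor = elements[-1]["events"][0]["createdAt"]
        if next_cursor == created_before:
            break  # cursor did not move, stop to avoid looping forever
        created_before = next_cursor
```

Using elements[-1] instead of elements[19] gives the same cursor for a full batch of 20 and avoids an IndexError if the batch size ever differs.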

manuelrech avatar Jan 12 '23 16:01 manuelrech