get_recent_posts() raises MissingCookieWarning but we can't pass a valid cookie

Open marco97pa opened this issue 3 years ago • 9 comments

Describe the bug

The get_recent_posts() method raises MissingCookiesWarning, but it offers no way to pass a cookie header that would avoid the warning.

To Reproduce

from instascrape import *

instagram_sessionid = "xxx"
headers = {"user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.57",
"cookie": f"sessionid={instagram_sessionid};"}
profile = Profile('https://www.instagram.com/google/')
profile.scrape(headers=headers)
print(profile.posts)
recents = profile.get_recent_posts() #We should pass a cookie here

The code runs, but it emits this warning: MissingCookiesWarning: Request header does not contain cookies! It's recommended you pass at least a valid sessionid otherwise Instagram will likely redirect you to their login page.

If I try to pass the same headers (including the cookie) to get_recent_posts():

from instascrape import *

instagram_sessionid = "xxx"
headers = {"user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.57",
"cookie": f"sessionid={instagram_sessionid};"}
profile = Profile('https://www.instagram.com/google/')
profile.scrape(headers=headers)
print(profile.posts)
recents = profile.get_recent_posts(headers=headers)  # This time I try to pass a cookie header

I get a TypeError: get_recent_posts() got an unexpected keyword argument 'headers'

Expected behavior

We should be able to pass a valid cookie to avoid the warning, or the warning should not be triggered at all.
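
Until the method accepts headers, one stopgap is to silence the warning at the call site with the standard-library `warnings` module. This is a minimal sketch, assuming the warning text starts with the message quoted above. `noisy_call` is a hypothetical stand-in for `profile.get_recent_posts()` so the snippet runs without network access, and `call_quietly` is a hypothetical helper. It only hides the warning; it does not make the underlying request authenticated.

```python
import warnings

# Message prefix taken from the warning text quoted in this issue.
COOKIE_WARNING_PREFIX = "Request header does not contain cookies"


def noisy_call():
    """Hypothetical stand-in for profile.get_recent_posts()."""
    warnings.warn(
        "Request header does not contain cookies! It's recommended you "
        "pass at least a valid sessionid",
        UserWarning,
    )
    return ["post-1", "post-2"]


def call_quietly(func):
    """Run func() while ignoring only the missing-cookie warning."""
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings("ignore", message=COOKIE_WARNING_PREFIX)
        return func()


recents = call_quietly(noisy_call)
print(recents)
```

Filtering by message rather than by class keeps the sketch independent of where the library defines its warning type; other warnings still surface as usual.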

marco97pa avatar Feb 28 '21 10:02 marco97pa

Have the same issue!

vordemann avatar Mar 04 '21 09:03 vordemann

I fixed it by passing cookies to Selenium before going to the profile. I export the cookies from Instagram with the Chrome extension Cookie-Editor, then paste them into cookies.json.

import json

from instascrape import Profile

# Assumes `driver` (a Selenium WebDriver), `handle`, and `headers` are already defined
url = f"https://www.instagram.com/{handle}/"

driver.get(url)  # Visit the site first so cookies can be set for its domain
# Fake a login with the exported cookies
with open("./cookies.json", "r") as data:  # Open cookies.json
    cookies = json.load(data)
    for cookie in cookies:  # Add cookies to driver
        # Selenium breaks with sameSite; the None default avoids a KeyError if the key is absent
        cookie.pop("sameSite", None)
        driver.add_cookie(cookie)  # Add our authorized cookies

ig_profile = Profile(url)  # Set IG profile
ig_profile.url = url
ig_profile.scrape(headers=headers)  # Scrape IG profile
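
The `sameSite` step can also be made selective, dropping the key only when its value would be rejected. This is a minimal sketch, assuming cookies.json holds a list of cookie objects as a browser extension would export them, and assuming Selenium's Python `add_cookie()` accepts only "Strict", "Lax", or "None" for `sameSite`. `sanitize_cookies` is a hypothetical helper, not part of Selenium or instascrape, and it is shown on an inline sample rather than a real file or driver.

```python
import json

# sameSite values assumed to be accepted by Selenium's Python add_cookie().
SELENIUM_SAMESITE = {"Strict", "Lax", "None"}


def sanitize_cookies(cookies):
    """Return copies of the cookie dicts that are safe to pass to add_cookie()."""
    cleaned = []
    for cookie in cookies:
        cookie = dict(cookie)  # copy so the caller's data is untouched
        # Drop sameSite only when its value is not in the accepted set;
        # the None default avoids a KeyError when the key is absent.
        if cookie.get("sameSite") not in SELENIUM_SAMESITE:
            cookie.pop("sameSite", None)
        cleaned.append(cookie)
    return cleaned


# Inline sample standing in for the contents of cookies.json.
exported = json.loads(
    '[{"name": "sessionid", "value": "xxx", "sameSite": "no_restriction"},'
    ' {"name": "csrftoken", "value": "yyy"}]'
)
for cookie in sanitize_cookies(exported):
    print(cookie)  # with a live driver: driver.add_cookie(cookie)
```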

Xerrion avatar Mar 09 '21 16:03 Xerrion

Any way around it so far without selenium?

asauce0972 avatar Mar 12 '21 15:03 asauce0972

I get the same error and posted about it at https://github.com/chris-greening/instascrape/issues/89#issuecomment-801495835

yeamusic21 avatar Mar 17 '21 23:03 yeamusic21

@Xerrion

I spent a lot of time trying this. Not sure what cookie.pop("sameSite") is doing since I don't see any sameSite keys if I call print(driver.get_cookies()), so I skipped all that and just ran driver.add_cookie({'name':'sessionid','value':os.environ['INSTAGRAM_SESSIONID']}) which just resulted in the same MissingCookiesWarning. :-(

UPDATE:

So I'm trying this again. I understand your comment now for sameSite. I'm still getting the MissingCookiesWarning though. If you're updating the driver, but not passing it to the scrape method, how is updating the driver impacting instascrape if you don't pass it to instascrape???
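
On the question of how driver cookies could reach instascrape: the driver's cookie jar and the `headers` dict are separate objects, so one option is to copy the driver's cookies into a `cookie` header and pass that to `scrape(headers=...)` as in the original report. This is a minimal sketch; `cookie_header` is a hypothetical helper, and the sample list only imitates the shape of `driver.get_cookies()` output. It does not by itself cover `get_recent_posts()`, which per the report takes no `headers` argument.

```python
def cookie_header(cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Sample in the shape returned by driver.get_cookies(); values are placeholders.
driver_cookies = [
    {"name": "sessionid", "value": "xxx", "domain": ".instagram.com"},
    {"name": "csrftoken", "value": "yyy", "domain": ".instagram.com"},
]

headers = {
    "user-agent": "Mozilla/5.0",  # placeholder; use a full browser string
    "cookie": cookie_header(driver_cookies),
}
print(headers["cookie"])  # sessionid=xxx; csrftoken=yyy
# With a live session: profile.scrape(headers=headers)
```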

yeamusic21 avatar Mar 19 '21 20:03 yeamusic21

I've been combing through the code. Looks like you have to pass your driver to the scrape method as well. I mention it here https://github.com/chris-greening/instascrape/issues/89#issuecomment-805394041 but I'm still getting the same error even with the driver passed to scrape, which is very weird if you read the code.

yeamusic21 avatar Mar 24 '21 01:03 yeamusic21

Just noting that this issue has made it pretty much impossible for me to use instascrape for my use case. Due to this issue and https://github.com/chris-greening/instascrape/issues/89, at this point I've abandoned instascrape.

yeamusic21 avatar Mar 30 '21 15:03 yeamusic21

get_recent_posts() always returns 24 posts no matter what amount I ask for. Can I bypass that, for example to get all the posts?

nullsaint avatar May 01 '21 18:05 nullsaint

I tried the same thing as he did (adding the cookie manually) but still I'm getting the warning. Like what am I doing wrong? Here's the code I am using:

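import time

from selenium import webdriver

SESSION_ID = 'my session id'
url = "https://www.instagram.com/discordbot98/"
driver = webdriver.Chrome()  # Assumed browser; the original snippet named its driver instance `webdriver`
driver.get(url)
time.sleep(10)
driver.add_cookie({'name': 'sessionid', 'value': SESSION_ID})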

yugkha3 avatar Jan 23 '22 11:01 yugkha3