Can only scrape 12 or so posts (first page?) before JSON Query to graphql/query 401 error

Open billbeans opened this issue 1 year ago • 69 comments

Describe the bug
This tool stopped working completely for me a couple of weeks ago, but since the last update it kind of works again, only it fails consistently after 12 or so posts. I even signed in and it still failed with the API check error.

To Reproduce
Steps to reproduce the behavior: instaloader [any profile]

Expected behavior
To be able to scrape profiles.

Error messages and tracebacks
JSON Query to graphql/query: 401 Unauthorized - "fail" status, message "Please wait a few minutes before you try again." when accessing

When waiting a few minutes and trying again, it always fails.

Instaloader version
4.12

Additional context
I use a VPN, which I know can mess things up, but before I was eventually able to scrape public profiles if I switched IPs enough. gallery-dl isn't working for me either, so I think there may have been another API change. I only have "datacenter" proxies to test on, not "residential" ones, so they may only be blocking datacenter IPs now, but probably not if it lets me get 12 posts in? I'm also using Linux, which, judging from previous bug reports, seems to have less functionality.

billbeans avatar Jul 12 '24 07:07 billbeans

I've had the same issue since two days ago. About four days ago I could scrape over 1k posts successfully, and now it's just 12, no matter whether I change locations with my VPN or use my regular IP address. I've waited over 6, 12 and 24 hours between attempts, and the result is still the same error as above.

spindelbotten avatar Jul 21 '24 14:07 spindelbotten

same here, i scraped a lot ... and can't find a solution

Strykix avatar Jul 22 '24 11:07 Strykix

same here! even went down the dedicated IP route. same problem.

J3-strip avatar Jul 24 '24 17:07 J3-strip

same problem

Tg00174 avatar Jul 25 '24 07:07 Tg00174

same problem

markshi9008 avatar Jul 25 '24 16:07 markshi9008

same problem

edisx avatar Jul 26 '24 04:07 edisx

same problem

AyoDev avatar Jul 27 '24 23:07 AyoDev

I am also running into the same problem. Seems like Instagram is going crazy with their rate limiting if they detect any automated behavior.

gueruex avatar Jul 28 '24 22:07 gueruex

on my side pictures are downloading, but at some point it stops with this:

File "C:\Python312\Lib\site-packages\instaloader\instaloadercontext.py", line 459, in get_json
    raise ConnectionException(error_string) from err
instaloader.exceptions.ConnectionException: JSON Query to graphql/query: 401 Unauthorized - "fail" status, message "Please wait a few minutes before you try again." when accessing https://www.instagram.com/graphql/query?query_hash=......

Strykix avatar Jul 29 '24 11:07 Strykix

i'm not sure the best way to get it added to the CLI, but i was able to get the code working again. i just manually added a HUGE sleep to the download. it appears instagram has a huge, huge rate limiter built in. when i just set my waittime to be a random number, around 20 minutes, i was able to get all posts. even from accounts with 2600 posts. yes it took a few days.

In instaloader/instaloadercontext.py, in class RateController, inside def wait_before_query:

    waittime = random.randint(1050, 1150)

put a little random in there, just so each call varies a little bit.

i don't think the CLI for --rate-limit works. so i just hard coded this into my RateController directly.
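For anyone who would rather not edit the installed package: instaloader's documentation describes passing a custom `RateController` subclass through the `rate_controller` argument of `Instaloader`. Below is a minimal sketch of the same idea under that assumption. The names `jittered_wait`, `SlowRateController` and `make_loader` are made up for illustration, and the 1050-1150 second range is simply the numbers from above, not a verified limit.

```python
import random


def jittered_wait(low=1050, high=1150):
    """Pick a wait in seconds; the default range mirrors the hard-coded values above."""
    return random.randint(low, high)


def make_loader():
    """Build an Instaloader that sleeps a long, jittered time before each query."""
    # Third-party import kept local so jittered_wait() can be tried on its own.
    import instaloader

    class SlowRateController(instaloader.RateController):
        def wait_before_query(self, query_type):
            # Extra sleep first, then the library's own bookkeeping and waits.
            self.sleep(jittered_wait())
            super().wait_before_query(query_type)

    return instaloader.Instaloader(
        rate_controller=lambda ctx: SlowRateController(ctx)
    )
```

Calling `make_loader()` in place of `instaloader.Instaloader()` should then apply the wait to every query without touching site-packages. As a rough sanity check, assuming one query per page of 12 posts, 2600 posts is about 217 queries, and 217 x 1100 s is roughly 66 hours, which fits "a few days".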

though i'm running into another issue, which i think is unrelated. on some accounts, instaloader stops early and does not get the last 60 posts. i think it's a different issue.

androslee avatar Jul 31 '24 03:07 androslee

same problem here.

iamdanielwejs avatar Jul 31 '24 08:07 iamdanielwejs

Same here, 12 is the limit

DonBaronFactory avatar Jul 31 '24 11:07 DonBaronFactory

same problem

IamEdHardy avatar Jul 31 '24 13:07 IamEdHardy

i think something must have changed with instagram's api/response. i was getting full accounts last week. but yes, now mine is stopping after 12.

i'm not getting any errors, but mine just stops.

i wasn't saving any logs/json responses before, but i did just now. i found out a few things:

  • it looks like the only API that was working/responding without a login, was the "iphone_json" response
  • graphql just comes up empty. or at least this tool forms a nonworking url
  • in the iphone json response, i don't think it gets all of the posts of a user, and at the bottom it says "next page: false". so it just ends.

androslee avatar Aug 01 '24 05:08 androslee

@androslee This does not seem to work for me. Is that the only thing that you changed? And does it still work for you? If so, could you give more details?

Koenvanvlijmen avatar Aug 05 '24 09:08 Koenvanvlijmen

same here, I can only iterate over and write out 12 posts from get_posts(). no error is thrown, though; it just stops writing and exits the loop.

phuongcanopylab avatar Aug 07 '24 07:08 phuongcanopylab

same here - 12 posts, then i get the error

drumstick90 avatar Aug 12 '24 10:08 drumstick90

Hello, experiencing the same issue here. I was able to scrape all posts for one user a couple of weeks ago. Instaloader CLI now just downloads the 12 latest posts and quits (no errors). Logging in does not help.

eminaatnexro avatar Aug 12 '24 12:08 eminaatnexro

having same issue

achyu-dev avatar Aug 15 '24 06:08 achyu-dev

same here: just 10 stories, or no stories at all. Instaloader downloads no stories from https://www.instagram.com/xresenthhh_____/, while gallery-dl downloads all stories without a problem, but then you lose instaloader's filename convention. I think both should use the same filename convention. Also, gallery-dl is more aggressive, I think.

estatistics avatar Aug 15 '24 11:08 estatistics

same issue, doing pretty slow requests by changing:

def do_sleep(self):
    if self.sleep:
        print("sleep was called")
        time.sleep(min(random.expovariate(1/3), 20))
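For scale: `random.expovariate(1/3)` has a mean of 3, so the sleep in that change averages about 3 seconds and never exceeds the 20-second cap. A small sketch to check this numerically (the helper name `sample_sleeps` and the seed are arbitrary choices for illustration):

```python
import random


def sample_sleeps(n=100_000, rate=1 / 3, cap=20.0, seed=0):
    """Draw n sleep durations the same way as the modified do_sleep."""
    rng = random.Random(seed)
    return [min(rng.expovariate(rate), cap) for _ in range(n)]


sleeps = sample_sleeps()
print(f"mean sleep: {sum(sleeps) / len(sleeps):.2f} s, max: {max(sleeps):.2f} s")
```

That is far shorter than the roughly 20-minute waits reported earlier in the thread, which may be why this change alone does not get past the limit.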

sgtpepperaut avatar Aug 15 '24 12:08 sgtpepperaut

It appears that 4.13 latest build fixed that problem.

estatistics avatar Aug 15 '24 18:08 estatistics

yes, sorry, edited

estatistics avatar Aug 15 '24 18:08 estatistics

Works for me as well :partying_face:

yonas avatar Aug 15 '24 18:08 yonas

I still get the same error in Google Colab

My code:


import instaloader
L = instaloader.Instaloader()
posts = instaloader.Profile.from_username(L.context, "_target_").get_posts()
for post in posts:
    L.download_post(post, "_target_")

My output:

JSON Query to graphql/query: 401 Unauthorized - "fail" status, message "Please wait a few minutes before you try again." when accessing https://www.instagram.com/graphql/query [retrying; skip with ^C]

drumstick90 avatar Aug 15 '24 19:08 drumstick90

@drumstick90 I am facing the same issue

achyu-dev avatar Aug 15 '24 19:08 achyu-dev

Works when launched from an .ipynb notebook in Visual Studio Code!

drumstick90 avatar Aug 15 '24 20:08 drumstick90

It works fine on my local machine, but not on hosting: JSON Query to graphql/query: 401 Unauthorized - "fail" status, message

JellyTyan avatar Aug 20 '24 10:08 JellyTyan

seems it's still happening with 4.13

PastaAdventures avatar Aug 28 '24 06:08 PastaAdventures

If you're talking about the 401 error (like in the first post), especially when not logged in, that's caused by Instagram. They strongly limit anonymous scraping, and you'll probably see that even with a browser you cannot see more than the first 12 posts - and sometimes nothing at all.

If that's not the case (you're logged in, or it downloads 12 and stops with no error, etc.) then please describe the problem clearly.
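A small sketch for collecting those details: it counts how many posts come back before iteration ends or fails, and records the error text if there is one. `count_until_failure` and `report` are hypothetical helper names, and the session-file login is only one option; it assumes a session was saved beforehand (for example with `instaloader --login USERNAME`).

```python
def count_until_failure(items):
    """Iterate items, returning (count, error_text_or_None)."""
    count = 0
    try:
        for _ in items:
            count += 1
    except Exception as err:  # broad on purpose: this is only a diagnostic
        return count, f"{type(err).__name__}: {err}"
    return count, None


def report(target, username=None):
    """Print what a bug report needs: login state, post count, error."""
    # Third-party import kept local so count_until_failure() stands alone.
    import instaloader

    loader = instaloader.Instaloader()
    if username:
        # Assumes a session file for this user already exists.
        loader.load_session_from_file(username)
    profile = instaloader.Profile.from_username(loader.context, target)
    count, error = count_until_failure(profile.get_posts())
    print(f"logged in: {loader.context.is_logged_in}")
    print(f"posts iterated: {count} of {profile.mediacount}")
    print(f"error: {error}")
```

Running `report("some_profile")` and then `report("some_profile", username="your_login")` would show whether the 12-post cutoff depends on being logged in, and whether it ends silently or with the 401 error.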

ekalin avatar Aug 28 '24 10:08 ekalin