instaloader
Can only scrape 12 or so posts (first page?) before JSON Query to graphql/query 401 error
Describe the bug This tool stopped working completely for me a couple weeks ago, but since the last update it kind of works again, only it fails consistently around 12 or so posts. I even signed in and it still failed with the api check error.
To Reproduce Steps to reproduce the behavior: instaloader [any profile]
Expected behavior To be able to scrape profiles.
Error messages and tracebacks
JSON Query to graphql/query: 401 Unauthorized - "fail" status, message "Please wait a few minutes before you try again." when accessing
When waiting a few minutes and trying again, it always fails
Instaloader version 4.12
Additional context I use a VPN, which I know can mess things up, but before I was eventually able to scrape public profiles if I switched IPs enough. gallery-dl isn't working for me either, so I think there may have been another API change. I don't have "residential" proxies to test on, only "datacenter" ones, so they may only block those now, but probably not if it lets me get 12 posts in? I'm also using Linux, which judging from previous bug reports seemed to have less functionality.
I've had the same issue since two days ago. About four days ago I could scrape over 1k posts successfully and now it's just 12 no matter if I change locations with my VPN or use my regular IP address. I've waited over 6, 12 and 24 hours between attempts and results are still the same error as above.
same here, i scraped a lot ... and can't find a solution
same here! even went down the dedicated IP route. same problem.
same problem
same problem
same problem
same problem
I am also running into the same problem. Seems like Instagram is going crazy with their rate limiting if they detect any automated behavior.
on my side pictures are downloading, but at some point it stops with this:
File "C:\Python312\Lib\site-packages\instaloader\instaloadercontext.py", line 459, in get_json
    raise ConnectionException(error_string) from err
instaloader.exceptions.ConnectionException: JSON Query to graphql/query: 401 Unauthorized - "fail" status, message "Please wait a few minutes before you try again." when accessing https://www.instagram.com/graphql/query?query_hash=......
i'm not sure of the best way to get it added to the CLI, but i was able to get the code working again by manually adding a HUGE sleep to the download. it appears instagram has a very aggressive rate limiter built in. when i set my wait time to a random number of seconds, a little under 20 minutes, i was able to get all posts, even from accounts with 2600 posts. yes, it took a few days.
in instaloader/instaloadercontext.py, class RateController, method wait_before_query, i set:
waittime = random.randint(1050, 1150)
i put a little randomness in there, just so each call varies a bit.
i don't think the CLI option --rate-limit works, so i just hard-coded this into my RateController directly.
though i'm running into another issue, which i think is unrelated. on some accounts, instaloader stops early and does not get the last 60 posts. i think it's a different issue.
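The hard-coded edit described above can also be expressed without modifying the installed package. A minimal sketch, assuming Instaloader's documented hook of passing a RateController subclass through the rate_controller argument of Instaloader. The names jittered_wait, build_slow_loader and SlowRateController are hypothetical, and the 1050 to 1150 second window is only the value reported in this comment, not a known safe limit.

```python
import random


def jittered_wait(low: int = 1050, high: int = 1150) -> int:
    """Return a random wait in whole seconds within [low, high]."""
    return random.randint(low, high)


def build_slow_loader():
    """Create an Instaloader whose queries are spaced by jittered_wait()."""
    # Imported here so jittered_wait() stays usable without the package.
    import instaloader

    class SlowRateController(instaloader.RateController):
        def query_waittime(self, query_type, current_time, untracked_queries=False):
            # Never wait less than the library's own estimate.
            base = super().query_waittime(query_type, current_time, untracked_queries)
            return max(base, jittered_wait())

    # Instaloader calls this factory with its context to build the controller.
    return instaloader.Instaloader(
        rate_controller=lambda ctx: SlowRateController(ctx)
    )
```

Calling build_slow_loader() returns a loader that can be used like any other Instaloader instance. Whether a delay of this size is enough is untested here; at least one later reply reports that it did not help.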
same problem here.
Same here, 12 is the limit
same problem
i think something must have changed with instagram's api/response. i was getting full accounts last week. but yes, now mine is stopping after 12.
i'm not getting any errors, but mine just stops.
i wasnt saving any logs/json responses before, but i did just now. i found out a few things:
- it looks like the only API that was working/responding without a login, was the "iphone_json" response
- graphql just comes up empty. or at least this tool forms a nonworking url
- in the iphone json response, i don't think it gets all of the posts of a user, and at the bottom it says "next page: false". so it just ends.
@androslee This does not seem to work for me. Is that the only thing that you changed? And does it still work for you? If so, could you give more details?
same here, I can only iterate over and write out 12 posts from get_posts(). no error is thrown though, it just stops writing and exits the loop.
same here - 12 posts, then I get the error
Hello, Experiencing the same issue here. Was able to scrape all posts for one user a couple of weeks ago. Instaloader CLI now just downloads the 12 latest posts and quits (no errors). Logging in does not help.
having same issue
same here: instaloader downloads just 10 stories, or no stories at all, from https://www.instagram.com/xresenthhh_____/ - gallery-dl downloads all stories without problem, but you lose instaloader's filename convention. I think both should use the same filename convention. Also, gallery-dl is more aggressive, I think.
same issue, doing pretty slow requests by changing:
def do_sleep(self):
    if self.sleep:
        print("sleep was called")
        time.sleep(min(random.expovariate(1/3), 20))
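For reference, the changed line draws an exponentially distributed delay and caps it. A standalone sketch of the same calculation, assuming nothing beyond the Python standard library; the names capped_exp_delay and the enabled flag are hypothetical stand-ins for the method and self.sleep in the comment above.

```python
import random
import time


def capped_exp_delay(mean_secs: float = 3.0, cap_secs: float = 20.0) -> float:
    """Draw an exponentially distributed delay with the given mean, capped at cap_secs."""
    # random.expovariate takes the rate (1 / mean), so 1/3 gives a 3 second mean.
    return min(random.expovariate(1.0 / mean_secs), cap_secs)


def do_sleep(enabled: bool = True) -> float:
    """Sleep for one capped delay and return how long was slept."""
    delay = capped_exp_delay() if enabled else 0.0
    time.sleep(delay)
    return delay
```

With these values the typical pause is about 3 seconds and never more than 20, which is several orders of magnitude shorter than the roughly 20 minute waits reported earlier in the thread.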
It appears that the latest build, 4.13, fixed that problem.
yes, sorry, edited
Works for me as well :partying_face:
I still get the same error in Google Colab
My code:
import instaloader

L = instaloader.Instaloader()
posts = instaloader.Profile.from_username(L.context, "_target_").get_posts()
for post in posts:
    L.download_post(post, "_target_")
My output:
JSON Query to graphql/query: 401 Unauthorized - "fail" status, message "Please wait a few minutes before you try again." when accessing https://www.instagram.com/graphql/query [retrying; skip with ^C]
@drumstick90 I am facing the same issue
Works when launched from an .ipynb notebook in Visual Studio Code!
It works fine on my local machine, but not on a hosting server.
JSON Query to graphql/query: 401 Unauthorized - "fail" status, message
seems it's still happening with 4.13
If you're talking about the 401 error (like in the first post), especially when not logged in, that's caused by Instagram. They strongly limit anonymous scraping, and you'll probably see that even with a browser you cannot see more than the first 12 posts - and sometimes nothing at all.
If that's not the case (you're logged in, or it downloads 12 and stops with no error, etc) then please describe the problem clearly.
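To make such a report concrete, it helps to state how many posts were fetched and what error, if any, ended the run. A rough sketch, assuming a session file was saved beforehand with instaloader --login; the function names download_until_blocked and count_downloads_logged_in and their parameters are hypothetical, and logging in is not guaranteed to lift the limit.

```python
def download_until_blocked(posts, download, blocked_exc=Exception):
    """Download posts one by one; return how many succeeded before an error."""
    done = 0
    try:
        # The iterator itself may raise, since fetching the next page is a query.
        for post in posts:
            download(post)
            done += 1
    except blocked_exc as err:
        print(f"Stopped after {done} posts: {err}")
    return done


def count_downloads_logged_in(login_user: str, target: str) -> int:
    """Run the loop against a real profile using a previously saved session."""
    # Imported here so download_until_blocked() works without the package.
    import instaloader

    L = instaloader.Instaloader()
    L.load_session_from_file(login_user)  # session saved by a prior CLI login
    profile = instaloader.Profile.from_username(L.context, target)
    return download_until_blocked(
        profile.get_posts(),
        lambda post: L.download_post(post, target),
        instaloader.exceptions.ConnectionException,
    )
```

If the count printed is 12 and no exception message appears, that is the "stops with no error" case; if the 401 message appears, it is the rate-limit case from the first post.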