ragamints
Bypass Instagram API Restrictions
While it appears the Instagram API is now useless, I would still like to recover all my metadata (captions, tags, GPS location) and keep a backup of my photos.
It looks like it could be possible to bypass the API restrictions by... not using the API at all. All the info ragamints was retrieving through the API is buried in the user-facing Instagram web app itself, so this tool would have to be rewritten as a web scraper.
The GPS coordinates can be found in three steps: scrape the LOCATION_ID embedded in the link attached to the location name on the photo page, load the corresponding https://www.instagram.com/explore/locations/LOCATION_ID/ page, and scrape the latitude and longitude from:
<meta property="place:location:latitude" content="42.6598" />
<meta property="place:location:longitude" content="-73.7813" />
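The two scraping steps above could be sketched as follows. This is only an illustration in Python using the standard library, not what ragamints does: the function names `extract_location_id` and `parse_coordinates` are hypothetical, and the exact shape of the location link is an assumption based on the URL pattern quoted above.

```python
import re
from html.parser import HTMLParser

# Assumption: the location link contains /explore/locations/LOCATION_ID/
LOCATION_HREF = re.compile(r"/explore/locations/([^/?#]+)")


def extract_location_id(href):
    """Return the LOCATION_ID found in a location link, or None."""
    match = LOCATION_HREF.search(href)
    return match.group(1) if match else None


class PlaceMetaParser(HTMLParser):
    """Collect the place:location:* meta tags of a location page."""

    WANTED = ("place:location:latitude", "place:location:longitude")

    def __init__(self):
        super().__init__()
        self.coords = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop in self.WANTED and content is not None:
            # Key by the last segment: "latitude" or "longitude"
            self.coords[prop.rsplit(":", 1)[1]] = float(content)


def parse_coordinates(html):
    """Return (latitude, longitude) from a location page, or None."""
    parser = PlaceMetaParser()
    parser.feed(html)
    if "latitude" in parser.coords and "longitude" in parser.coords:
        return parser.coords["latitude"], parser.coords["longitude"]
    return None
```

Fetching the pages is deliberately left out, since the photo page may need a real browser to render before the location link is available.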
Obviously one would have to be careful about it, as it is likely against the Instagram TOS. The scraper would also have to run more slowly than the API client to remain "undetected". On the other hand, it would no longer require any authentication token.
It seems Instagram scrapers are popping up on GitHub and could be leveraged. Otherwise check:
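One way to keep the request rate low is to enforce a minimum, slightly randomized delay between page loads. The sketch below is a hypothetical helper, not part of ragamints; the `Throttle` name and the default delay values are arbitrary assumptions, and nothing here says which rate Instagram would actually tolerate.

```python
import random
import time


class Throttle:
    """Enforce a minimum delay (plus random jitter) between calls to wait()."""

    def __init__(self, min_delay=2.0, jitter=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay  # seconds, arbitrary default
        self.jitter = jitter        # extra random seconds, arbitrary default
        self._clock = clock         # injectable for testing
        self._sleep = sleep
        self._last = None           # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        if self._last is not None:
            target = self.min_delay + random.uniform(0, self.jitter)
            remaining = target - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

Calling `throttle.wait()` before each page load would space requests at least `min_delay` seconds apart; the first call returns immediately.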
- Web Scraping Search on egghead.io
- ~~lapwinglabs/x-ray: The next web scraper~~ (no trivial Ajax support)
- ~~rchipka/node-osmosis: Web scraper for NodeJS~~ (no obvious Ajax support, lack of doc)
- ~~casperjs/casperjs: Navigation scripting and testing utility for PhantomJS and SlimerJS~~
- segmentio/nightmare: A high-level browser automation library, also examples, rosshinkley/nightmare-load-filter, kyungw00k/nightmare-webrequest-addon, phulas/x-ray-nightmare
UPDATE: segmentio/nightmare is doing the trick.
The tentative new interface could be:
Download a single pic:
ragamints download https://www.instagram.com/p/BJEFDFDhkzy
Download multiple pics:
ragamints download https://www.instagram.com/p/BJEFDFDhkzy https://www.instagram.com/p/BIqp19VBOK_
Download an interval of pics from the same user (e.g. all pics between two pics, inclusive):
ragamints download https://www.instagram.com/p/BJEFDFDhkz...https://www.instagram.com/p/BIqp19VBOK_
Download 10 most recent pics from user:
ragamints download https://www.instagram.com/sebastienbarre
ragamints download sebastienbarre
Download 3 most recent pics from user:
ragamints download https://www.instagram.com/sebastienbarre:3
ragamints download sebastienbarre:3
All of the above can be combined together.
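To make the proposed argument forms concrete, here is a minimal sketch of how each `download` argument might be classified. It is written in Python for illustration only; `parse_target`, the dictionary keys, and the character sets allowed in shortcodes and user names are assumptions. The default of 10 pics comes from the examples above.

```python
import re

# Assumed shapes, inferred from the example commands above
POST_URL = re.compile(r"^https?://www\.instagram\.com/p/([A-Za-z0-9_-]+)/?$")
USER = re.compile(
    r"^(?:https?://www\.instagram\.com/)?([A-Za-z0-9._]+)/?(?::(\d+))?$"
)
DEFAULT_COUNT = 10  # "Download 10 most recent pics from user"


def parse_target(arg):
    """Classify one command-line argument as a post, an interval, or a user."""
    if "..." in arg:
        # Interval: two post URLs joined by "..."
        first, _, last = arg.partition("...")
        first_match, last_match = POST_URL.match(first), POST_URL.match(last)
        if not (first_match and last_match):
            raise ValueError(f"interval needs two post URLs: {arg!r}")
        return {
            "kind": "interval",
            "first": first_match.group(1),
            "last": last_match.group(1),
        }
    post = POST_URL.match(arg)
    if post:
        return {"kind": "post", "shortcode": post.group(1)}
    user = USER.match(arg)
    if user:
        count = int(user.group(2)) if user.group(2) else DEFAULT_COUNT
        return {"kind": "user", "username": user.group(1), "count": count}
    raise ValueError(f"unrecognized target: {arg!r}")
```

Since every argument is classified independently, combining several forms in one command amounts to calling `parse_target` on each argument in turn.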
The following options would be deprecated:
-u, --user-id Instagram user ID (or user name) [string]
-c, --count Maximum count of medias to download
-m, --min-id Only medias posted later than this media id/url (included) [string]
-n, --max-id Only medias posted earlier than this media id/url (excluded) [string]
-s, --sequential Process sequentially (slower) [boolean] [default: false]
-r, --resolution Resolution(s) to download, e.g. high_resolution,standard_resolution,low_resolution,thumbnail [string]
-t, --access-token Instagram Access Token [string]
The following options would still apply:
-i, --include-videos Include videos (skipped by default) [boolean] [default: false]
-d, --dest Destination directory [string] [default: "./"]
-a, --always-download Always download, even if media is saved already [boolean] [default: false]
-j, --json Save media json object (accepts keys to pluck) [default: false]
-l, --clear-cache Clear the cache [boolean] [default: false]
-v, --verbose Output more info [boolean] [default: false]
-q, --quiet Output less info [boolean] [default: false]
--config Load config file [default: "~/.ragamints.json"]
-h, --help Show help [boolean]