scraper-instagram
Scrape data from Instagram without applying for the authenticated API 🎯

Getting started
Prerequisites
- NodeJS
- NPM or Yarn
Install
From npm
npm i scraper-instagram --save
or
yarn add scraper-instagram
Basic usage
const Insta = require('scraper-instagram');
const InstaClient = new Insta();
InstaClient.getHashtag("javascript")
.then((hashtag) => console.log(hashtag))
.catch((err) => console.error(err));
...
{
shortcode: 'CbGxIdAXxA',
caption: 'Lorem ipsum #javascript',
comments: 66,
likes: 1090,
thumbnail: 'https://scontent-dus1-1.cdninstagram.com/v/123',
timestamp: 1647290186
},
...
Authentication
Authentication allows you to access private profiles as long as you follow them.
Importing your session ID
- Go to instagram.com
- Login (if not already logged in)
- Open development tools (Ctrl+Shift+I)
- Get the sessionid cookie value
  - For chromium-based browsers : application tab
  - For firefox-based browsers : storage tab
Code
InstaClient.authBySessionId(yourSessionId)
.then(account => console.log(account))
.catch(err => console.error(err));
If authentication is successful, you'll get the form data from accounts/edit :
{
"first_name": "",
"last_name": "",
"email": "",
"is_email_confirmed": true,
"is_phone_confirmed": true,
"username": "",
"phone_number": "",
"gender": 1,
"birthday": null,
"biography": "",
"external_url": "",
"chaining_enabled": true,
"presence_disabled": false,
"business_account": false,
"usertag_review_enabled": false
}
If your session ID is invalid, you'll get a 401 error.
Username/password authentication may be supported in the future.
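As a minimal sketch of the flow above, the helper below wraps authBySessionId and turns an invalid session into a plain result instead of an unhandled rejection. The helper name authenticate and its return shape are hypothetical. It also assumes that the 401 arrives either as a bare number or as an object with a status property, which the text above does not specify.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// `client` is any object exposing authBySessionId(sessionId) => Promise.
// Assumption: an invalid session rejects with the number 401, or with an
// object carrying that number in a `status` property.
async function authenticate(client, sessionId) {
  if (!sessionId) {
    throw new Error('Missing session ID (expected the sessionid cookie value)');
  }
  try {
    const account = await client.authBySessionId(sessionId);
    return { ok: true, username: account.username };
  } catch (err) {
    const code = typeof err === 'number' ? err : err && err.status;
    if (code === 401) {
      return { ok: false, reason: 'invalid session ID' };
    }
    throw err; // any other failure is left to the caller
  }
}

// Usage sketch (needs the real package and a valid cookie value):
// const Insta = require('scraper-instagram');
// authenticate(new Insta(), process.env.SESSION_ID).then(console.log);
```

Reading the session ID from an environment variable, as in the commented usage, keeps the cookie value out of source control.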
Get
These methods allow you to get specific elements from Instagram when you know exactly what you're looking for.
Error handling
get methods may return errors in the following cases.
- Request error : failed to get data from Instagram (HTTP code)
- Parsing error : failed to parse data returned by Instagram (406)
- No content : nothing to parse (204)
- Authentication required : session ID required to access data (401)
- Too many requests : rate limit exceeded (429)
- Conflict : automation detected, password reset required (409)
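The codes listed above can be turned into readable messages with a small lookup. This is only a sketch: describeError and isRateLimited are hypothetical names, and the code assumes the rejection value is either the numeric code or an object with a status property.

```javascript
// Meanings copied from the error list; the helpers are hypothetical
// and not part of scraper-instagram.
const ERROR_MEANINGS = {
  204: 'No content : nothing to parse',
  401: 'Authentication required : session ID required to access data',
  406: 'Parsing error : failed to parse data returned by Instagram',
  409: 'Conflict : automation detected, password reset required',
  429: 'Too many requests : rate limit exceeded',
};

// Assumption: `err` is the numeric code or an object with a `status` property.
function errorCode(err) {
  return typeof err === 'number' ? err : err && err.status;
}

function describeError(err) {
  const code = errorCode(err);
  if (code === undefined || code === null) return 'Unknown error';
  return ERROR_MEANINGS[code]
    || `Request error : failed to get data from Instagram (${code})`;
}

// Only the rate-limit case is worth retrying after a pause.
function isRateLimited(err) {
  return errorCode(err) === 429;
}
```

A caller could use it as .catch(err => console.error(describeError(err))) on any of the getters.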
Get profile by username
InstaClient.getProfile(username)
.then(profile => console.log(profile))
.catch(err => console.error(err));
Result
- id string - Instagram identifier, only used for stories
- name string - public full name
- pic url - public profile picture
- bio string - public biography
- website url - public website
- private boolean - account private state
- access boolean - access to the profile's feed
  To access a private account's feed, you must have sent its owner a follow request that was accepted.
- verified boolean - account verified state
- followers integer - number of users following this profile
- following integer - number of users this profile follows
- posts integer - number of posts this profile published
- lastPosts array of posts - last posts
  This property is empty ([]) when the profile doesn't have any post, but null if access is false (denied).
- link url - link to the profile's page
- business string - business category (when applicable and profile unblocked)
- user object - user relevant properties (while authenticated) :
  - mutualFollowers array of usernames - people following you and this profile
  - blocking boolean - you blocked this profile
  - blocked boolean - this profile blocked you (only available property in user while true)
  - requesting boolean - you sent a follow request to this profile (if private)
  - requested boolean - this profile sent you a follow request (if yours is private)
  - following boolean - you're following this profile
  - followed boolean - this profile follows you
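Because lastPosts is null when access is denied and [] when the profile simply has no posts, code that reads a profile should tell the two apart. The sketch below shows one way to do so. The function name feedState and its three return labels are hypothetical.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// Takes the object resolved by getProfile(username).
function feedState(profile) {
  if (profile.lastPosts === null) {
    return 'denied'; // access is false: private profile you do not follow
  }
  if (profile.lastPosts.length === 0) {
    return 'empty'; // accessible profile without any post
  }
  return 'available';
}

// Usage sketch:
// InstaClient.getProfile(username).then(p => console.log(feedState(p)));
```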
Get profile story (requires authentication)
Using profile ID
InstaClient.getProfileStoryById(id)
.then(profile => console.log(profile))
.catch(err => console.error(err));
Using profile username (will automatically request profile ID)
InstaClient.getProfileStory(username)
.then(profile => console.log(profile))
.catch(err => console.error(err));
Result
- unread boolean - profile story is unread
- author object - a subset of profile
  - username
  - pic
- user object - user relevant properties
  - requesting
  - following
- items array of stories - profile stories
  - url string - link to original story file (jpg, mp4, ...)
  - type string - story type : photo or video
  - timestamp epoch
  - expirationTimestamp epoch
These methods return null when a profile has no story.
Note : calling this method will not mark the story as read.
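A short sketch of consuming a story result follows, covering both the null case and the expiration field. The helper name activeStoryItems is hypothetical, and the code assumes epochs are expressed in seconds, as in the sample timestamp in Basic usage. If the library returns milliseconds, the default for nowSeconds would need adjusting.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// Assumption: timestamp and expirationTimestamp are Unix epochs in seconds.
function activeStoryItems(story, nowSeconds = Math.floor(Date.now() / 1000)) {
  if (story === null) {
    return []; // the getters resolve to null when the profile has no story
  }
  return story.items.filter((item) => item.expirationTimestamp > nowSeconds);
}

// Usage sketch:
// InstaClient.getProfileStory(username)
//   .then(story => activeStoryItems(story).forEach(i => console.log(i.type, i.url)));
```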
Get hashtag
InstaClient.getHashtag(hashtag)
.then(hashtag => console.log(hashtag))
.catch(err => console.error(err));
Result
- pic url - hashtag profile pic (can't find out how it is chosen)
- posts integer - number of posts containing this hashtag
- featuredPosts array of posts - featured posts published with this hashtag
- lastPosts array of posts - last posts published with this hashtag
- link url - link to the hashtag's page
- user object - user relevant properties (while authenticated) :
  - following boolean - you subscribed to this hashtag (receiving posts in your personal feed)
Get location by ID
Unfortunately, using IDs is currently the only way to get a location.
InstaClient.getLocation(id)
.then(location => console.log(location))
.catch(err => console.error(err));
Result
- pic url - location profile pic
- posts integer - posts published from that location
- address object
  - street string
  - zipCode string
  - city string
  - latitude float
  - longitude float
- website url - place's website
- phone string - place's contact phone number
- featuredPosts array of posts - featured posts published from this location
- lastPosts array of posts - last posts published from this location
- link url - link to this location's page
Array of posts
This is a subset of a real post, containing the following properties :
- shortcode string - post identifier
- caption string - post description
- comments integer - number of comments
- likes integer - number of likes
- thumbnail url - post thumbnail
  Always a static, lower-quality image, whether it's a photo or a video post.
Get post by shortcode
The shortcode is the post's identifier : the link to a post is instagram.com/p/shortcode.
InstaClient.getPost(shortcode)
.then(post => console.log(post))
.catch(err => console.error(err));
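Since the shortcode is the last segment of a post link, it can be extracted before calling getPost. The helper below is a sketch with a hypothetical name, shortcodeFromUrl. It relies only on the standard URL class and on the instagram.com/p/shortcode link shape described above.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// Returns the shortcode of a post link, or null if the link is not a post.
function shortcodeFromUrl(link) {
  // The URL parser needs a scheme, so add one to bare "instagram.com/..." links.
  const withScheme = /^https?:\/\//.test(link) ? link : `https://${link}`;
  const { pathname } = new URL(withScheme);
  const match = pathname.match(/^\/p\/([^/]+)\/?$/);
  return match ? match[1] : null;
}

// Usage sketch:
// InstaClient.getPost(shortcodeFromUrl('https://www.instagram.com/p/CbGxIdAXxA/'));
```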
Result
- author object - a subset of a profile's properties.
  - username string
  - name string
  - pic url
  - verified boolean
  - link url
- location
  - name string
  - city string
- contents array of posts
  - type string - post type : photo or video
  - url string - link to original post file (jpg, mp4, ...)
  - if type is video :
    - thumbnail string - link to thumbnail
    - views integer - number of views
- tagged array of usernames - people tagged in post contents
- likes integer - number of likes
- caption string - post description
- hashtags array of hashtags - hashtags mentioned in post description
- mentions array of usernames - people mentioned in post description
- edited boolean - caption edited
- comments array of objects (Max 40)
  - user string - comment author's username
  - content string - comment content
  - timestamp epoch
  - hashtags array of hashtags
  - mentions array of usernames
  - likes integer
- commentCount integer
- timestamp epoch
- link string - link to the post
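The contents array mixes photos and videos, and only video entries carry thumbnail and views. A brief sketch of reading it is shown below. The helper names videoUrls and totalViews are hypothetical, and the input is assumed to follow the result shape listed above.

```javascript
// Hypothetical helpers, not part of scraper-instagram.
// `post` is the object resolved by getPost(shortcode).
function videoUrls(post) {
  return post.contents
    .filter((item) => item.type === 'video')
    .map((item) => item.url);
}

// Sums views over video entries; photo entries have no views property.
function totalViews(post) {
  return post.contents
    .filter((item) => item.type === 'video')
    .reduce((sum, item) => sum + (item.views || 0), 0);
}
```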
Paginated getters (require authentication)
Paginated getters allow bulk data downloads.
Params :
- maxCount integer - max number of items to return
- pageId string (optional) - page navigation identifier
Result : an array with an additional nextPageId property
Sample :
(async () => {
const page0 = await somePaginatedGetter(someId, 50);
const page1 = await somePaginatedGetter(someId, 50, page0.nextPageId);
const page2 = await somePaginatedGetter(someId, 50, page1.nextPageId);
})();
The pageId/nextPageId property may contain a string of digits, a base64 string, or a JSON string, but it must always be left untouched.
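The manual chaining in the sample can be generalized into a loop. This is a sketch under one assumption that the text above does not confirm: that nextPageId is missing or falsy once the last page is reached. The helper name collectPages and its fetchPage callback are hypothetical.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// fetchPage(maxCount, pageId) must resolve to an array carrying nextPageId,
// e.g. (n, id) => InstaClient.getProfilePosts('someUsername', n, id).
// Assumption: nextPageId is missing or falsy on the last page.
async function collectPages(fetchPage, pageSize, limit) {
  const items = [];
  let pageId; // undefined on the first request
  while (items.length < limit) {
    const page = await fetchPage(pageSize, pageId);
    items.push(...page); // spreads the array elements only
    pageId = page.nextPageId; // passed back untouched on the next request
    if (!pageId || page.length === 0) break;
  }
  return items.slice(0, limit);
}
```

Requests are issued one after another, never in parallel, because each page identifier comes from the previous response.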
Get profile posts
Result in array : full post object
Using profile ID
InstaClient.getProfilePostsById(profileId, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Using profile username (will automatically request profile ID)
InstaClient.getProfilePosts(profileUsername, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Get post comments
InstaClient.getPostComments(shortcode, maxCount, pageId)
.then(comments => console.log(comments))
.catch(err => console.error(err));
Result in array : comment object
Get hashtag posts
InstaClient.getHashtagPosts(hashtag, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Result in array : partial post object
Get location posts
InstaClient.getLocationPostsById(locationId, maxCount, pageId)
.then(posts => console.log(posts))
.catch(err => console.error(err));
Result in array : partial post object
Search
Search profile
InstaClient.searchProfile(query)
.then(profiles => console.log(profiles))
.catch(err => console.error(err));
Result in array : a subset of profile.
- username
- name
- pic
- private
- verified
- followers
- user
  - following
Search hashtag
InstaClient.searchHashtag(hashtag)
.then(hashtags => console.log(hashtags))
.catch(err => console.error(err));
Result in array : a subset of hashtag.
- name
- posts
Search location
InstaClient.searchLocation(location)
.then(locations => console.log(locations))
.catch(err => console.error(err));
Result in array : a subset of location.
- id
- name
- address
  - street
  - city
  - latitude
  - longitude
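Because getLocation only accepts an ID, a location search is the natural way to obtain one. The sketch below chains the two calls. The helper name getFirstLocation is hypothetical, and picking the first search result is a simplification rather than a recommendation.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// `client` is any object exposing searchLocation(query) and getLocation(id).
async function getFirstLocation(client, query) {
  const results = await client.searchLocation(query);
  if (!results || results.length === 0) {
    return null; // nothing matched the query
  }
  // Each search result carries the id that getLocation expects.
  return client.getLocation(results[0].id);
}

// Usage sketch:
// getFirstLocation(InstaClient, 'Paris').then(location => console.log(location));
```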
Subscribe to posts
- options object (optional)
  - interval integer (optional) - time in seconds between requests. Default : 30
  - lastPostShortcode string (optional) - shortcode from which to begin if not the next one to be published.
  - fullPosts boolean (optional) - fetch full post data, additional request required
From user
InstaClient.subscribeUserPosts(username, (post, err) => {
if(post)
console.log(post.shortcode);
else
console.error(err);
}, {
interval,
lastPostShortcode,
fullPosts
});
From hashtag
InstaClient.subscribeHashtagPosts(hashtag, (post, err) => {
if(post)
console.log(post.shortcode);
else
console.error(err);
}, {
interval,
lastPostShortcode,
fullPosts
});
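One possible use of lastPostShortcode is to resume a subscription where a previous run stopped. The sketch below keeps track of the most recent shortcode seen by the callback. The helper name createPostTracker and its state object are hypothetical. Persisting the value between runs is left out.

```javascript
// Hypothetical helper, not part of scraper-instagram.
// Builds a (post, err) callback matching the subscribe methods' signature
// and remembers the last shortcode it received.
function createPostTracker(onPost) {
  const state = { lastPostShortcode: undefined, errors: [] };
  const handler = (post, err) => {
    if (post) {
      state.lastPostShortcode = post.shortcode;
      onPost(post);
    } else {
      state.errors.push(err);
    }
  };
  return { handler, state };
}

// Usage sketch:
// const tracker = createPostTracker(post => console.log(post.shortcode));
// InstaClient.subscribeHashtagPosts('javascript', tracker.handler, { interval: 60 });
// Later, tracker.state.lastPostShortcode can be passed as lastPostShortcode.
```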
Account requests (user-relevant methods)
Get account notifications
InstaClient.getAccountNotifications()
.then(notifications => console.log(notifications))
.catch(err => console.error(err));
Result in array : notification
- id string - Notification identifier
- timestamp epoch
- type string - Notification type : like, mention, comment, follow
- post
  - shortcode
  - thumbnail
- by
  - username
  - name
  - pic
- content string - Comment content (when applicable)
Subscribe to account notifications
- options object (optional)
  - interval integer (optional) - time in seconds between requests. Default : 30
  - lastNotificationId string (optional) - Notification ID
InstaClient.subscribeAccountNotifications((notification, err) => {
if(notification)
console.log(notification.id);
else
console.error(err);
}, {
interval,
lastNotificationId
});
Get account stories
InstaClient.getAccountStories()
.then(stories => console.log(stories))
.catch(err => console.error(err));
Result in array : inbox-like
- unread
- author object - a subset of a profile's properties.
  - id
  - username
  - pic
- user object - user relevant properties
  - requesting
  - following
Test
- git clone https://github.com/ezzcodeezzlife/ig-scraper.git
- yarn install or npm install
- yarn test or npm run test
Optional environment variables for more complete testing :
- SESSION_ID : a session ID for authentication test and authenticated tests
- PUBLIC_PROFILE : a public profile to access
- PRIVATE_PROFILE : a private profile to access
- STORY_PROFILE_ID : a profile ID with a story to read
- STORY_PROFILE_USERNAME : a profile username with a story to read
- HASHTAG (default value : cat) : a hashtag to fetch
- LOCATION_ID (default value : 6889842 aka. Paris) : a location to fetch
- POST : a post to fetch
- SEARCH_PROFILE : a profile to search for
- SEARCH_HASHTAG (default value : cats) : a hashtag to search for
- SEARCH_LOCATION (default value : Paris) : a location to search for
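For illustration, the variables above could be read into a single configuration object as sketched below. This is not the project's actual test setup. The function name loadTestConfig and the property names are hypothetical, and only the four documented default values are applied.

```javascript
// Hypothetical helper, not taken from the project's test suite.
// `env` is an object shaped like process.env (all values are strings).
function loadTestConfig(env) {
  return {
    sessionId: env.SESSION_ID,
    publicProfile: env.PUBLIC_PROFILE,
    privateProfile: env.PRIVATE_PROFILE,
    storyProfileId: env.STORY_PROFILE_ID,
    storyProfileUsername: env.STORY_PROFILE_USERNAME,
    hashtag: env.HASHTAG || 'cat',
    locationId: env.LOCATION_ID || '6889842', // documented default, aka. Paris
    post: env.POST,
    searchProfile: env.SEARCH_PROFILE,
    searchHashtag: env.SEARCH_HASHTAG || 'cats',
    searchLocation: env.SEARCH_LOCATION || 'Paris',
  };
}

// Usage sketch:
// const config = loadTestConfig(process.env);
```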
Methods not covered by tests :
- subscribeUserPosts
- subscribeHashtagPosts
- subscribeAccountNotifications