Scrape followers/following usernames for a given user

Open klimbot opened this issue 2 years ago • 4 comments

What would you like to be added:

I want to be able to do some Instagram scraping: collecting data and metrics for given usernames, hashtags, etc.

  1. I want to be able to scrape all the followers/following usernames for a given user
  2. For a given number of posts for a user I'd like to scrape the likers/commenters usernames
  3. I want to be able to scrape the usernames of those who have tagged the given user

eg: https://www.instagram.com/nba/

Scraping followers usernames would yield (at a minimum) something like:

zachgoblinn
avalphavision
tkntasdamla
...

Why is this needed:

It will facilitate Instagram reporting features.

klimbot avatar Jul 02 '22 10:07 klimbot

Hello Klims, sorry for the late reply on this issue! I saw the PR, and before looking at it I want to ask: in what way is it different from the scrape mode already present in the bot?

Thanks :)

mastrolube avatar Jul 05 '22 06:07 mastrolube

Hey @mastrolube. This is actually a bit of a departure from how the bot is used right now. I'm not sure it makes sense to integrate it as a feature, but I'll continue to develop it as I need it and we can see how it shapes up.

Yes it is different - the intention of the feature(s) is to add scraping that would be used for gathering user metrics. There is no intention to use the scraped information for anything other than reporting/profiling. I'm working here on the feature if you wanted to have a look/give it a go.

klimbot avatar Jul 05 '22 08:07 klimbot

To scrape thousands of followers from an influencer, you probably should not do it from a single account in a single session. It would be ideal to spread the work over multiple accounts, but GramAddict doesn't support this yet.

I'm currently using a single config file and changing the username dynamically, then copying the session-modified files between sessions like this:

./change_username.sh user2

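#!/bin/bash
# change_username.sh: set the username in the shared config to the first argument.

filename="/Users/code/auto-insta-gramaddict/accounts/master/config.yml"
# -i'.bak' edits the file in place and keeps a backup with a .bak suffix
sed -i'.bak' 's/username:.*/username: '"$1"'/' "$filename"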

The orchestration file auto_run.sh copies config between sessions:

...
cp accounts/user1/history_filters_users.json accounts/user2/history_filters_users.json
cp accounts/user1/interacted_users.json accounts/user2/interacted_users.json
cp accounts/user1/sessions.json accounts/user2/sessions.json
...
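
As a rough illustration of the copy step, the three cp lines could be wrapped in a small helper. This is only a sketch: the function name copy_session_state is hypothetical and is not part of auto_run.sh or GramAddict; the accounts/<name>/ layout and the file names are taken from the commands above.

```shell
# Hypothetical helper: copy the per-account state files from one account
# directory to another, e.g. copy_session_state user1 user2.
copy_session_state() {
    local src dst f
    src="accounts/$1"
    dst="accounts/$2"
    mkdir -p "$dst"
    for f in history_filters_users.json interacted_users.json sessions.json; do
        # Skip files that the previous session did not produce.
        if [ -f "$src/$f" ]; then
            cp "$src/$f" "$dst/$f" || return 1
        fi
    done
}
```

Calling copy_session_state user1 user2 before switching the username would carry the previous session's state over to the next account.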

Issue: When sharing the config.yml / filters.yml / history_filters_users.json / interacted_users.json / sessions.json, the interacted-user data model breaks down. If any influencers we are trying to scrape have followers in common, those followers will be ignored for the later influencers because they have already been interacted with.

I'm creating a new function to address this. It checks not only whether a user has been interacted with but also whether the interaction target matches.
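
The idea of keying the check on the (user, target) pair rather than on the user alone can be sketched in a few lines of shell. Everything here is an assumption made for illustration: the flat pair log, the PAIR_LOG variable, and both function names are hypothetical, and this is not the format of GramAddict's interacted_users.json nor the function being developed for the bot.

```shell
# Hypothetical pair log: one "username<TAB>target" line per interaction.
PAIR_LOG="${PAIR_LOG:-interacted_pairs.tsv}"

# Record that user "$1" was handled while working on target "$2".
record_interaction() {
    printf '%s\t%s\n' "$1" "$2" >> "$PAIR_LOG"
}

# Succeed only if this exact (user, target) pair is already in the log.
# -F: fixed string, -x: match the whole line, -q: no output.
interacted_for_target() {
    [ -f "$PAIR_LOG" ] && grep -Fxq -- "$(printf '%s\t%s' "$1" "$2")" "$PAIR_LOG"
}
```

With this shape, a follower recorded under the target nba would still be picked up when scraping a different influencer, because the lookup only matches when both the username and the target are the same.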

klimbot avatar Jul 08 '22 02:07 klimbot

Good morning Klim, I am also replying to you here so that people can read and understand.

This bot was created as a personal pseudo growth tool. Scraping was added later at the insistence of some users, and it is individual, inefficient, and stupid. As you have understood, if I have multiple accounts scraping, they might waste time interacting with the same users. So it is very inefficient to scrape with this bot, especially in its current state. When it comes to scraping I always recommend using other tools (for example instaloader), which are much more performant. Once you have the list you can manage it with this bot. It really takes you so much less time, and you don't get the phone ID flagged (assuming that actually happens).

The management of this data will soon be replaced with a database, which will allow more flexibility and unlock possibilities that are currently hard to achieve (like this one, for example).

mastrolube avatar Jul 08 '22 06:07 mastrolube