instagram-php-scraper

Array empty

Open bastienuh opened this issue 2 years ago • 6 comments

Hi,

When I call getAccount, everything seems to be OK.

Code :

$instagram  = Instagram::withCredentials(new \GuzzleHttp\Client(), 'USERNAME', 'PASSWORD', new Psr16Adapter('Files'));
$instagram->login();
$instagram->saveSession();

$return = $instagram->getAccount('cristiano');

dd($return);

Result:

InstagramScraper\Model\Account {#407
  #id: "173560420"
  #fbid: "17841401692602711"
  #username: "cristiano"
  #fullName: "Cristiano Ronaldo"
  (...)
}


But now, if I switch to getMediasByTag:

Code :

$instagram  = Instagram::withCredentials(new \GuzzleHttp\Client(), 'USERNAME', 'PASSWORD', new Psr16Adapter('Files'));
$instagram->login();
$instagram->saveSession();

$return = $instagram->getMediasByTag('love', 20);

dd($return);

Result : (an empty array)

[ ]

But if I go to https://www.instagram.com/explore/tags/love/, I can see a lot of results. The same thing happens with getCurrentTopMediasByTagName(): the result is an empty array.

Do you know why? I don't understand those differences.

Thanks !

bastienuh, Mar 17 '22 10:03

I have the same problem. Is this still a viable solution for scraping Instagram, or has this package been abandoned?

xbelmondo, Apr 05 '22 14:04

It seems that the structure of Instagram's response has completely changed...

xbelmondo, Apr 05 '22 20:04

Yes, I can confirm that. I created a fork here: https://github.com/CopernicPointCo/instagram-php-scraper. It adds a new function named my_getCurrentTopMediasByTagName (line 1388).

bastienuh, Apr 06 '22 06:04

I've just noticed that the JSON response is different when you're logged in and when you're not. We therefore need to parse the data differently depending on the login status.
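One way to handle the two layouts is to classify the decoded response before parsing it. The sketch below is only an illustration: `detectTagResponseShape` is a hypothetical helper, not part of instagram-php-scraper. The `data.recent.sections` path is the one used by the workaround later in this thread; the `graphql.hashtag.edge_hashtag_to_media.edges` path is an assumption about the older layout.

```php
<?php
declare(strict_types=1);

/**
 * Classify a decoded tag-page response by its top-level layout.
 * Hypothetical helper, not part of instagram-php-scraper.
 */
function detectTagResponseShape(array $decoded): string
{
    // Sections layout, as used by the workaround in this thread.
    if (isset($decoded['data']['recent']['sections'])) {
        return 'sections';
    }
    // Older GraphQL-style layout (assumed path, not confirmed in this thread).
    if (isset($decoded['graphql']['hashtag']['edge_hashtag_to_media']['edges'])) {
        return 'graphql';
    }
    // Anything else: the library is probably outdated for this response.
    return 'unknown';
}
```

A caller could then branch on the returned string and throw a clear exception for `'unknown'` instead of silently returning an empty array.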

xbelmondo, Apr 06 '22 14:04

This is a messy hack, but it did the job for me:

Code:

```php
public function recMediasByTag($tag, $count = 12, $maxId = "", $minTimestamp = null)
{
    $index = 0;
    $medias = [];
    $mediaIds = [];
    $hasNextPage = true;

    while ($index < $count && $hasNextPage) {
        $response = Request::get(
            Endpoints::getMediasJsonByTagLink($tag, $maxId),
            $this->generateHeaders($this->userSession)
        );

        if ($response->code === static::HTTP_NOT_FOUND) {
            throw new InstagramNotFoundException('This tag does not exist or it has been hidden by Instagram');
        }

        if ($response->code !== static::HTTP_OK) {
            throw new InstagramException('Response code is ' . $response->code . '. Body: ' . static::getErrorBody($response->body) . ' Something went wrong. Please report issue.', $response->code);
        }

        $this->parseCookies($response->headers);

        $arr = $this->decodeRawBodyToJson($response->raw_body);

        if (!is_array($arr)) {
            throw new InstagramException('Response decoding failed. Returned data corrupted or this library outdated. Please report issue');
        }

        // The response root is either 'graphql' or 'data'.
        $rootKey = array_key_exists('graphql', $arr) ? 'graphql' : 'data';

        if (empty($arr[$rootKey]['media_count'])) {
            return [];
        }

        // Flatten the media entries of every section into one list.
        $nodes = [];
        foreach ($arr[$rootKey]['recent']['sections'] ?? [] as $section) {
            $nodes = array_merge($nodes, $section['layout_content']['medias'] ?? []);
        }

        foreach ($nodes as $mediaArray) {
            if ($index === $count) {
                return $medias;
            }
            $media = Media::create($mediaArray['media']);
            // Stop at the first duplicate or at the first media older than $minTimestamp.
            if (in_array($media->getId(), $mediaIds)) {
                return $medias;
            }
            if (isset($minTimestamp) && $media->getCreatedTime() < $minTimestamp) {
                return $medias;
            }
            $mediaIds[] = $media->getId();
            $medias[] = $media;
            $index++;
        }
        if (empty($nodes)) {
            return $medias;
        }

        // Hack: pagination is not implemented, so stop after the first page.
        $hasNextPage = false;
    }
    return $medias;
}
```
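To try the section-flattening step without the library or a live session, it can be isolated on plain arrays. This is a minimal sketch under assumptions: `collectSectionMedias` is a hypothetical name, each media payload is assumed to carry an `id` key, and duplicates are skipped here rather than ending the scan as the method above does.

```php
<?php
declare(strict_types=1);

/**
 * Collect raw media arrays from a sections-style payload, up to $limit items.
 * Hypothetical helper; it works on plain arrays and does not use the library.
 */
function collectSectionMedias(array $sections, int $limit): array
{
    $medias = [];
    $seenIds = [];
    foreach ($sections as $section) {
        foreach ($section['layout_content']['medias'] ?? [] as $item) {
            $media = $item['media'] ?? null;
            if (!is_array($media) || !isset($media['id'])) {
                continue; // no usable media payload (the 'id' key is an assumption)
            }
            if (isset($seenIds[$media['id']])) {
                continue; // already collected
            }
            if (count($medias) >= $limit) {
                return $medias;
            }
            $seenIds[$media['id']] = true;
            $medias[] = $media;
        }
    }
    return $medias;
}
```

Feeding it a decoded `sections` array saved from a real response would show quickly whether the layout still matches what the method expects.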

kesienajoel, May 20 '22 01:05

The recMediasByTag method from @kesienajoel also worked for me. THANKS! However, I also had to add &__d=dis to the MEDIA_JSON_BY_TAG URL.

Changed from:
\InstagramScraper\Endpoints::MEDIA_JSON_BY_TAG = 'https://www.instagram.com/explore/tags/{tag}/?__a=1&max_id={max_id}'
to:
\InstagramScraper\Endpoints::MEDIA_JSON_BY_TAG = 'https://www.instagram.com/explore/tags/{tag}/?__a=1&max_id={max_id}&__d=dis'
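As a rough illustration of what the amended template expands to, here is a standalone sketch. The constant copies the URL quoted above; `buildTagJsonUrl` is a hypothetical helper, not the library's own `Endpoints::getMediasJsonByTagLink`.

```php
<?php
declare(strict_types=1);

// URL template from this comment, including the extra __d=dis parameter.
const MEDIA_JSON_BY_TAG = 'https://www.instagram.com/explore/tags/{tag}/?__a=1&max_id={max_id}&__d=dis';

/**
 * Fill the {tag} and {max_id} placeholders of the template.
 * Hypothetical helper for illustration only.
 */
function buildTagJsonUrl(string $tag, string $maxId = ''): string
{
    return strtr(MEDIA_JSON_BY_TAG, [
        '{tag}' => rawurlencode($tag),
        '{max_id}' => rawurlencode($maxId),
    ]);
}

// Prints: https://www.instagram.com/explore/tags/love/?__a=1&max_id=&__d=dis
echo buildTagJsonUrl('love'), PHP_EOL;
```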

vzz3, Aug 01 '22 15:08