instagram scraped profile feeds don't include video mp4 URLs

Open snarfed opened this issue 8 years ago • 2 comments

...because instagram's embedded JSON in profile pages doesn't have the video_url field with the mp4 link. we'd need to fetch each video post permalink individually to get it.

totally doable, but i probably won't prioritize this until someone asks for it. just tracking for now. cc @aaronpk.

scraped news feeds (with session cookie) and individual permalinks do include the mp4 URLs.
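a rough sketch of the first step, picking out which scraped profile nodes would need that extra permalink fetch. `nodes_missing_video_url` is a hypothetical helper, not part of granary; the `is_video` and `video_url` field names come from the JSON in this thread, and the sample nodes other than the first are made up for illustration.

```python
def nodes_missing_video_url(nodes):
    """Return the video nodes that have no mp4 link and so need a permalink fetch."""
    return [
        node for node in nodes
        if node.get("is_video") and not node.get("video_url")
    ]


# Illustrative input: only the first code is real, the rest are placeholders.
profile_nodes = [
    {"code": "BcXhAaKh6lM", "is_video": True},    # profile video, no video_url
    {"code": "placeholder1", "is_video": False},  # photo, nothing to fetch
    {"code": "placeholder2", "is_video": True,
     "video_url": "https://example.com/v.mp4"},   # already has the mp4 link
]
todo = nodes_missing_video_url(profile_nodes)
```

only the nodes in `todo` would cost an extra HTTP request each, which is the overhead this issue is about.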

snarfed • Dec 07 '17 16:12

for comparison, here's a video node in the embedded JSON on https://www.instagram.com/thejohnnysmith/ , entry_data.ProfilePage[0].user.media.nodes[0]:

{
  "__typename": "GraphVideo",
  "id": "1663943750965504332",
  "comments_disabled": false,
  "dimensions": {
    "height": 750,
    "width": 750
  },
  "gating_info": null,
  "media_preview": "ACoqseeudoGece9RvbleTgA9v61pMiEiReD6ggA/X2rFurkzthTtUcZ9fc+3tWt+xFgaZEOOv05oW6jJ5Bx+FZztIrZbBB74q2sKtGJFIyc5U9seh70uYfKWWvFxlVwfc5H5cVGbpz6D2xVR3Cdsk+nT/PtSBZz0/nRcLD0uioMbHAPcc/p2p0Ww8B/wxj+fWmPHuHHBB/OoUjLcYz6dqi4y9dWwyFJIJA6emc9/eowmeMlQOABj8/xovhIjAMdwwMfl0qNpQeOoxwT+ePwpjJVjCduPbOf8aeLkY/8AriqkUpPyHj0NWTGhOT1/D/CkAjZxx1xTEYZycKQep7j+vrUbMQvXtTk+Yc849fpSQjRu4xJGOecja3qf8Md/aspSA2G4HGP8981pE/uh7Fv/AEGsQn+lUwLhhQ4I98il2L/z0x+FQA8f59KtKBgVLGf/2Q==",
  "owner": {
    "id": "654594"
  },
  "thumbnail_src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/s640x640/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
  "thumbnail_resources": [
    {
      "src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/s150x150/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
      "config_width": 150,
      "config_height": 150
    },
    "..."
  ],
  "is_video": true,
  "code": "BcXhAaKh6lM",
  "date": 1512577610,
  "display_src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
  "video_views": 3790,
  "caption": "Murica. Animated. #imagemanipulation #digitalart #animation #republicans",
  "comments": {
    "count": 10
  },
  "likes": {
    "count": 1024
  }
}

...and here's the same video node in the permalink, https://www.instagram.com/p/BcXhAaKh6lM/ , entry_data.PostPage[0].graphql.shortcode_media. so different it's not even worth diffing.

{
  "__typename": "GraphVideo",
  "id": "1663943750965504332",
  "shortcode": "BcXhAaKh6lM",
  "dimensions": {
    "height": 750,
    "width": 750
  },
  "gating_info": null,
  "media_preview": "ACoqseeudoGece9RvbleTgA9v61pMiEiReD6ggA/X2rFurkzthTtUcZ9fc+3tWt+xFgaZEOOv05oW6jJ5Bx+FZztIrZbBB74q2sKtGJFIyc5U9seh70uYfKWWvFxlVwfc5H5cVGbpz6D2xVR3Cdsk+nT/PtSBZz0/nRcLD0uioMbHAPcc/p2p0Ww8B/wxj+fWmPHuHHBB/OoUjLcYz6dqi4y9dWwyFJIJA6emc9/eowmeMlQOABj8/xovhIjAMdwwMfl0qNpQeOoxwT+ePwpjJVjCduPbOf8aeLkY/8AriqkUpPyHj0NWTGhOT1/D/CkAjZxx1xTEYZycKQep7j+vrUbMQvXtTk+Yc849fpSQjRu4xJGOecja3qf8Md/aspSA2G4HGP8981pE/uh7Fv/AEGsQn+lUwLhhQ4I98il2L/z0x+FQA8f59KtKBgVLGf/2Q==",
  "display_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
  "display_resources": [
    {
      "src": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-15/s640x640/e15/24839044_1500287820061417_8285790670526873600_n.jpg",
      "config_width": 640,
      "config_height": 640
    },
    "..."
  ],
  "dash_info": {
    "is_dash_eligible": false,
    "video_dash_manifest": null,
    "number_of_qualities": 0
  },
  "video_url": "https://instagram.frir1-1.fna.fbcdn.net/vp/9073fc67f141289491118de09540bb9f/5A2C4B4D/t50.2886-16/24222831_239068156630392_3793391278482259968_n.mp4",
  "video_view_count": 3798,
  "is_video": true,
  "should_log_client_event": false,
  "tracking_token": "eyJ2ZXJzaW9uIjo1LCJwYXlsb2FkIjp7ImlzX2FuYWx5dGljc190cmFja2VkIjp0cnVlLCJ1dWlkIjoiNWM4M2I4NDYwY2I0NDczNmIxYmY2MjgzMWRiMDk3YmExNjYzOTQzNzUwOTY1NTA0MzMyIn0sInNpZ25hdHVyZSI6IiJ9",
  "edge_media_to_tagged_user": {
    "edges": []
  },
  "edge_media_to_caption": {
    "edges": [
      {
        "node": {
          "text": "Murica. Animated. #imagemanipulation #digitalart #animation #republicans"
        }
      }
    ]
  },
  "caption_is_edited": true,
  "edge_media_to_comment": {
    "count": 10,
    "page_info": {
      "has_next_page": false,
      "end_cursor": null
    },
    "edges": [
      {
        "node": {
          "id": "17897454409126928",
          "text": "@marina...",
          "created_at": 1512577690,
          "owner": {
            "id": "1039315351",
            "profile_pic_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-19/s150x150/23970161_157151168230471_1728751850200498176_n.jpg",
            "username": "bysk..."
          }
        }
      },
      "..."
    ]
  },
  "comments_disabled": false,
  "taken_at_timestamp": 1512577610,
  "edge_media_preview_like": {
    "count": 1025,
    "edges": [
      {
        "node": {
          "id": "1370321247",
          "profile_pic_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-19/s150x150/12523811_801579049964891_270399277_a.jpg",
          "username": "pxn..."
        }
      },
      "..."
    ]
  },
  "edge_media_to_sponsor_user": {
    "edges": []
  },
  "location": null,
  "viewer_has_liked": false,
  "viewer_has_saved": false,
  "viewer_has_saved_to_collection": false,
  "owner": {
    "id": "654594",
    "profile_pic_url": "https://instagram.frir1-1.fna.fbcdn.net/t51.2885-19/s150x150/24175492_141781853142224_1588976035087515648_n.jpg",
    "username": "thejohnnysmith",
    "blocked_by_viewer": false,
    "followed_by_viewer": false,
    "full_name": "Johnny Smith",
    "has_blocked_viewer": false,
    "is_private": false,
    "is_unpublished": false,
    "is_verified": false,
    "requested_by_viewer": false
  },
  "is_ad": false,
  "edge_web_media_to_related_media": {
    "edges": []
  }
}
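given the permalink structure above, pulling out the mp4 link could look roughly like this. it's a sketch that assumes the page's embedded JSON has already been parsed into a dict; `extract_video_url` is a hypothetical name, and the mp4 URL in the example is a placeholder, not real data.

```python
def extract_video_url(page_data):
    """Return the mp4 URL from parsed permalink page JSON, or None if absent.

    Follows entry_data.PostPage[0].graphql.shortcode_media, the path quoted above.
    """
    try:
        media = page_data["entry_data"]["PostPage"][0]["graphql"]["shortcode_media"]
        if media.get("is_video"):
            return media.get("video_url")
    except (KeyError, IndexError, TypeError, AttributeError):
        # Unexpected or changed structure: treat it as "no video URL available".
        pass
    return None


# Trimmed-down stand-in for the permalink JSON; the URL is a placeholder.
sample_page = {
    "entry_data": {"PostPage": [{"graphql": {"shortcode_media": {
        "shortcode": "BcXhAaKh6lM",
        "is_video": True,
        "video_url": "https://example.com/video.mp4",
    }}}]}
}
mp4 = extract_video_url(sample_page)
```

returning `None` instead of raising keeps a single odd post from breaking a whole feed, since this scraped structure can change without notice.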

snarfed • Dec 07 '17 16:12

~~Do you have a clue on how to request it with the info provided in the profile json? (the one you get with links like https://www.instagram.com/loganpaul/?__a=1). I can see it's not including the video url, so how to relate that one with the actual video permalink to download the actual video?~~

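Ok, found it. The video ID is the code (or shortcode) field on the item; it is the last path segment of the post's permalink, e.g. https://www.instagram.com/p/BcXhAaKh6lM/.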
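As a minimal sketch of that mapping: the profile node earlier in this thread exposes the ID as `code`, while the permalink node calls it `shortcode`, so this checks both. `permalink_for` is a hypothetical helper, and the URL pattern is taken from the permalink quoted above.

```python
def permalink_for(node):
    """Build a post permalink from a scraped media node, or return None."""
    # The field name differs between JSON variants, so accept either one.
    code = node.get("shortcode") or node.get("code")
    if not code:
        return None
    return f"https://www.instagram.com/p/{code}/"


url = permalink_for({"code": "BcXhAaKh6lM", "is_video": True})
```

Fetching that URL and reading `video_url` from its embedded JSON would then give the mp4 link that the profile JSON leaves out.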

JorgeCastilloPrz • Dec 31 '18 17:12

Closing as obsolete; granary's Instagram support is probably headed toward deprecation.

snarfed • Jul 26 '24 03:07