NewPipeExtractor

Add support for YouTube hashtag pages

Open AudricV opened this issue 4 years ago • 4 comments

If a video includes hashtags, clicking one opens a YouTube page that lists all videos associated with that hashtag. For example: https://www.youtube.com/hashtag/martingarrix (I picked a music artist because the hashtag has more than 100 videos).

Webpage screenshot: #martingarrix YouTube

These pages aren't supported by the extractor. It would be great if it could extract them, so apps can open them when the user clicks a hashtag in a YouTube video's title or description.

Findings:

POST requests are made to https://www.youtube.com/youtubei/v1/browse?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8 for the initial response and, if the hashtag has more than 100 videos, for the continuation(s).

Body of the initial JSON request when browsing a hashtag page:

{
   "context":{
      "client":{
         "hl":"en-GB",
         "gl":"FR",
         "remoteHost":"2a01:cb1c:349:1800:81dc:f55b:f64f:2f08",
         "deviceMake":"",
         "deviceModel":"",
         "visitorData":"CgtoUUtZU0tSNTRPNCjtioKDBg%3D%3D",
         "userAgent":"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36,gzip(gfe)",
         "clientName":"WEB",
         "clientVersion":"2.20210324.02.00",
         "osName":"Windows",
         "osVersion":"6.3",
         "originalUrl":"https://www.youtube.com/hashtag/martingarrix",
         "platform":"DESKTOP",
         "clientFormFactor":"UNKNOWN_FORM_FACTOR",
         "timeZone":"Europe/Paris",
         "browserName":"Chrome",
         "browserVersion":"89.0.4389.90",
         "screenWidthPoints":1366,
         "screenHeightPoints":625,
         "screenPixelDensity":1,
         "screenDensityFloat":1,
         "utcOffsetMinutes":120,
         "userInterfaceTheme":"USER_INTERFACE_THEME_LIGHT",
         "mainAppWebInfo":{
            "graftUrl":"/hashtag/martingarrix",
            "webDisplayMode":"WEB_DISPLAY_MODE_BROWSER"
         }
      },
      "user":{
         "lockedSafetyMode":false
      },
      "request":{
         "useSsl":true,
         "internalExperimentFlags":[],
         "consistencyTokenJars":[]
      },
      "adSignalsInfo":{
         "params":[
            {
               "key":"dt",
               "value":"1616938354083"
            },
            {
               "key":"flash",
               "value":"0"
            },
            {
               "key":"frm",
               "value":"0"
            },
            {
               "key":"u_tz",
               "value":"120"
            },
            {
               "key":"u_his",
               "value":"7"
            },
            {
               "key":"u_java",
               "value":"false"
            },
            {
               "key":"u_h",
               "value":"768"
            },
            {
               "key":"u_w",
               "value":"1366"
            },
            {
               "key":"u_ah",
               "value":"728"
            },
            {
               "key":"u_aw",
               "value":"1366"
            },
            {
               "key":"u_cd",
               "value":"24"
            },
            {
               "key":"u_nplug",
               "value":"3"
            },
            {
               "key":"u_nmime",
               "value":"4"
            },
            {
               "key":"bc",
               "value":"31"
            },
            {
               "key":"bih",
               "value":"625"
            },
            {
               "key":"biw",
               "value":"1349"
            },
            {
               "key":"brdim",
               "value":"0,0,0,0,1366,0,1366,728,1366,625"
            },
            {
               "key":"vis",
               "value":"1"
            },
            {
               "key":"wgl",
               "value":"true"
            },
            {
               "key":"ca_type",
               "value":"image"
            }
         ]
      }
   },
   "browseId":"FEhashtag",
   "params":"6gUOCgxtYXJ0aW5nYXJyaXg%3D"
}

Response to the initial JSON request when browsing a hashtag page:

JSON Response Browse request.zip
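
Most of the captured body above is browser telemetry (remoteHost, screen sizes, adSignalsInfo). A minimal sketch of a trimmed initial body follows, assuming the browse endpoint accepts the same reduced context that the resolve_url example later in this thread uses; that assumption is untested here. The class and method names are hypothetical, and the string escaping only covers quotes and backslashes.

```java
final class HashtagBrowseBody {
    private HashtagBrowseBody() {}

    // Wraps a value in double quotes, escaping only backslashes and quotes.
    // Control characters are not handled; a real JSON writer should be used instead.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Assumption: a reduced client context is enough for the browse endpoint.
    static String context(String hl, String gl, String clientVersion) {
        return "{\"client\":{"
                + "\"hl\":" + quote(hl) + ","
                + "\"gl\":" + quote(gl) + ","
                + "\"clientName\":\"WEB\","
                + "\"clientVersion\":" + quote(clientVersion)
                + "}}";
    }

    // browseId is the constant seen in the captured request; params selects the hashtag.
    static String initial(String hl, String gl, String clientVersion, String params) {
        return "{\"context\":" + context(hl, gl, clientVersion) + ","
                + "\"browseId\":\"FEhashtag\","
                + "\"params\":" + quote(params) + "}";
    }
}
```

Calling HashtagBrowseBody.initial("en-GB", "FR", "2.20210324.02.00", "6gUOCgxtYXJ0aW5nYXJyaXg%3D") yields a body with the same browseId and params as the capture, minus the telemetry fields.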

Body of the JSON continuation request when browsing a hashtag page:

{
   "context":{
      "client":{
         "hl":"en-GB",
         "gl":"FR",
         "remoteHost":"2a01:cb1c:349:1800:81dc:f55b:f64f:2f08",
         "deviceMake":"",
         "deviceModel":"",
         "visitorData":"CgtoUUtZU0tSNTRPNCjtioKDBg%3D%3D",
         "userAgent":"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36,gzip(gfe)",
         "clientName":"WEB",
         "clientVersion":"2.20210324.02.00",
         "osName":"Windows",
         "osVersion":"6.3",
         "originalUrl":"https://www.youtube.com/hashtag/martingarrix",
         "platform":"DESKTOP",
         "clientFormFactor":"UNKNOWN_FORM_FACTOR",
         "timeZone":"Europe/Paris",
         "browserName":"Chrome",
         "browserVersion":"89.0.4389.90",
         "screenWidthPoints":1366,
         "screenHeightPoints":625,
         "screenPixelDensity":1,
         "screenDensityFloat":1,
         "utcOffsetMinutes":120,
         "userInterfaceTheme":"USER_INTERFACE_THEME_LIGHT",
         "connectionType":"CONN_CELLULAR_4G",
         "mainAppWebInfo":{
            "graftUrl":"https://www.youtube.com/hashtag/martingarrix",
            "webDisplayMode":"WEB_DISPLAY_MODE_BROWSER"
         }
      },
      "user":{
         "lockedSafetyMode":false
      },
      "request":{
         "useSsl":true,
         "internalExperimentFlags":[
            
         ],
         "consistencyTokenJars":[
            
         ]
      },
      "adSignalsInfo":{
         "params":[
            {
               "key":"dt",
               "value":"1616938354083"
            },
            {
               "key":"flash",
               "value":"0"
            },
            {
               "key":"frm",
               "value":"0"
            },
            {
               "key":"u_tz",
               "value":"120"
            },
            {
               "key":"u_his",
               "value":"8"
            },
            {
               "key":"u_java",
               "value":"false"
            },
            {
               "key":"u_h",
               "value":"768"
            },
            {
               "key":"u_w",
               "value":"1366"
            },
            {
               "key":"u_ah",
               "value":"728"
            },
            {
               "key":"u_aw",
               "value":"1366"
            },
            {
               "key":"u_cd",
               "value":"24"
            },
            {
               "key":"u_nplug",
               "value":"3"
            },
            {
               "key":"u_nmime",
               "value":"4"
            },
            {
               "key":"bc",
               "value":"31"
            },
            {
               "key":"bih",
               "value":"625"
            },
            {
               "key":"biw",
               "value":"1349"
            },
            {
               "key":"brdim",
               "value":"0,0,0,0,1366,0,1366,728,1366,625"
            },
            {
               "key":"vis",
               "value":"1"
            },
            {
               "key":"wgl",
               "value":"true"
            },
            {
               "key":"ca_type",
               "value":"image"
            }
         ]
      },
      "clientScreenNonce":"MC4yMzU0NzI4NjIwNzM3MTg5OA..",
      "clickTracking":{
         "clickTrackingParams":"CBsQ8eIEIhMI79mJyZXT7wIV0IhVCh3O0wDa"
      }
   },
   "continuation":"4qmFsgLSDBIJRkVoYXNodGFnGgZDRHclM0Q6vAxvdHJNMlFtbUNRcVZDUkw4Q0JJTkkyMWhjblJwYm1kaGNuSnBlQnJtQ0ZORWVVTkJVWFJXVGtkU2NHSkZWWGhhVjJjMVl6UkpRa015WkVSWFYwNUpaV3BLY2s1WVozZG5aMFZNVlVjMWQxaDZaRXBaV0VaWVRucFRRMEZSZEd0bGEyaHJZbnBTTldWSGJHdFpORWxDUXpGd01rMVdSbGRPYlhoNVdURTVXbWRuUlV4VVNFSnhXVEl3ZUZKcWFEQlhWR2xEUVZGek5XUnJNVzlQVjFrd1RWaENlRkpaU1VKRGVtc3hZV3R3TUZORldqVlZXR1JPWjJkRlRFMVhjRkpYV0VveVVtMVpNVlJyYlVOQlVYUXdZek5DVDJGNlRsUmtNVzgxWXpSSlFrTXdkSFZVUkVwVFUyeHdWVnBGUlRCblowVk1WakIwTVZsWVZuRlRWV2hEVmtSVFEwRlJkRU5TUnpscVkwTXhWMk5GVGpOWFdVbENRekJ3TkdWcmRFOVRSMXBQVlcxU1NtZG5SVXhoYkVKWVlraG9ibGR1YnpSTlJrZERRVkYwV0ZaSVNrZFpWR3hKV0RKNGQyRTBTVUpETUVac1pEQTFhMDFxYkROVmJGWk9aMmRGVEZac09YWk5SR1JHWWxaYU5VMHdSME5CVVhSVVlqTkpNazVFVm05aE1rNVJXVFJKUWtONlFUSmFWazU2VkRGa2FsTXhiRUpuWjBWTVlXMUpNVk14VWxoVFJFSjFVV3BEUTBGUmRFVmxWM2cyVWpGb1JsZ3liR2xXV1VsQ1F6QktWazU2YkVabFZURnhVMWhLVm1kblJVeGlSV1JFWW5wb1NsUklXbWhrVlcxRFFWRjBhazB6VmpSVVJXaDFUWGt4UzFWWlNVSkRNSGhHWVVSc1IwNXFaR0ZPVnpRMFoyZEZUR013Wkd0VlJYUm9ZVEZTVlUxVVEwTkJVWFJGWkZWYVZtUkZkelJsYkZaQ1lUUkpRa042VmxKTlJrcE5WMjA1Tm1GVE1XcG5aMFZNWVZab1NsSklVbTFOV0dSUlRVZGxRMEZSZEZGU2EzY3laRVJqZDFOdE5ESlhXVWxDUXpOT2JWVjZVa1pYVkUxM1YxaFNUbWRuUlV4amVtTjVVbFU1ZEZVeFduUmFWVEpEUVZGMGMxSnFVbXRWTWpVelZrZHNTMDlKU1VKRE1qa3lVa2RPVFZOSFJYbFVSRVoyWjJkRlRFOVdhRXhXYlZwellrVjBSbUpVYVVOQlVYUktWRlZyZDFkV1dqRlVNVkl3WXpSSlFrTXpSalJTYkZKTVVraG9ObU15ZEZKblowVk1ZMVp2ZUdOclNtRmFXSEJLVWtadFEwRlJkRWhXUjBaUVYwVlNkRkpFVG5oaE5FbENRek53YkZwSGNHbFZNMmgzWTIxS2JtZG5SVXhrYXpCNVpVWmthazFGV1hoa01ESkRRVkZ6TW1WRmFITlBTSEJxVkZVMWVrNUpTVUpETVZaVVRsUmFiR1ZyVGtOak1HUlNaMmRGVEUxRmNIUlRSMUo1V20wNU5sWkdSME5CVVhScVVWWmFkbGt5TVhCalZYTXhXVFJKUWtNd09YQk5iV1JxVVRCR01rOURNVE5uWjBWTVpERmtabGd3T1V4Vk0yaDRWVEJIUTBGUmRIcFhWVEV4Vmtkb01FNVVaRTlpTkVsQ1F6QmFhRmRxYkU1a1JYQm1UMGRPUm1kblJVeGpia0Y2VFZZNWNVOVhkSFZVVlcxRFFWRjBXVlJIWkVoTmExa3dUbXhzZW1FMFNVSkRNR1JLVjFWc2IxTlVUa0phTWtwT1oyZEZURm95U2xSVWEyOTNURmRhUkdGSGRVTkJVWE41VkZZNVEyVnNZM2hoYkVaMlRVbEpRa013VmtWTk1uQlFVVEZHVmxScmNFNW5aMFZNWTJzMGVXSldRWHBWU
kdnd1VUQnRRMEZSZEdoV1JYUnFZak5WTlZSc1dURk5TVWxDUXpOR1RHTklaRmRXVjFKUlVtdHNibWRuUlV4alZYaDBVMnRLUzJWc1VtbFZSMk1sTTBRZ0FGb0FJaFJpY205M2MyVXRabVZsWkVaRmFHRnphSFJoWnhJTWJXRnlkR2x1WjJGeWNtbDQ%3D"
}

Response to the JSON continuation request when browsing a hashtag page:

JSON Response Browse continuation request.zip
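
The two requests above suggest a simple paging loop: one initial call with browseId and params, then one call per continuation token until none is returned. The sketch below only illustrates that control flow. BrowseClient and Page are hypothetical types, and how the video IDs and the next token are read out of the response is deliberately left to the client, because the response layout is only available in the attached archives.

```java
import java.util.ArrayList;
import java.util.List;

final class HashtagPager {
    private HashtagPager() {}

    /** One page of results; continuation is null when no further page exists. */
    static final class Page {
        final List<String> videoIds;
        final String continuation;

        Page(List<String> videoIds, String continuation) {
            this.videoIds = videoIds;
            this.continuation = continuation;
        }
    }

    /** Hypothetical transport that performs the POST and parses the response. */
    interface BrowseClient {
        Page fetchInitial(String browseId, String params);

        Page fetchContinuation(String continuation);
    }

    /** Collects video IDs page by page, stopping after maxPages requests. */
    static List<String> fetchAll(BrowseClient client, String params, int maxPages) {
        List<String> all = new ArrayList<>();
        Page page = client.fetchInitial("FEhashtag", params);
        all.addAll(page.videoIds);
        int pages = 1;
        while (page.continuation != null && pages < maxPages) {
            page = client.fetchContinuation(page.continuation);
            all.addAll(page.videoIds);
            pages++;
        }
        return all;
    }
}
```

The maxPages cap is a design choice for the sketch: popular hashtags may have many pages, and an app would normally load them on demand rather than all at once.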

AudricV, Mar 28 '21 14:03

@TiA4f8R this should be implemented as a search filter, I think

Stypox, Mar 28 '21 16:03

Yes, I was thinking that too.

AudricV, Mar 28 '21 16:03

image

You need to encode the params value as a Protobuf field. I think adding support for this is technically infeasible without a Protobuf library.
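
For reference, the one sample in this issue (6gUOCgxtYXJ0aW5nYXJyaXg%3D) decodes to bytes EA 05 0E 0A 0C followed by "martingarrix", which reads as a length-delimited field 93 wrapping a length-delimited field 1 that holds the hashtag text. A minimal sketch of writing that by hand follows. The field numbers are inferred from this single sample, the use of URL-safe Base64 is an assumption (the sample contains no characters that would tell the two alphabets apart), and the class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class HashtagParams {
    // Assumption: both field numbers are inferred from the single sample in this issue.
    private static final int OUTER_FIELD = 93;
    private static final int INNER_FIELD = 1;
    private static final int WIRE_TYPE_LENGTH_DELIMITED = 2;

    private HashtagParams() {}

    // Writes a non-negative int as a Protobuf base-128 varint.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes one length-delimited field: tag, payload length, payload.
    static byte[] lengthDelimited(int fieldNumber, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | WIRE_TYPE_LENGTH_DELIMITED);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    // Builds the params string: nested message, Base64, then percent-encoding.
    static String encode(String hashtag) {
        byte[] inner = lengthDelimited(INNER_FIELD, hashtag.getBytes(StandardCharsets.UTF_8));
        byte[] outer = lengthDelimited(OUTER_FIELD, inner);
        String base64 = Base64.getUrlEncoder().encodeToString(outer);
        return URLEncoder.encode(base64, StandardCharsets.UTF_8);
    }
}
```

HashtagParams.encode("martingarrix") reproduces the params value from the captured request. Whether the same layout holds for every hashtag (for example non-ASCII ones) is not confirmed by anything in this thread.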

FireMasterK, Aug 06 '21 20:08

@FireMasterK Not really, we can use the resolve_url endpoint of the InnerTube API, which gives us everything we need!

Request URL: https://www.youtube.com/youtubei/v1/navigation/resolve_url?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8

Request body:

{
  "context": {
    "client": {
      "hl": "en-GB",
      "gl": "US",
      "clientName": "WEB",
      "clientVersion": "2.20220124.01.00"
    }
  },
  "url": "https://www.youtube.com/hashtag/martingarrix"
}

Which returns (responseContext object removed):

{
  "endpoint": {
    "clickTrackingParams": "value_removed",
    "commandMetadata": {
      "webCommandMetadata": {
        "url": "/hashtag/martingarrix",
        "webPageType": "WEB_PAGE_TYPE_BROWSE",
        "rootVe": 6827,
        "apiUrl": "/youtubei/v1/browse"
      }
    },
    "browseEndpoint": {
      "browseId": "FEhashtag",
      "params": "6gUOCgxtYXJ0aW5nYXJyaXg%3D"
    }
  }
}
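
A minimal sketch of this round trip in Java follows, assuming Java 11+ for java.net.http. It builds the POST request shown above and pulls browseId and params out of a response string. The class and method names are hypothetical, the page URL is assumed to contain no quotes or backslashes, and the regular-expression lookup merely stands in for a real JSON parser.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ResolveUrl {
    static final String ENDPOINT =
            "https://www.youtube.com/youtubei/v1/navigation/resolve_url"
                    + "?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8";

    private ResolveUrl() {}

    // Builds the request body shown above for the given page URL.
    // Assumption: pageUrl needs no JSON escaping.
    static String body(String pageUrl) {
        return "{\"context\":{\"client\":{\"hl\":\"en-GB\",\"gl\":\"US\","
                + "\"clientName\":\"WEB\",\"clientVersion\":\"2.20220124.01.00\"}},"
                + "\"url\":\"" + pageUrl + "\"}";
    }

    // Builds the POST request without sending it; the caller picks the HttpClient.
    static HttpRequest request(String pageUrl) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(pageUrl)))
                .build();
    }

    // Returns the first string value stored under the given key, or null if absent.
    // A stand-in for proper JSON parsing, good enough for the flat response above.
    static String field(String json, String name) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

With the response above, ResolveUrl.field(json, "browseId") and ResolveUrl.field(json, "params") return the two values needed for the browse request, so the params string never has to be constructed by the client.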

AudricV, Jan 24 '22 10:01