YouTube-operational-API

Optimize performance

Open Benjamin-Loison opened this issue 3 years ago • 14 comments

See the YouTube Data API v3 documentation on optimizing performance.

Does adding compression make sense? It is included in Apache and curl by default, isn't it?

We could consider a compressed parameter to decrease server workload, but I don't think it is worth it.

Related to #27 and #35.

Benjamin-Loison avatar Nov 10 '22 18:11 Benjamin-Loison

First, we need to find out how to proceed without removing -H 'Accept-Encoding: gzip, deflate, br' from a cURL request, and why gunzip sometimes fails (when?). If only gzip is provided as Accept-Encoding, the response is always compatible with gunzip. It is a bit weird (maybe due to their relative load) that the YouTube API does not have a preferred compression method.

Accept-Encoding documentation.
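A possible explanation for the gunzip failures is that, when several encodings are offered, the server may answer with br or deflate, which gunzip cannot read. A minimal sketch, assuming the client inspects the Content-Encoding response header before decoding; decode_body is a hypothetical helper, not part of this project, and Brotli is left out because it needs a third-party library:

```python
import gzip
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Decode an HTTP body according to its Content-Encoding header value.

    Hypothetical helper for illustration only.
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("", "identity"):
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # HTTP "deflate" is normally zlib-wrapped, but some servers send raw
        # deflate streams, so fall back to a headerless decode.
        try:
            return zlib.decompress(body)
        except zlib.error:
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # "br" (Brotli) is not in the standard library; handling it would require
    # a third-party package, which this sketch deliberately avoids.
    raise ValueError(f"unsupported Content-Encoding: {encoding!r}")


payload = b'{"items": []}' * 10
print(decode_body(gzip.compress(payload), "gzip") == payload)
```

Offering only the encodings the client can actually decode (for example gzip alone, as observed above) avoids the unsupported branch entirely.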

curl -v 'https://www.googleapis.com/youtube/v3/playlistItems?part=snippet,contentDetails,status&playlistId=UUAcAnMF0OrCtUep3Y4M-ZPw&maxResults=50&key=AIzaSy...'

This returns 166,527 bytes of content according to ls -l.

When adding -H 'Accept-Encoding: gzip, deflate, br' > a && gunzip -c a:

Total packets length according to Wireshark for the Google API instance IP: 46640

Otherwise, when adding > a && cat a:

Total packets length according to Wireshark for the Google API instance IP: 187614
Total packets length according to Wireshark for the Google API instance IP: 187713
Executed twice to verify the order of magnitude.
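To make the order of magnitude explicit, the measurements above can be turned into a ratio. This is only arithmetic on the three Wireshark figures quoted in this comment; the variable names are illustrative:

```python
# Wireshark totals quoted above, in bytes on the wire.
compressed_total = 46640
uncompressed_totals = [187614, 187713]  # the two uncompressed runs

uncompressed_mean = sum(uncompressed_totals) / len(uncompressed_totals)
ratio = compressed_total / uncompressed_mean
saving_percent = (1 - ratio) * 100

print(f"compressed / uncompressed = {ratio:.3f}")
print(f"bandwidth saved = {saving_percent:.1f} %")
```

With these figures the compressed transfer is roughly a quarter of the uncompressed one, for this single playlistItems request.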

The question is: does the API's file_get_contents use compression? I also need to measure the CPU load of my instances to verify that compression wouldn't add an unacceptable CPU overhead.

I added the following to the crontab of each official instance:

* * * * * (date && cat /proc/loadavg && cat /proc/meminfo | head -n 3) > health.txt
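Since the redirection uses >, health.txt only ever holds the latest snapshot. A minimal sketch of how such a snapshot could be read back, assuming the first three lines of /proc/meminfo are MemTotal, MemFree and MemAvailable; parse_health and the sample values are hypothetical, not taken from a real instance:

```python
def parse_health(text):
    """Parse one health.txt snapshot: date line, /proc/loadavg line, meminfo lines."""
    lines = text.strip().splitlines()
    # /proc/loadavg starts with the 1, 5 and 15 minute load averages.
    load1, load5, load15 = (float(value) for value in lines[1].split()[:3])
    memory_kb = {}
    for line in lines[2:]:
        key, value = line.split(":", 1)
        memory_kb[key.strip()] = int(value.split()[0])  # values are in kB
    return {"date": lines[0], "load": (load1, load5, load15), "memory_kb": memory_kb}


# Made-up sample, only shaped like the output of the crontab command.
SAMPLE = """Mon Dec 26 10:00:01 UTC 2022
0.42 0.36 0.30 1/215 12345
MemTotal:        2030400 kB
MemFree:          152340 kB
MemAvailable:    1203456 kB
"""

health = parse_health(SAMPLE)
print(health["load"], health["memory_kb"]["MemAvailable"])
```

Comparing the load averages before and after enabling compression would give a rough idea of the CPU cost.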

Benjamin-Loison avatar Dec 25 '22 20:12 Benjamin-Loison

We could also investigate the HTTP Range header.

Benjamin-Loison avatar Mar 04 '23 15:03 Benjamin-Loison

We could also offer maxResults and fields parameters, as requested on Discord (maxResults and fields). Another Discord user also expected maxResults to work.
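A minimal sketch of what such parameters could do to a response before it is sent, assuming a much simpler fields syntax than the YouTube Data API v3 one (comma-separated, slash-separated paths only). limit_and_select and select_fields are hypothetical names, and the sketch is in Python purely for illustration:

```python
def _prune(value, tree):
    """Keep only the keys of `value` that appear in the selection `tree`."""
    if not tree:  # leaf of the selection: keep the whole value
        return value
    if isinstance(value, list):
        return [_prune(item, tree) for item in value]
    if isinstance(value, dict):
        return {key: _prune(sub, tree[key]) for key, sub in value.items() if key in tree}
    return value


def select_fields(data, paths):
    """Filter `data` down to the given slash-separated paths."""
    tree = {}
    for path in paths:
        node = tree
        for key in path.split("/"):
            node = node.setdefault(key, {})
    return _prune(data, tree)


def limit_and_select(response, max_results=None, fields=None):
    """Apply hypothetical maxResults and fields parameters to a response dict."""
    result = dict(response)
    if max_results is not None and isinstance(result.get("items"), list):
        result["items"] = result["items"][:max_results]
    if fields:
        paths = [path.strip() for path in fields.split(",") if path.strip()]
        result = select_fields(result, paths)
    return result


response = {
    "kind": "example",
    "items": [
        {"id": "a", "snippet": {"title": "T1", "description": "D1"}},
        {"id": "b", "snippet": {"title": "T2", "description": "D2"}},
        {"id": "c", "snippet": {"title": "T3", "description": "D3"}},
    ],
}
print(limit_and_select(response, max_results=2, fields="items/id,items/snippet/title"))
```

Filtering after scraping only reduces the bandwidth between the instance and its client; the request made to YouTube stays the same size.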

Benjamin-Loison avatar Mar 13 '23 13:03 Benjamin-Loison

Note that channels?part=community sometimes returns empty pages when using nextPageToken, as the YouTube UI does. However, according to amatis on Matrix, this may happen when there is no more data afterwards, so we could try to find an optimization that avoids making a few empty requests at the end.
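One possible heuristic, sketched under the unverified assumption that an empty page is never followed by more data: stop paginating after a configurable number of consecutive empty pages. iterate_items and fetch_page are hypothetical names, and fetch_page stands in for whatever performs the actual request:

```python
def iterate_items(fetch_page, max_empty_pages=1):
    """Yield items page by page, stopping early after consecutive empty pages.

    `fetch_page(token)` is a hypothetical callable returning a dict with an
    "items" list and an optional "nextPageToken".
    """
    token = None
    empty_streak = 0
    while True:
        page = fetch_page(token)
        items = page.get("items", [])
        if items:
            empty_streak = 0
            yield from items
        else:
            empty_streak += 1
            if empty_streak >= max_empty_pages:
                return  # assume nothing follows the empty page(s)
        token = page.get("nextPageToken")
        if token is None:
            return


# Fake paginated data: two pages with posts, then three empty trailing pages.
FAKE_PAGES = {
    None: {"items": [1, 2], "nextPageToken": "t1"},
    "t1": {"items": [3], "nextPageToken": "t2"},
    "t2": {"items": [], "nextPageToken": "t3"},
    "t3": {"items": [], "nextPageToken": "t4"},
    "t4": {"items": []},
}
requested = []


def fake_fetch(token):
    requested.append(token)
    return FAKE_PAGES[token]


print(list(iterate_items(fake_fetch)), requested)
```

If empty pages can also occur in the middle of the data, a larger max_empty_pages trades a few extra requests for safety.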

Benjamin-Loison avatar Mar 19 '23 07:03 Benjamin-Loison

Increasing priority following this Discord message. Depending on the endpoint you are using, there may be alternative webpages that consume less bandwidth to retrieve.

By the way, regarding membership: true, I adapted it to my site and it was much faster than yours. If you want it to be faster, instead of this URL:

if ($options['membership']) {
    $result = getJSONFromHTML("https://www.youtube.com/channel/$id");

use this URL:

if ($options['membership']) {
    $result = getJSONFromHTML("https://www.youtube.com/channel/$id/search");

Because your URL: 880kb. My URL: 400kb. It can pull and query faster. 40% speed difference.

Source: private Discord message from (788496476187263026)

Benjamin-Loison avatar Nov 14 '23 12:11 Benjamin-Loison

Being able to reverse-engineer the continuation token would avoid the first request made only to obtain it, and so improve performance, cf. #190. This would possibly leave a single code path instead of a first HTML web-scraping request followed by JSON continuation requests.

Benjamin-Loison avatar Nov 24 '23 15:11 Benjamin-Loison

Currently tests are parsed during production delivery...

Benjamin-Loison avatar Dec 03 '23 04:12 Benjamin-Loison

As in #258, we can use the YouTube UI browse endpoint to retrieve only JSON rather than HTML containing JSON.

Benjamin-Loison avatar Mar 31 '24 20:03 Benjamin-Loison

Someone on Discord told me to receive only JSON thanks to the YouTube UI browse endpoint; this seems related to #252.

Benjamin-Loison avatar May 01 '24 15:05 Benjamin-Loison

Regexes do not seem to be compilable in PHP, contrary to the Python re module, see https://www.php.net/manual-lookup.php?pattern=preg_compile.
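For comparison, this is what precompilation looks like with the Python re module: the pattern is compiled once and the compiled object is reused for every page. The pattern and the ytInitialData variable name are assumptions made for the example, not taken from this project's code:

```python
import json
import re

# Compiled once at import time, then reused for every HTML page.
# The pattern itself is illustrative only.
INITIAL_DATA_RE = re.compile(r"var ytInitialData = (\{.*?\});</script>", re.DOTALL)


def extract_json(html):
    """Return the embedded JSON object as a dict, or None if it is absent."""
    match = INITIAL_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


html = '<html><script>var ytInitialData = {"a": {"b": 1}};</script></html>'
print(extract_json(html))
```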

Benjamin-Loison avatar Oct 11 '24 18:10 Benjamin-Loison

Maybe multithreading would help.

Related to YouTube_Data_API_v3_key_web_scraper/issues/2.
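As an illustration of the idea rather than of this project's PHP code, a minimal Python sketch that runs several independent requests in parallel threads. fetch_all is a hypothetical helper, and the fetch callable is a stand-in so that the example stays offline:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=4):
    """Call `fetch(url)` for every URL in parallel threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def fake_fetch(url):
    # Stand-in for a real HTTP request; returns something derived from the URL.
    return f"response for {url}"


urls = [f"https://example.com/page/{index}" for index in range(5)]
for line in fetch_all(urls, fake_fetch):
    print(line)
```

Threads mostly help when the time is spent waiting on the network; they would not speed up the parsing of a single large page.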

Benjamin-Loison avatar Oct 11 '24 19:10 Benjamin-Loison

https://discord.com/channels/@me/1075498558259724362/1294495326635560982 old private instance owner requested this feature.

Benjamin-Loison avatar Oct 12 '24 03:10 Benjamin-Loison