
High latency on official instances

Open Benjamin-Loison opened this issue 2 years ago • 20 comments

TLDR: Hosting your own instance of the API solves the problem.

As these charts show, since about 04/24/2023, 17:00, both the no-key endpoint and the web-scraping endpoints of the official instances have been suffering from high latencies for unknown reasons.

Plots of historical latencies for the no-key and web-scraping endpoints respectively:

[Plot: quota]

[Plot: unusual]

As of today (14/05/23), yt.lemnoslife.com redirects, in a load-balancing manner, to both yt{1, 2}.lemnoslife.com, which do not suffer from issue #11, for unknown reasons. Historically yt.lemnoslife.com redirected to yt0.lemnoslife.com, which suffers a lot from issue #11.

Concerning solving #11 for yt0.lemnoslife.com, there are several possibilities:

  • [ ] make yt.lemnoslife.com redirect to yt0.lemnoslife.com only when it is not suffering from #11
  • [ ] contact YouTube UI servers with a different IP each time (without using Tor) - I have not dug into that much.
  • [ ] try to solve the reCAPTCHA using AI (an interesting try with OpenAI Whisper, but it recently seems to fail more and more); could also ask end-users to solve the captcha (audio, for simplicity of implementation), but that is to be avoided as much as possible, as it requires human work.
  • [x] make people share their instances to have more to rely on, as requested.
  • [ ] proxy YouTube UI requests through Tor - this currently requires rewriting the code using curl to use the Tor SOCKS5 proxy, and using the Tor network at such a scale is probably a bad idea.
  • [ ] add more instances on my own, but I am not aware of good free hosters (except Oracle, which I already use as much as possible), and hosting at home degrades too much the other web services running on the same machine.
  • [ ] #166
  • [ ] could also redirect to the no-key endpoint when the request permits it, but this looks like an epic amount of work.
  • [ ] could release the set of keys used for the no-key endpoint, but it may help Google take them down, and it would maybe be even more against the ToS. As an intermediary step, could provide an endpoint outputting a currently working YouTube Data API v3 key, or could release a ready-to-use YouTube Data API v3 key finder based on search engines, Stack Overflow, etc., but then these sources may be taken down. In the first case, could output the keys known to have the most quota in decreasing order, but I have the feeling that if someone really wanted to dump the whole key set, multi-threaded calls to the YouTube Data API v3 Search: list endpoint would make it doable (in addition to exhausting the official instance no-key service quota). Could implement this comment also for the no-key endpoint, then redirecting to YouTube Data API v3 endpoints.
  • [ ] could also define a per-user quota to better balance official instance usage among users, but that is something I want to avoid as much as possible.
  • [ ] using a lower-level language such as Rust may help.

Note that some users make wasteful (unless justified) requests, such as not exploiting the ability to pass multiple ids to the no-key endpoint. Could maybe focus on these wasteful requests; gathering statistics first could be interesting.
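To illustrate the batching point, here is a minimal sketch assuming that, like YouTube Data API v3, the no-key endpoint accepts several comma-separated ids in a single request (the video ids below are illustrative):

```python
# Assumption: the no-key endpoint accepts comma-separated ids, like
# YouTube Data API v3 does. Ids below are illustrative.
BASE = "https://yt.lemnoslife.com/noKey/videos"

video_ids = ["NIJ5RiMAmNs", "dQw4w9WgXcQ", "jNQXAC9IVRw"]

# Wasteful: one request per id (3 requests).
single_urls = [f"{BASE}?part=statistics&id={video_id}" for video_id in video_ids]

# Efficient: a single request carrying all ids at once.
batched_url = f"{BASE}?part=statistics&id={','.join(video_ids)}"

print(len(single_urls), "requests replaced by 1:", batched_url)
```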

Could also optimize performance.

I do not yet have clean, ready-to-analyze numbers of requests per day since the establishment of my official instances.

Benjamin-Loison avatar May 14 '23 13:05 Benjamin-Loison

Maybe by adding the keys to both new instances, these instances become indistinguishable, and so we could use DNS load balancing to reduce latency (related to #69).

The only downside is that metrics should be reconsidered when doing so. Concerning an instance being detected as sending unusual traffic, the metrics do not need much adaptation: we test yt.lemnoslife.com every minute, and it redirects randomly to a different instance at every GET call, so YouTube detection spans more than tens of minutes. Concerning the no-key service, the different instances share their keys, so if one is missing quota, then all are missing quota. To conclude, the metrics do not need much adaptation.

Maybe adding the domain name yt0.lemnoslife.com would make sense.

UPDATE:

I added yt0.lemnoslife.com, solved a problem that was causing #127 and added load balancing at DNS level.

Note that

curl https://yt.lemnoslife.com

potentially redirects to a different instance at each request. To pin a specific instance for test purposes, use:

curl --resolve yt.lemnoslife.com:443:INSTANCE_IP https://yt.lemnoslife.com

That way, with probability:

  • 1/3, yt.lemnoslife.com resolves to yt0.lemnoslife.com, which is currently detected as sending unusual traffic, so yt0.lemnoslife.com proceeds with the answer by calling either yt1.lemnoslife.com or yt2.lemnoslife.com and returns that result to the end user
  • 2/3, yt.lemnoslife.com resolves either to yt1.lemnoslife.com or yt2.lemnoslife.com

So the recent work makes yt.lemnoslife.com (and yt0.lemnoslife.com) work even when only the end server yt0.lemnoslife.com is detected as sending unusual traffic, and thanks to load balancing, in 2/3 of cases the request processing takes the usual time and in 1/3 of cases it doubles.
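As a sanity check of the reasoning above: with the usual processing time t, the expected request time under this scheme is (2/3)·t + (1/3)·2t = (4/3)·t, i.e. a 33 % average overhead compared to a fully healthy pool:

```python
# t is the usual processing time (arbitrary unit). 2/3 of requests take t,
# 1/3 go through the extra yt0 hop and take 2t.
t = 1.0
expected = (2 / 3) * t + (1 / 3) * (2 * t)
print(expected)  # 4/3 of the usual time
```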

Note that in theory we could even change the DNS to not propose instances that are detected as sending unusual traffic. However, this would require regularly checking whether a given instance is no longer detected as sending unusual traffic; thanks to a crontab, this is easily and efficiently doable.
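A minimal sketch of such a crontab-driven health check; the latency threshold and the `re_enable_in_dns` helper in the commented usage are illustrative assumptions, not the instance's real code:

```python
import time

def check_instance(fetch, threshold_seconds=5.0):
    """Return True if the instance answers quickly enough, False if it is
    slow or failing (e.g. temporarily detected as sending unusual traffic)."""
    start = time.monotonic()
    try:
        fetch()
    except Exception:
        return False
    return time.monotonic() - start <= threshold_seconds

# Hypothetical crontab-driven usage (helper names are illustrative):
# if check_instance(lambda: urllib.request.urlopen(
#         "https://yt1.lemnoslife.com/", timeout=30).read()):
#     re_enable_in_dns("yt1")  # re_enable_in_dns is a hypothetical helper

print(check_instance(lambda: None))
```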

Note that I added an instance name to index.php to distinguish end instances used due to the load balancer (49d7552fe9911559c3c0712871871836c438b64c).


Concerning high latency, I plot the logged latencies with the following Python script:

import datetime
import os

import matplotlib.pyplot as plt

path = '/home/benjamin/Desktop/'
os.chdir(path)

X, Y = [], []

with open('checkUnusualLogs.txt') as f:
    for line in f.read().splitlines():
        lineParts = line.split()
        # The latency is the last field, unless the last field is an IPv4
        # address (three dots); then the latency is the second-to-last field.
        linePart = lineParts[-1]
        if linePart.count('.') == 3:
            linePart = lineParts[-2]
        latency = float(linePart)
        # The first two fields form the timestamp, e.g. "05/14/2023, 17:00:00".
        date = datetime.datetime.strptime(' '.join(lineParts[:2]), '%m/%d/%Y, %H:%M:%S')
        X += [date]
        Y += [latency]

plt.title('Historical latencies for the web-scraping endpoints')
plt.xlabel('Request date')
plt.ylabel('Latency in seconds')
plt.plot(X, Y)
plt.show()

As a first iteration, I added to my notification system an alert when the latency of a request exceeds 50 seconds. I am also working on lowering the current latencies.

~~Adding to logs which instance was reached would be nice.~~ Done.

Should draw the curves with a different color per instance and make axes clearer.
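A sketch of per-instance grouping, assuming a hypothetical log format where the instance name is the third field (the real format may differ); once grouped, one plt.plot call per series lets matplotlib assign each instance its own color:

```python
import datetime
from collections import defaultdict

# Hypothetical log line format after the instance-name change:
# "05/14/2023, 17:00:00 yt1 2.34" (date, time, instance, latency in seconds).
lines = [
    "05/14/2023, 17:00:00 yt0 12.5",
    "05/14/2023, 17:01:00 yt1 1.2",
    "05/14/2023, 17:02:00 yt0 60.1",
]

series = defaultdict(lambda: ([], []))
for line in lines:
    date_part, time_part, instance, latency = line.split()
    date = datetime.datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y, %H:%M:%S")
    xs, ys = series[instance]
    xs.append(date)
    ys.append(float(latency))

# One plot call per instance, so pyplot cycles a distinct color for each:
# for instance, (xs, ys) in sorted(series.items()):
#     plt.plot(xs, ys, label=instance)
# plt.legend()
print(sorted(series))
```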


Concerning using instances only when they are not detected as unusual, to minimize latency both for the no-key service and the YouTube UI scraping one:

Note that flock (usable from crontab) is compatible with PHP's flock (cf. Example #1), and when one is holding the lock, the other waits. The behavior is identical to that previously described between two executions of the PHP script trying to lock; cf. the LOCK_NB comment for a try-lock behavior.

Could proceed as follows:

  • On a user PHP request, if we receive the error that the instance is detected as unusual, try locking to disable the instance in the DNS ~~and un-comment the crontab job described below (how to lock such that the machine user does not change the crontab for another purpose at the same time?)~~ (then we have to comment the crontab job again, but that requires a mutex on every call, as PHP does not remember whether the instance is temporarily banned), and finally redirect the request to another instance. Note that we could potentially avoid the first request to the YouTube UI servers by checking whether we are within the lowest temporary ban period, but then we cannot lower this period anymore, if it is not a constant. Keeping logs of the described process could help gather statistics about the temporary ban periods to optimize the process.
  • The round-the-clock crontab job consists in checking whether the instance is still banned; if not, enable the instance in the DNS with a (non-try) lock. Could optimize by setting a timer to the lowest temporary ban period in the preceding item instead of regularly using crontab.

Could maybe optimize by using a memory-only mutex mechanism.
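For reference, the try-lock behavior discussed above (flock(2) with LOCK_NB, the same primitive underlying both the flock command and PHP's flock()) can be sketched in Python; the lock file path is an arbitrary, illustrative choice:

```python
import fcntl

# Try-lock sketch: LOCK_NB makes flock() fail immediately instead of
# blocking when another process already holds the lock.
# The lock file path is an arbitrary, illustrative choice.
lock_file = open("/tmp/dns_update.lock", "w")
try:
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    holds_lock = True   # we own the lock: safe to update the DNS here
except BlockingIOError:
    holds_lock = False  # another process (e.g. the crontab job) holds it

print(holds_lock)
lock_file.close()  # closing the file releases the lock
```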

Note that we try to work at the DNS level to minimize latency and workload by not redirecting from one instance to another, but then the no-key service access is limited in the same way as the YouTube UI scraping endpoints, because the current URL and DNS scheme works on an all-or-nothing per-instance approach, while it could be interesting to minimize the no-key service latency too.

Benjamin-Loison avatar May 14 '23 13:05 Benjamin-Loison

Note that on the charts we seem to notice horizontal lines at every 60 seconds of latency; could this be related to the every-minute crontab?
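One way to test this hypothesis would be to count how many logged latencies sit near a multiple of 60 seconds; the latency values and tolerance below are illustrative, not real measurements:

```python
# Illustrative latencies in seconds, not real measurements.
latencies = [1.2, 59.8, 60.3, 120.1, 2.5, 180.4, 60.0]

def near_minute_multiple(latency, tolerance=1.0):
    """True if latency is within `tolerance` of a positive multiple of 60 s."""
    remainder = latency % 60
    return min(remainder, 60 - remainder) <= tolerance and latency >= 60 - tolerance

clustered = [latency for latency in latencies if near_minute_multiple(latency)]
print(len(clustered), "of", len(latencies), "latencies sit near a multiple of 60 s")
```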

Benjamin-Loison avatar May 16 '23 19:05 Benjamin-Loison

An interesting question is what we actually want to minimize: worst latency, best latency, average latency, median latency, or cumulative latency?
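For a given latency sample, these candidate objectives are straightforward to compare (the values below are illustrative):

```python
import statistics

# Illustrative latency sample, in seconds.
latencies = [1.2, 2.5, 60.3, 3.1, 59.8]

metrics = {
    "worst": max(latencies),
    "best": min(latencies),
    "average": statistics.mean(latencies),
    "median": statistics.median(latencies),
    "cumulative": sum(latencies),
}
print(metrics)
```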

Benjamin-Loison avatar May 16 '23 19:05 Benjamin-Loison

Could also rent private instances, but that could be particularly against the ToS.

Related to #94.

Benjamin-Loison avatar May 22 '23 22:05 Benjamin-Loison

Should monitor not a random end-server but all of them.

Benjamin-Loison avatar May 25 '23 21:05 Benjamin-Loison

About high latency (notably on https://travian.lemnoslife.com, which is currently hosted on the same machine as https://yt.lemnoslife.com) while the CPU is not completely used and the network is not heavily used:

ps aux | grep apache | wc -l

returns the number found in /etc/apache2/mods-available/mpm_{worker,prefork,event}.conf; after sudo service apache2 reload, the new value is taken into account. Increasing this value may reduce the latency, but this is not yet tested enough in practice to say so.

Should investigate above files more deeply.

Benjamin-Loison avatar Nov 07 '23 00:11 Benjamin-Loison

While I cannot guarantee keeping the web-scraping and no-key endpoints working, I could favor the main page describing them, to showcase the API features. Could have a dedicated instance with its endpoints disabled, or could update README.md or similar documentation.

Comment following this Discord message.

Benjamin-Loison avatar Feb 22 '24 10:02 Benjamin-Loison

Maybe the issue is that HTTPS encryption is too CPU-consuming, as I had the feeling that:

curl 'http://localhost/videos?part=mostReplayed&id=NIJ5RiMAmNs'

was working fine and quickly (1-2 seconds), while:

curl -k 'https://localhost/videos?part=mostReplayed&id=NIJ5RiMAmNs'

takes around 20 seconds.
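A small helper to time both variants of the same request would make this comparison reproducible; the commented usage against the local instance is a sketch (urllib.request would need importing, and certificate checking disabled to mirror `curl -k`), while the executed line uses a stand-in workload:

```python
import time

def timed(fetch):
    """Return (elapsed_seconds, result) for a single call, to compare the
    plain-HTTP and HTTPS variants of the same request."""
    start = time.monotonic()
    result = fetch()
    return time.monotonic() - start, result

# Hypothetical usage against the local instance (URL from the curl command above):
# http_seconds, _ = timed(lambda: urllib.request.urlopen(
#     "http://localhost/videos?part=mostReplayed&id=NIJ5RiMAmNs").read())

elapsed, value = timed(lambda: sum(range(1000)))  # stand-in workload
print(round(elapsed, 6), value)
```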

Following this Discord comment, I notice that access.log is not getting updated much, maybe because the server is working hard to keep up with the current requests.

Benjamin-Loison avatar Mar 02 '24 06:03 Benjamin-Loison

Sometimes I get:

The YouTube operational API instance `yt` `noKey/videos` is not working correctly!

but I do not get the same error message for the web-scraping endpoint, maybe because the instance is detected as abusive; but that is not currently the case (I am not experiencing the above message; just to make clear that it is not permanently banned at least). Noticing this is weird. If I remember correctly, the response is actually empty, possibly due to official instance overload.

Benjamin-Loison avatar Mar 09 '24 14:03 Benjamin-Loison

Other complaints about latency, and possibly even timeouts: https://discord.com/channels/933841502155706418/933841503103627316/1280002118501138575.

Benjamin-Loison avatar Sep 02 '24 14:09 Benjamin-Loison

Could get rid of the no-key service by just providing how I got the list of leaked YouTube Data API v3 keys (YouTube_Data_API_v3_key_web_scraper) and providing the scraper itself. In theory, I believe that YouTube has the right to terminate an API key found in a public place, it is not a complex process, and I guess that from the key usage history they can trace back where I found it. I was a bit afraid of the legal aspect, but given what I just stated it seems fine in my opinion, especially as just providing the list as-is makes YouTube's work of terminating the keys easier; in theory they can terminate one by one those powering my no-key service, it is just a bit more work, so doing so may break the no-key service.

wc -l /var/log/apache2/access.log
3179674 /var/log/apache2/access.log
grep -v '/noKey/' /var/log/apache2/access.log | wc -l
83318
grep '/noKey/' /var/log/apache2/access.log | wc -l
3096267

So 97.38 % of the requests are no-key service ones, so publishing keys.txt would let people host their own no-key service for their private or public usage, and they would rely less and less on the official instance. People complaining would that way have the alternative of hosting their own no-key service.

Measuring the approximate quota consumption may be interesting, to guess precisely how many YouTube Data API v3 Search: list endpoint requests would be equivalent to the current quota consumption. In fact, we know that it is bounded by the current number of requests, which is more than 3 million. So someone willing to make a public keys.txt useless would need to perform about 3 million requests per day.
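Recomputing the share from the access.log counts above:

```python
# Counts from the wc -l / grep output above.
total_requests = 3179674
no_key_requests = 3096267

share = no_key_requests / total_requests
print(f"{share:.2%}")  # the 97.38 % figure quoted above
```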

Note that making keys.txt public may make private instance owners who are only interested in the no-key service no longer willing to renew their instances. Could notify them or not if keys.txt becomes public, especially if there is no significant web-scraping usage.

Benjamin-Loison avatar Sep 02 '24 14:09 Benjamin-Loison

Could give a try to Trusted Execution Environments (cf. Wikipedia), even if I do not like them.

Benjamin-Loison avatar Sep 12 '24 11:09 Benjamin-Loison

Concerning the homepage, could save it as static HTML, without the ability to add a YouTube Data API v3 key to the no-key service.

Benjamin-Loison avatar Sep 22 '24 12:09 Benjamin-Loison

https://discord.com/channels/933841502155706418/933841503103627316/1292243696707698728

Benjamin-Loison avatar Oct 05 '24 22:10 Benjamin-Loison

ls /etc/apache2/mods-enabled/*mpm*
/etc/apache2/mods-enabled/mpm_prefork.conf  /etc/apache2/mods-enabled/mpm_prefork.load

Benjamin-Loison avatar Oct 05 '24 22:10 Benjamin-Loison

/etc/apache2/mods-enabled/mpm_prefork.conf:

# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxRequestWorkers: maximum number of server processes allowed to start
# MaxConnectionsPerChild: maximum number of requests a server process serves

StartServers            5
MinSpareServers         5
MaxSpareServers         10
MaxRequestWorkers       256
MaxConnectionsPerChild  0

Multiplying these values by 4 does not seem to help. Could automate an exponential increase and test.

sudo reboot

solved the issue for the moment.

Benjamin-Loison avatar Oct 05 '24 22:10 Benjamin-Loison

ls /etc/apache2/mods-available/*mpm*
/etc/apache2/mods-available/mpm_event.conf  /etc/apache2/mods-available/mpm_prefork.conf  /etc/apache2/mods-available/mpm_worker.conf
/etc/apache2/mods-available/mpm_event.load  /etc/apache2/mods-available/mpm_prefork.load  /etc/apache2/mods-available/mpm_worker.load

Benjamin-Loison avatar Oct 05 '24 22:10 Benjamin-Loison

Related to #303#issuecomment-2351225183.

Benjamin-Loison avatar Oct 05 '24 22:10 Benjamin-Loison