GearmanBundle
GearmanBundle copied to clipboard
doEnqueue and getStatus always create a new connection to gearmand
Is there any reason for creating a new connection for each enqueued job or each try to get a jobstatus? 44340/562347 [===>----------------------------------------------] 7% Elapsed: 4 mins [pid 5732] connect(9, {sa_family=AF_INET, sin_port=htons(4730), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 44341/562347 [===>----------------------------------------------] 7% Elapsed: 4 mins [pid 5732] connect(9, {sa_family=AF_INET, sin_port=htons(4730), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 44342/562347 [===>----------------------------------------------] 7% Elapsed: 4 mins [pid 5732] connect(9, {sa_family=AF_INET, sin_port=htons(4730), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 44343/562347 [===>----------------------------------------------] 7% Elapsed: 4 mins [pid 5732] connect(9, {sa_family=AF_INET, sin_port=htons(4730), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
With the default behaviour I get issues with many jobs (like 3000 per second) because of an unreachable (busy or out of sockets/fd/whatever) jobserver (connect returns -EADDRNOTAVAIL).
Maybe something like this could be a better way: 219 private function doEnqueue(array $worker, $params, $method, $unique) 220 { 221 if(!$this->gearmanClient) { 222 $this->gearmanClient = new \GearmanClient(); 223 $this->assignServers($this->gearmanClient); 224 } 225 226 return $this->gearmanClient->$method($worker['job']['realCallableName'], $params, $unique); 227 }
But I have no clue how to check for a valid connection - I just found the ping() method...
Hmmm. very interesting...
With this implementation, just one Client instance is created. A new instance should be created for every server collection, so we could use an identity map pattern for that.
How about a configuration flag to enable/disable this performance optimization?
Sounds good, but I wonder how errors (socket closed) could be handled.
Another question: I tried to overload the methods, but all are private. Is there any need for private methods?
nops.
Could you please PR these changes? All methods should be protected instead of private, to allow override them all.
I think there is no need to provide a configuration flag for this optimization. Isn't it a bad idea to create a brand new GearmanClient instance / connection for each job that has to be queued?
Afaik Gearman choose only the last added server to enqueue a job, so we also don't need one GearmanClient per Server?
The configuration flag is just to provide a Back Compatibility.
When you build a new GearmanClient, you must add worker servers, and can change between worker definitions, so if you want to have one instance for every connection, you should be able to clean up all servers and add them again ( And I don't know any way of doing this )...
you can manage many remote servers as you need, this is why I am talking about an identity map.
Am I following you?
Currently the client retrieves all default servers, and I can't tell how any additional servers can be added to it since the Service\GearmanClient::addServer isn't called.
Problem is, if the annotations on the workers specify a server not present in the root config, by default the Service\GearmanClient won't ever be able to use it.
For this issue it would be easier to merge the default servers with any annotated servers. You could then add all of them to the client, and for this you only need a single instance of \GearmanClient.
I actually hit this problem today and noticed that we do create a worker per time we call runTasks() or doBackground job, which as pointed out in previous comments on higher throughput will create exceptions.
@mmoreram After reviewing the comments here and looking through the code as far as I can tell the GearmanClient right now will always be created for any servers defined. So instead of always creating a new client per call, would we not listen for any calls to addServer, clearServers,setServer and blank out the client there. Otherwise it will reuse the existing client. This should result in a significantly higher performance than the existing client per call.
Let me know your thoughts.