Requests not asynchronous
Hi there,
I've been testing this library, and both your test script and another script I wrote with the library show that the requests are not asynchronous.
In all my tests, when I loop over an array and call startRequest() in each iteration, each request doesn't start until the previous one finishes.
Your suggested test mechanism of changing the max requests has no effect on load time.
Are there any server-side options required to enable curl_multi?
-Oliver
Hi Oliver, I'm sorry to hear that you're hitting problems! I haven't encountered this particular one, but it does sound like the underlying curl_multi isn't doing the right thing as you mention. One approach to debugging it would be to try some other curl_multi example code and see what results you get.
Pete
Thanks for the response Pete. That's a good call. I'm running some tests with curl_multi_exec() directly. I'll let you know if I discover anything interesting/useful.
I thought I'd let you know that I was able to get my case working by using curl_multi_exec() directly. As you put it, the interface is in fact "painfully confusing". That said, if I figure out why my use case wouldn't work, I'll be sure to report back and contribute if appropriate. Your library really fills a gap in functionality.
Any updates on this issue? I'm having the same problem with the class. I just ran a test script, and startRequest() blocks until it gets a response, which defeats the entire purpose of ParallelCurl. The test makes two requests (using startRequest()) to a page that sleeps for a few seconds, then echoes at various points to show the order of execution.
Test script:
<?php
// Test page that requests itself instead of using Google.
if (isset($_GET['sleep'])) {
    $time_limit = (int) ini_get('max_execution_time');
    $time = isset($_GET['t']) ? (int) $_GET['t'] : $time_limit;
    set_time_limit($time + 1);
    sleep($time);
    echo '*page response*';
    exit; // prevents an endless loop of requests
}

class ScriptTimer { // simple timer class
    private $_start_time;

    public function __construct() {
        $this->_start_time = microtime(true);
    }

    // Returns the seconds elapsed since construction as a float,
    // so callers can multiply or sum the result without string conversion.
    public function stop() {
        return microtime(true) - $this->_start_time;
    }
}

// Load the ParallelCurl class
include_once('ParallelCurl.php');

$pc = new ParallelCurl(10);
$times = 0;

function process_pc($content, $url, $ch, $user_data) { // callback
    global $times;
    // echo $content . '<br>';
    $request_time = $user_data['timer']->stop(); // seconds, as a float
    echo ($request_time * 1000) . 'ms<br>';
    $times += $request_time; // running sum of all request times
}

$server = $_SERVER['HTTP_HOST'];
$current_page = $_SERVER['SCRIPT_NAME'];
$urls_to_request = [
    'http://' . $server . $current_page . '?sleep&t=1',
    'http://' . $server . $current_page . '?sleep&t=1',
    'http://' . $server . $current_page . '?sleep&t=1',
    'http://' . $server . $current_page . '?sleep&t=1',
];

$timer = new ScriptTimer;
foreach ($urls_to_request as $url) {
    $pc->startRequest($url, 'process_pc', ['timer' => new ScriptTimer]);
}
echo "ALL 'startRequest()'s FINISHED RUNNING <br>";

$pc->finishAllRequests();
echo "<b>ALL REQUESTS COMPLETED</b> <br>";
echo 'Total time: ' . ($timer->stop() * 1000) . 'ms<br>';
// The sum should theoretically be greater than the total if the requests were sent in parallel.
echo 'Sum of all requests: ' . ($times * 1000) . 'ms<br>';
?>
I ran this on a shared hosting server and it consistently works as expected, but when I run it on localhost (MAMP) it usually runs synchronously. If I wait a few minutes before running it again, it sends the requests in parallel, but after that it goes back to running synchronously. I'm not aware of any dependencies that curl_multi relies on, or anything external that could be causing this.
I wound up just using the underlying curl_multi_exec() function. One possibility I considered is that I was trying to use the library inside a method of a CodeIgniter class. That said, I set the callback variable to NULL, as I didn't need it anyway and wasn't sure how that declaration would work inside a method.
Can you run my test script a few times in the environment where you were originally having the problems and see what it says? It uses the ParallelCurl class directly, and it's still not running asynchronously for me. If using the underlying curl_multi_exec() function works and this doesn't, then the problem is with ParallelCurl.
Also, if you could post a test script using curl_multi directly, that would be great, because I don't quite understand how it works yet.
Pete, can you please look into this? People are saying that ParallelCurl has not been working correctly since PHP 5.5. I really want to use this rather than RollingCurl, because it's quite a bit simpler and RollingCurl doesn't support user data. I would really appreciate it.
Here is the example I used for curl_multi_exec(), which is working as desired on PHP 5.5.
http://rustyrazorblade.com/2008/02/curl_multi_exec/
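For anyone else who asked for a test script that uses curl_multi directly, here is a minimal sketch of the usual pattern. It is not the code from the linked post and not ParallelCurl's implementation. The helper name fetch_all_parallel and its parameters are made up for illustration, and it assumes the cURL extension is enabled.

```php
<?php
// Hypothetical helper (not part of ParallelCurl): fetch several URLs
// concurrently and return their bodies under the same keys as $urls.
function fetch_all_parallel($urls, $timeout_secs = 10) {
    $multi = curl_multi_init();
    $handles = [];

    // Register one easy handle per URL before any transfer is driven.
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout_secs);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive every transfer together. curl_multi_select() waits for socket
    // activity so the loop does not spin at 100% CPU.
    do {
        $status = curl_multi_exec($multi, $still_running);
        if ($still_running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select reported nothing to wait on; back off briefly
        }
    } while ($still_running > 0 && $status === CURLM_OK);

    // Collect the bodies and detach the handles.
    $results = [];
    foreach ($handles as $key => $ch) {
        $results[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
    return $results;
}
```

Calling it with the four `?sleep&t=1` URLs from the test script above should take roughly as long as the slowest request rather than the sum of all of them, provided the server itself handles the requests concurrently.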
Unfortunately I'm no longer using PHP, so I can't test or debug the problem. If it's confirmed that this is broken in PHP 5.5, I'll update the README to warn potential users, but otherwise I'm reliant on the PHP community to help out on this I'm afraid.
Pete, I'm not exactly sure why it's not working anymore, but I think it may have something to do with running the curl_multi loop in startRequest() without expecting it to block. Again, I'm not sure.
I have a derivative of RollingCurlMini that I'm using now as a replacement, and it's working so far. It offers very similar functionality to ParallelCurl, such as "user data", but runs slightly differently (it sends the requests all at once with execute() instead of one at a time like startRequest()). Hopefully this can help out the people who are having problems using ParallelCurl. I'll be adding it to my GitHub in the near future, but for now I'll post the code below.
Edited 4/12/15: The code/class that was here has been turned into RollingCurlX, so I'm removing it to avoid redundancy.
Check it out at: github.com/marcushat/RollingCurlX
Thanks Marcus. If you get a chance to break that out into a very simple GitHub project, I'll be happy to link to it from the README.
Cool, I'd appreciate that. I'll let you know when I put it up.
Is there any fix for it?
Hey Pete, it took me a while, but I just published it. The project is called RollingCurlX. https://github.com/marcushat/rollingcurlx Hopefully it can help a few people out.
Thanks Marcus! I've updated the README with a prominent notice pointing to that project. Hopefully it does help.
Just out of interest, what was causing this? It seems fine on some PHP versions and not on others.
I'm not using PHP any more, so I don't know what the underlying cause of the problem was.
Sorry, that was more aimed at @marcushat.
I'm not too sure. curl_multi is a lot more confusing than it should be, and I don't entirely get what's going on there.
I can send a pull request to fix it if you're interested. It's blocking in a loop when there's nothing to do. It doesn't really even need to check where it is checking.
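To illustrate the general idea of not waiting when there is nothing to do, here is a rough sketch of a non-blocking completion check. It is a guess at the approach, not the proposed patch and not ParallelCurl's actual code. The helper name poll_completed and the $pending bookkeeping array are hypothetical.

```php
<?php
// Hypothetical helper: advance the transfers once and return whatever has
// finished so far, without waiting for the remaining requests.
// $pending maps caller-chosen keys to easy handles already added to $multi.
function poll_completed($multi, array &$pending) {
    // Let curl make progress; this call returns immediately.
    do {
        $status = curl_multi_exec($multi, $still_running);
    } while ($status === CURLM_CALL_MULTI_PERFORM);

    $done = [];
    // Drain the queue of completion messages; false means nothing is ready.
    while (($info = curl_multi_info_read($multi)) !== false) {
        if ($info['msg'] !== CURLMSG_DONE) {
            continue;
        }
        foreach ($pending as $key => $candidate) {
            if ($candidate === $info['handle']) {
                $done[$key] = curl_multi_getcontent($candidate);
                curl_multi_remove_handle($multi, $candidate);
                unset($pending[$key]);
                break;
            }
        }
    }
    return $done;
}
```

A caller would invoke this after adding each request and return to its own work straight away. Only a final "finish all" step would then loop with curl_multi_select() until $pending is empty.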