php-phantomjs
page automation
Hi, can you guide me on how exactly I can do this using PHP and PhantomJS?
http://phantomjs.org/page-automation.html
Hi Sarfroz,
You can do this via custom scripts. I managed to pull it off, but make sure the script is wrapped in [% autoescape false %] ... [% endautoescape %]
so that it can read the URL passed in from the PHP script.
The documentation is here: http://jonnnnyw.github.io/php-phantomjs/4.0/4-custom-scripts/
Example code below:
[% autoescape false %]
var page = require('webpage').create();
var fs = require('fs');
var url = '{{ input.getUrl() }}';
page.open(url, 'GET', '', function (status){
var content = page.content;
var path = '/home/steven/Code/phantomjs/logs/log_script11.txt';
fs.write(path, url, 'w');
fs.write(path, content, 'w+');
phantom.exit(1);
});
phantom.onError = function(msg, trace) { phantom.exit(1); };
[% endautoescape %]
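To illustrate why the autoescape block matters, here is a rough sketch. It assumes the template engine's default escaping is HTML-style; `htmlEscape` is a hypothetical stand-in written for this example, not php-phantomjs code. A URL with a query string would no longer be the same URL after that kind of escaping:

```javascript
// Hypothetical stand-in for HTML-style autoescaping (assumption, not library code).
function htmlEscape(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

// Example URL made up for illustration.
var url = 'https://example.com/search?q=phantomjs&page=2';

// With escaping, the "&" separator becomes "&amp;" and the query string changes.
console.log(htmlEscape(url));

// With escaping turned off, the URL reaches page.open() unchanged.
console.log(url);
```

If that assumption holds, it would explain why the URL only comes through intact inside [% autoescape false %].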
I tried, sir, but it's not working. I am using partial script injection, but no luck. This is my working PhantomJS code:
var page = require('webpage').create();
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36';
page.onInitialized = function() {
  page.evaluate(function() {
    delete window._phantom;
    delete window.callPhantom;
  });
};
page.open('https://xxxxxxx', function(status) {
  if (status !== 'success') {
    console.log('Unable to access network');
  } else {
    var ua = page.evaluate(function() {
      return document.getElementById('iddoc').textContent;
    });
    console.log(ua);
  }
  phantom.exit();
});
If I run it directly with the phantomjs command it works fine, but the problem is that I have to edit the JS code every time I want to change the URL. I hope you can give an example of this method.
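For the plain command-line case (separate from php-phantomjs custom scripts), one possible way to avoid editing the script each time is to read the URL from PhantomJS's `system.args`. This is only a sketch: the helper name `pickUrl`, the `about:blank` fallback and the file name in the usage comment are made up for illustration.

```javascript
// Sketch only. Assumed usage: phantomjs fetch.js https://example.com/

// Pick the first user-supplied argument, or fall back to a default.
// args[0] is the script name, as in PhantomJS's system.args.
function pickUrl(args, fallback) {
  return (args && args.length > 1 && args[1]) ? args[1] : fallback;
}

// Only touch PhantomJS APIs when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var page = require('webpage').create();
  var url = pickUrl(system.args, 'about:blank');

  page.open(url, function (status) {
    if (status !== 'success') {
      console.log('Unable to access network');
    } else {
      // Same element lookup as in the script above.
      console.log(page.evaluate(function () {
        return document.getElementById('iddoc').textContent;
      }));
    }
    phantom.exit();
  });
}
```

When the script is driven from PHP through php-phantomjs, the URL would instead come from the request object via the template, as in the custom script example.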
Hi Sarfroz,
OK, I've tried the script and it works, but here is what needs to be done:
- Do NOT enable debugging. There are known bugs when it is enabled: the script takes a long time and returns no response. See this issue: https://github.com/jonnnnyw/php-phantomjs/issues/74
I think debugging is better done from the terminal, e.g.
phantomjs --debug=true myscript.proc
That way you can catch any problems there first.
- I haven't tried partial scripts, only CUSTOM scripts, and I believe custom scripts are what you plan to use. Partial scripts more or less override parts of the library's built-in script, so you would really need to understand what JonnyW did. I didn't spend much time on this.
- Make sure that your scripts have the right permissions:
chmod 755 testing1.proc
I am running Apache2 on Ubuntu Linux, so I also set:
chown :www-data testing1.proc
- You'll need to be creative when returning data to the calling PHP script. Define and use a response object with a content property in testing1.proc:
var response = {content: null}; // declare a response object
response.content = 'my content here'; // assign the results you want to pass back
console.log(JSON.stringify(response)); // output it as JSON
You can then get the results in the PHP script via: $response->getContent();
Note that if you don't output a valid JSON string, the app doesn't give you the content you want.
- You can create a centralized PhantomJS config file:
`{
/* Same as: --ignore-ssl-errors=true */
"ignoreSslErrors": true,
/* Same as: --max-disk-cache-size=1000 */
"maxDiskCacheSize": 1000,
/* Same as: --output-encoding=utf8 */
"outputEncoding": "utf8",
"cookiesFile": "/home/steven/Code/phantomjs/cookies/cookies.txt"
}`
OK, that said, here's my full PHP caller script:
`<?php
// timer
$start = microtime(true);
use JonnyW\PhantomJs\Client;
use JonnyW\PhantomJs\DependencyInjection\ServiceContainer;
use JonnyW\PhantomJs\Message\Request;
require_once 'vendor/autoload.php';
require_once 'config.php';
error_reporting(E_ALL);
$client = Client::getInstance();
//var_dump($client->getCommand());
$location = '/home/steven/Code/phantomjs/procedures/';
$serviceContainer = ServiceContainer::getInstance();
$procedureLoader = $serviceContainer->get('procedure_loader_factory')->createProcedureLoader($location);
$url = 'https://www.reddit.com/';
/*** the script testing1.proc is located under $location ***/
$fileName = 'testing1';
$client = Client::getInstance();
//$client->getEngine()->debug(true); //Hangs when enabled!!!
$client->getEngine()->addOption('--config=/home/steven/Code/phantomjs/phantomjs-config.json');
$client->getEngine()->addOption("--web-security=no");
$client->getEngine()->addOption('--ssl-protocol=tlsv1');
//$client->getProcedureCompiler()->clearCache();
//$client->getProcedureCompiler()->disableCache(); //enableCache(), clearCache();
$client->setProcedure($fileName);
$client->getProcedureLoader()->addLoader($procedureLoader);
$request = $client->getMessageFactory()->createRequest(); //for custom scripts.
$response = $client->getMessageFactory()->createResponse();
$request->setMethod('GET');
$request->setUrl($url);
try{
$client->send($request, $response);
//echo "\n==== log ==== \n" .$client->getLog() . "\n";
//print_r($response->getConsole()); // Array
print_r($response->getHeaders()); // print_r() prints by itself; echoing its return value added a stray "1"
echo "status = " . $response->getStatus() . "\n";
echo "content = " . $response->getContent() . "\n" ;
} catch(Exception $e){
echo "Error catch\n";
echo $e->getMessage();
var_dump($client->getLog());
//print_r($e->getErrors());
}
/*** timer end ***/
$stop = round(microtime(true) - $start, 5);
echo "time: {$stop}\n";
?>`
Here is the testing1.proc
`[% autoescape false %]
var page = require('webpage').create();
var url = '{{ input.getUrl() }}';
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36';
page.onInitialized = function() {
  page.evaluate(function() {
    delete window._phantom;
    delete window.callPhantom;
  });
};
page.open(url, function(status) {
if (status !== 'success') {
console.log('Unable to access network');
} else {
var ua = page.evaluate(function() {
return document.getElementById('siteTable').innerHTML;
});
//console.log(ua);
var response = {content:null};
response.content = ua;
console.log(JSON.stringify(response));
}
phantom.exit();
});
[% endautoescape %]`
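One thing to watch in the script above: on failure it logs plain text ('Unable to access network') rather than JSON. Since the PHP side only gets content from valid JSON, a small helper could be used in both branches. This is just a sketch; `buildResponse` is a hypothetical name, and only the `content` key comes from the script above.

```javascript
// Sketch: always emit valid JSON so the PHP side can decode it.
function buildResponse(content) {
  var response = { content: null };
  // JSON.stringify drops keys whose value is undefined (e.g. when
  // page.evaluate returns nothing), so coerce undefined to null.
  response.content = (content === undefined) ? null : content;
  return JSON.stringify(response);
}

// Success branch: pass the scraped value.
console.log(buildResponse('my content here'));

// Quotes and newlines in scraped HTML are escaped by JSON.stringify.
console.log(buildResponse('<p class="x">line1\nline2</p>'));

// Failure branch: still valid JSON, with a null content.
console.log(buildResponse(undefined));
```

In testing1.proc this would replace the two console.log calls, so every exit path prints one JSON string.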
Ok, hope this helps.
Cheers
Works like a charm. Thanks a lot for this kind of support :) I only disabled these lines, and it still worked fine:
$client->getEngine()->addOption('--config=/home/steven/Code/phantomjs/phantomjs-config.json');
$client->getEngine()->addOption("--web-security=no");
$client->getEngine()->addOption('--ssl-protocol=tlsv1');
@yipwt79 I ran your PHP script and testing1.proc; result:
Array ( ) 1status = 0 content = string(0) "" time: 2.886
I tried php-phantomjs without enabling debug, but it still freezes on some sites. Any help? I don't use custom scripts, just the default php-phantomjs.