actor-page-analyzer
Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
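To illustrate one of the checks named above, here is a minimal sketch of pulling JSON-LD metadata out of a page. It is not the actor's own code: the actor runs in headless Chrome, whereas this sketch only parses static HTML with Python's standard library. The names `JsonLdExtractor` and `extract_json_ld` are hypothetical and chosen for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Attribute values may be None (e.g. a bare `type`), so guard before lower().
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks; buffer until the end tag.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                # Skip malformed blocks instead of failing the whole page.
                pass


def extract_json_ld(html):
    """Return a list of decoded JSON-LD objects found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items
```

Calling `extract_json_ld(page_source)` on a page's HTML returns one decoded object per JSON-LD block. Metadata injected by JavaScript after load would not be visible to this static approach, which is one reason to analyze the page in a real browser.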
Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6. Commits 7efb22a 1.2.6 ef88b93 security notice for additional prototype pollution issue c2b9819 isConstructorOrProto adapted from PR bc8ecee test from prototype pollution PR See full...
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.14.6 to 1.14.8. Commits 3d81dc3 Release version 1.14.8 of the npm package. 62e546a Drop confidential headers across schemes. 2ede36d Release version 1.14.7 of the npm package. 8b347cb...
I wanted to try this actor directly from the marketplace (https://apify.com/apify/page-analyzer) but ran into an unhandled error on the first try. The log follows. ``` 2021-07-07T18:39:02.477Z ACTOR: Pulling Docker image from repository....
The code of this solution is very outdated and deserves a major rewrite. We can keep the main logic of how data is processed, but handling of proxy, loading...
This error is also encountered; I'm not sure whether it is related or whether these are two separate issues: `================================ 2020-08-16T09:06:37.279Z '0.01s' 'analysisStarted' 2020-08-16T09:06:37.438Z '0.16s' 'scrapping started'...
I get the following error with the latest build: Error: Cannot find module 'puppeteer'. Did you you install the 'puppeteer' package? The 'puppeteer' package is automatically bundled when using apify/actor-node-chrome-*...
https://apify.com/page-analyzer is broken; the UI spins forever. In the network log, this request returns a 404: https://apifier-key-value-store-prod.s3.amazonaws.com/7fMXPPEAuFdaWCvQt/OUTPUT?AWSAccessKeyId=AKIAJTQHBVH6QKNNBOIQ&Expires=1595574947&Signature=TQPjEY%2BVpXeqlC%2FrRAllX2kdQAc%3D
I'm getting started with Apify and scraping, and wanted to use page-analyzer as suggested by https://blog.apify.com/web-scraping-in-2018-forget-html-use-xhrs-metadata-or-javascript-variables-8167f252439c/ . However, the default example doesn't return any output: ``` from apify_client import ApifyClient # Initialize...
Bumps [minimist](https://github.com/minimistjs/minimist) from 1.2.5 to 1.2.8. Changelog Sourced from minimist's changelog. v1.2.8 - 2023-02-09 Merged [Fix] Fix long option followed by single dash [#17](https://github.com/minimistjs/minimist/issues/17) [Tests] Remove duplicate test [#12](https://github.com/minimistjs/minimist/issues/12) [Fix]...