wappalyzer
The CMS conundrum
Is your feature request related to a problem? Please describe.
I'd like to add CMS as a category to pretty much all the technologies in the ecommerce technology section, with a few exceptions. I see this idea has been rejected before, so I wanted to discuss it here before creating a pull request.
We'd like to use this data on httparchive.org to create a ranking of the most used CMSs, but we can't do that without including systems like Shopify.
Describe the solution you'd like
Add CMS as a category to systems like Magento, Shopify, OpenCart.
Describe alternatives you've considered
I've considered suggesting an overarching category, but that feels like overkill.
Speaking as someone working for a vendor in this space, I'd advocate for not conflating the two. While Shopify, and some other platforms, have light CMS functionality, others do not (e.g. commercetools).
This is also why you see some sites implement both a CMS and an eCommerce platform (think WordPress and WooCommerce, but also Contentful and commercetools).
Which is why we should add CMS as a category to the systems that do have that functionality.
I'm on the fence on this one. I prefer using only the primary category, as it doesn't seem useful to list many ecommerce platforms twice in the browser extension and have the CMS page dominated by ecommerce platforms. Maybe we could track CMS functionality using a new property (like we do with saas and oss), i.e. "cms": true. It would only really serve the HTTP Archive though.
I'm working on the CMS chapter in this year's edition of the Web Almanac for HTTP Archive. I echo @jdevalk's sentiment and would love to see a more practical way to look at CMS data across the web.
For some perspective on my end, WordPress and Shopify are often compared as CMSs, with WordPress being compared primarily via WooCommerce, a popular plugin for WordPress. While I understand that Shopify is considered an ecommerce platform first, for end users if they're running on Shopify (and deciding between Shopify and WordPress) they're likely thinking about it as a CMS and for those interested in broader trends, WordPress and Shopify often warrant direct comparison as CMSs.
On the other hand, comparing WordPress vs. WooCommerce doesn't make much sense. Adding ecommerce platforms to the CMS category can be useful to compare CMS market share but not for suggesting alternatives, e.g. someone looking for a WordPress alternative would not pick Shopify.
What do you think of my suggestion to add a cms feature flag? The HTTP Archive could use this flag to add the relevant ecommerce platforms to the CMS category upon import.
I think that's an improvement, so I'll take it :) @rviscomi could we use that easily?
Our pipeline is only set up to consume the technology, category, and info fields, so not very easily. @pmeenan WDYT?
@rviscomi @jdevalk if you can propose how you'd like it to show up in the resulting JSON, I'm happy to add it. Right now the detections are lists and each element in the list is a detected technology (the list itself is either flat globally or within the categories).
We could have the features as top-level groups with the technologies as a list within each which would work for true/false kind of features but not more freeform.
If you want attributes for each detected technology then the list itself would need to change and support each technology being a dictionary of attributes or something like that which may end up being harder to query.
It could also be done at query time if you want to build ad-hoc groupings of technologies instead of relying on the pre-defined categories.
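The top-level feature grouping mentioned above could look roughly like the following. This is only a sketch of the idea, not the WebPageTest implementation: the function name `group_by_feature`, the choice of flags, and the flag values on each sample technology are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_feature(technologies, features=("cms", "saas", "oss")):
    """Return {feature: [technology names]} for boolean flags set to true."""
    groups = defaultdict(list)
    for name, definition in technologies.items():
        for feature in features:
            # Only strict true/false flags fit this shape; freeform
            # attribute values would need a different structure.
            if definition.get(feature) is True:
                groups[feature].append(name)
    return dict(groups)


# Hypothetical technology definitions; the flag values are illustrative only.
technologies = {
    "Shopify": {"cats": [6], "cms": True, "saas": True},
    "WordPress": {"cats": [1, 11], "oss": True},
    "Nginx": {"cats": [22, 64], "oss": True},
}

feature_groups = group_by_feature(technologies)
```

Each feature becomes a key with a flat list of technology names, which is why this shape suits boolean features but not freeform attributes.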
@pmeenan Perhaps the simplest way is to look for the "cms": true property in the JSON object and then add that technology to the CMS category on your end.
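A minimal sketch of that import-side translation, assuming the importer already has the category names for a technology alongside its definition. The helper `categories_on_import` is hypothetical, and the "cms" property is the one proposed in this thread, not an existing field.

```python
CMS_CATEGORY = "CMS"


def categories_on_import(category_names, definition):
    """Return the category names for a technology, adding CMS when flagged."""
    names = list(category_names)
    # "cms" is the proposed flag; technologies without it are left untouched.
    if definition.get("cms") is True and CMS_CATEGORY not in names:
        names.append(CMS_CATEGORY)
    return names


# Shopify carries the proposed flag, WordPress is already in the CMS category.
shopify = categories_on_import(["Ecommerce"], {"cats": [6], "cms": True})
wordpress = categories_on_import(["CMS", "Blogs"], {"cats": [1, 11]})
```

The check for an existing CMS entry keeps a technology that is already categorised as a CMS from being listed twice.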
That could work, though adding a layer of translation for extra categories feels strange.
Currently it looks like this:
"_detected": {
  "CMS": "WordPress",
  "Blogs": "WordPress",
  "Databases": "MySQL",
  "Programming languages": "PHP",
  "Operating systems": "Ubuntu",
  "WordPress plugins": "Imagely NextGEN Gallery",
  "Photo galleries": "Imagely NextGEN Gallery",
  "Web servers": "Nginx 1.18.0",
  "Reverse proxies": "Nginx 1.18.0",
  "Font scripts": "Twitter Emoji (Twemoji),Google Font API,Font Awesome",
  "JavaScript libraries": "Slick,Lightbox,jQuery Migrate 3.3.2,jQuery 3.6.0"
},
"_detected_apps": {
  "WordPress": "",
  "MySQL": "",
  "PHP": "",
  "Ubuntu": "",
  "Imagely NextGEN Gallery": "",
  "Nginx": "1.18.0",
  "Twitter Emoji (Twemoji)": "",
  "Slick": "",
  "Lightbox": "",
  "jQuery Migrate": "3.3.2",
  "jQuery": "3.6.0",
  "Google Font API": "",
  "Font Awesome": ""
},
I can add entries that have all of the raw details, it is just harder to query once it gets to BQ:
"_detected": {
"CMS": "WordPress",
"Blogs": "WordPress",
"Databases": "MySQL",
"Programming languages": "PHP",
"Operating systems": "Ubuntu",
"WordPress plugins": "Imagely NextGEN Gallery",
"Photo galleries": "Imagely NextGEN Gallery",
"Web servers": "Nginx 1.18.0",
"Reverse proxies": "Nginx 1.18.0",
"Font scripts": "Twitter Emoji (Twemoji),Google Font API,Font Awesome",
"JavaScript libraries": "Slick,Lightbox,jQuery Migrate 3.3.2,jQuery 3.6.0"
},
"_detected_apps": {
"WordPress": "",
"MySQL": "",
"PHP": "",
"Ubuntu": "",
"Imagely NextGEN Gallery": "",
"Nginx": "1.18.0",
"Twitter Emoji (Twemoji)": "",
"Slick": "",
"Lightbox": "",
"jQuery Migrate": "3.3.2",
"jQuery": "3.6.0",
"Google Font API": "",
"Font Awesome": ""
},
"_detected_technologies": {
"WordPress": {
"name": "WordPress",
"slug": "wordpress",
"categories": [
{
"id": 1,
"slug": "cms",
"groups": [
3
],
"name": "CMS",
"priority": 1
},
{
"id": 11,
"slug": "blogs",
"groups": [
3
],
"name": "Blogs",
"priority": 1
}
],
"confidence": 100,
"version": "",
"icon": "WordPress.svg",
"website": "https://wordpress.org",
"pricing": [
"low",
"recurring",
"freemium"
],
"cpe": "cpe:/a:wordpress:wordpress"
},
"MySQL": {
"name": "MySQL",
"slug": "mysql",
"categories": [
{
"id": 34,
"slug": "databases",
"groups": [
7
],
"name": "Databases",
"priority": 5
}
],
"confidence": 100,
"version": "",
"icon": "MySQL.svg",
"website": "http://mysql.com",
"pricing": [],
"cpe": "cpe:/a:mysql:mysql"
},
"PHP": {
"name": "PHP",
"slug": "php",
"categories": [
{
"id": 27,
"slug": "programming-languages",
"groups": [
9
],
"name": "Programming languages",
"priority": 5
}
],
"confidence": 100,
"version": "",
"icon": "PHP.svg",
"website": "http://php.net",
"pricing": [],
"cpe": "cpe:/a:php:php"
},
"Ubuntu": {
"name": "Ubuntu",
"slug": "ubuntu",
"categories": [
{
"id": 28,
"slug": "operating-systems",
"groups": [
7
],
"name": "Operating systems",
"priority": 6
}
],
"confidence": 100,
"version": "",
"icon": "Ubuntu.png",
"website": "http://www.ubuntu.com/server",
"pricing": [],
"cpe": null
},
"Imagely NextGEN Gallery": {
"name": "Imagely NextGEN Gallery",
"slug": "imagely-nextgen-gallery",
"categories": [
{
"id": 87,
"slug": "wordpress-plugins",
"groups": [
15
],
"name": "WordPress plugins",
"priority": 8
},
{
"id": 7,
"slug": "photo-galleries",
"groups": [
3,
10
],
"name": "Photo galleries",
"priority": 1
}
],
"confidence": 100,
"version": "",
"icon": "Imagely.png",
"website": "https://www.imagely.com/wordpress-gallery-plugin",
"pricing": [
"freemium",
"low",
"recurring",
"onetime"
],
"cpe": null
},
"Nginx": {
"name": "Nginx",
"slug": "nginx",
"categories": [
{
"id": 22,
"slug": "web-servers",
"groups": [
7
],
"name": "Web servers",
"priority": 8
},
{
"id": 64,
"slug": "reverse-proxies",
"groups": [
7
],
"name": "Reverse proxies",
"priority": 7
}
],
"confidence": 100,
"version": "1.18.0",
"icon": "Nginx.svg",
"website": "http://nginx.org/en",
"pricing": [],
"cpe": "cpe:/a:nginx:nginx"
},
"Twitter Emoji (Twemoji)": {
"name": "Twitter Emoji (Twemoji)",
"slug": "twitter-emoji-twemoji",
"categories": [
{
"id": 17,
"slug": "font-scripts",
"groups": [
9
],
"name": "Font scripts",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "default.svg",
"website": "https://twitter.github.io/twemoji/",
"pricing": [],
"cpe": null
},
"Slick": {
"name": "Slick",
"slug": "slick",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "default.svg",
"website": "https://kenwheeler.github.io/slick",
"pricing": [],
"cpe": null
},
"Lightbox": {
"name": "Lightbox",
"slug": "lightbox",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "Lightbox.png",
"website": "http://lokeshdhakar.com/projects/lightbox2/",
"pricing": [],
"cpe": "cpe:/a:lightbox_photo_gallery_project:lightbox_photo_gallery"
},
"jQuery Migrate": {
"name": "jQuery Migrate",
"slug": "jquery-migrate",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "3.3.2",
"icon": "jQuery.svg",
"website": "https://github.com/jquery/jquery-migrate",
"pricing": [],
"cpe": null
},
"jQuery": {
"name": "jQuery",
"slug": "jquery",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "3.6.0",
"icon": "jQuery.svg",
"website": "https://jquery.com",
"pricing": [],
"cpe": "cpe:/a:jquery:jquery"
},
"Google Font API": {
"name": "Google Font API",
"slug": "google-font-api",
"categories": [
{
"id": 17,
"slug": "font-scripts",
"groups": [
9
],
"name": "Font scripts",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "Google Font API.png",
"website": "http://google.com/fonts",
"pricing": [],
"cpe": null
},
"Font Awesome": {
"name": "Font Awesome",
"slug": "font-awesome",
"categories": [
{
"id": 17,
"slug": "font-scripts",
"groups": [
9
],
"name": "Font scripts",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "font-awesome.svg",
"website": "https://fontawesome.com/",
"pricing": [
"low",
"freemium",
"recurring"
],
"cpe": null
}
},
"_detected_raw": [
{
"name": "WordPress",
"slug": "wordpress",
"categories": [
{
"id": 1,
"slug": "cms",
"groups": [
3
],
"name": "CMS",
"priority": 1
},
{
"id": 11,
"slug": "blogs",
"groups": [
3
],
"name": "Blogs",
"priority": 1
}
],
"confidence": 100,
"version": "",
"icon": "WordPress.svg",
"website": "https://wordpress.org",
"pricing": [
"low",
"recurring",
"freemium"
],
"cpe": "cpe:/a:wordpress:wordpress"
},
{
"name": "MySQL",
"slug": "mysql",
"categories": [
{
"id": 34,
"slug": "databases",
"groups": [
7
],
"name": "Databases",
"priority": 5
}
],
"confidence": 100,
"version": "",
"icon": "MySQL.svg",
"website": "http://mysql.com",
"pricing": [],
"cpe": "cpe:/a:mysql:mysql"
},
{
"name": "PHP",
"slug": "php",
"categories": [
{
"id": 27,
"slug": "programming-languages",
"groups": [
9
],
"name": "Programming languages",
"priority": 5
}
],
"confidence": 100,
"version": "",
"icon": "PHP.svg",
"website": "http://php.net",
"pricing": [],
"cpe": "cpe:/a:php:php"
},
{
"name": "Ubuntu",
"slug": "ubuntu",
"categories": [
{
"id": 28,
"slug": "operating-systems",
"groups": [
7
],
"name": "Operating systems",
"priority": 6
}
],
"confidence": 100,
"version": "",
"icon": "Ubuntu.png",
"website": "http://www.ubuntu.com/server",
"pricing": [],
"cpe": null
},
{
"name": "Imagely NextGEN Gallery",
"slug": "imagely-nextgen-gallery",
"categories": [
{
"id": 87,
"slug": "wordpress-plugins",
"groups": [
15
],
"name": "WordPress plugins",
"priority": 8
},
{
"id": 7,
"slug": "photo-galleries",
"groups": [
3,
10
],
"name": "Photo galleries",
"priority": 1
}
],
"confidence": 100,
"version": "",
"icon": "Imagely.png",
"website": "https://www.imagely.com/wordpress-gallery-plugin",
"pricing": [
"freemium",
"low",
"recurring",
"onetime"
],
"cpe": null
},
{
"name": "Nginx",
"slug": "nginx",
"categories": [
{
"id": 22,
"slug": "web-servers",
"groups": [
7
],
"name": "Web servers",
"priority": 8
},
{
"id": 64,
"slug": "reverse-proxies",
"groups": [
7
],
"name": "Reverse proxies",
"priority": 7
}
],
"confidence": 100,
"version": "1.18.0",
"icon": "Nginx.svg",
"website": "http://nginx.org/en",
"pricing": [],
"cpe": "cpe:/a:nginx:nginx"
},
{
"name": "Twitter Emoji (Twemoji)",
"slug": "twitter-emoji-twemoji",
"categories": [
{
"id": 17,
"slug": "font-scripts",
"groups": [
9
],
"name": "Font scripts",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "default.svg",
"website": "https://twitter.github.io/twemoji/",
"pricing": [],
"cpe": null
},
{
"name": "Slick",
"slug": "slick",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "default.svg",
"website": "https://kenwheeler.github.io/slick",
"pricing": [],
"cpe": null
},
{
"name": "Lightbox",
"slug": "lightbox",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "Lightbox.png",
"website": "http://lokeshdhakar.com/projects/lightbox2/",
"pricing": [],
"cpe": "cpe:/a:lightbox_photo_gallery_project:lightbox_photo_gallery"
},
{
"name": "jQuery Migrate",
"slug": "jquery-migrate",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "3.3.2",
"icon": "jQuery.svg",
"website": "https://github.com/jquery/jquery-migrate",
"pricing": [],
"cpe": null
},
{
"name": "jQuery",
"slug": "jquery",
"categories": [
{
"id": 59,
"slug": "javascript-libraries",
"groups": [
9
],
"name": "JavaScript libraries",
"priority": 9
}
],
"confidence": 100,
"version": "3.6.0",
"icon": "jQuery.svg",
"website": "https://jquery.com",
"pricing": [],
"cpe": "cpe:/a:jquery:jquery"
},
{
"name": "Google Font API",
"slug": "google-font-api",
"categories": [
{
"id": 17,
"slug": "font-scripts",
"groups": [
9
],
"name": "Font scripts",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "Google Font API.png",
"website": "http://google.com/fonts",
"pricing": [],
"cpe": null
},
{
"name": "Font Awesome",
"slug": "font-awesome",
"categories": [
{
"id": 17,
"slug": "font-scripts",
"groups": [
9
],
"name": "Font scripts",
"priority": 9
}
],
"confidence": 100,
"version": "",
"icon": "font-awesome.svg",
"website": "https://fontawesome.com/",
"pricing": [
"low",
"freemium",
"recurring"
],
"cpe": null
}
],
I have 2 separate entries because in theory the raw resolved output can have duplicates of technologies and the dictionary version would only have the latest match.
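To illustrate why the list and the dictionary can differ, here is a small sketch of collapsing a raw detection list into a name-keyed dictionary. The function name `to_dictionary` and the duplicate jQuery entry are made up for the example; they are not taken from real WebPageTest output.

```python
def to_dictionary(raw_detections):
    """Key detections by technology name; a later match overwrites an earlier one."""
    by_name = {}
    for detection in raw_detections:
        by_name[detection["name"]] = detection
    return by_name


raw = [
    {"name": "jQuery", "version": "3.6.0", "confidence": 100},
    {"name": "Nginx", "version": "1.18.0", "confidence": 100},
    # Hypothetical second match for the same technology.
    {"name": "jQuery", "version": "", "confidence": 50},
]

collapsed = to_dictionary(raw)
```

The raw list keeps all three matches, while the dictionary ends up with two keys and only the last jQuery entry.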
@pmeenan Why not map "cms": true to the CMS category like below? This would allow the HTTP Archive to have a slightly different categorisation for CMS-like technologies.
"Shopify": {
  "cats": [ 6 ], // Ecommerce
  "cms": true
}

"_detected": {
  "Ecommerce": "Shopify",
  "CMS": "Shopify"
},
"_detected_apps": {
  "Shopify": ""
}
Now that we're reporting the full technology details for all of the detections, groupings could be done arbitrarily at query time or in the reporting pipeline. If "cms":true is added to the technology definition then it will pass through to the detected output and it can be queried directly.
I'd like to keep the WebPageTest output itself clean and true to the underlying Wappalyzer data.
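A rough sketch of what such a query-time grouping could look like over the "_detected_technologies" dictionary, leaving the WebPageTest output untouched. The function name `cms_like`, the "ecommerce" slug, and the presence of a "cms" key in the output are assumptions; the flag only exists if it is added to the technology definitions as proposed.

```python
def cms_like(detected_technologies):
    """Return sorted names of technologies in the CMS category or flagged "cms": true."""
    names = []
    for name, tech in detected_technologies.items():
        slugs = {category["slug"] for category in tech.get("categories", [])}
        if "cms" in slugs or tech.get("cms") is True:
            names.append(name)
    return sorted(names)


# Trimmed-down, partly hypothetical detections for illustration.
page = {
    "WordPress": {"categories": [{"id": 1, "slug": "cms", "name": "CMS"}]},
    "Shopify": {
        "categories": [{"id": 6, "slug": "ecommerce", "name": "Ecommerce"}],
        "cms": True,  # proposed flag, assumed to pass through to the output
    },
    "MySQL": {"categories": [{"id": 34, "slug": "databases", "name": "Databases"}]},
}

result = cms_like(page)
```

The grouping lives entirely in the consumer, so the detections stay true to the underlying Wappalyzer data.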
While I understand that Shopify is considered an ecommerce platform first, for end users if they're running on Shopify (and deciding between Shopify and WordPress) they're likely thinking about it as a CMS
Obviously I don't know who or what group(s) you're thinking of, but no one I know puts Shopify in a CMS list. In my circles it seems pretty straightforward:
eCommerce => Shopify
CMS => WordPress
(Of course there are others, but most people I know would put those at the top of each category. That doesn't mean there can't be overlap, but they just wouldn't be seen that way.)
For my 2 :coin:
This issue is stale because it has been open for 90 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.