owid-grapher
Derive grapher.originUrl from the first listed topic
https://user-images.githubusercontent.com/13406362/169551419-ee303f1c-055b-4ae2-b945-8fd5936635a5.mp4
Follow-up to #1070, which introduced the infrastructure to merge WordPress categories and grapher tags under a single categorization system: parent <-> child relationships between charts and topics, and between topics themselves.
To continue the simplification effort, this PR adds some synchronisation between the Origin URL and the topics, based on the assumption that the Origin URL is often a link to a topic page (previously called an entry).
The comment below lists the unique Origin URLs that couldn't be resolved to a topic (some are posts, some are invalid, and others surface possible categorization issues).
Out of the 2292 charts with an origin URL, 298 couldn't get matched to a topic from their origin URL.
This means that it is possible to derive a topic for 87% of the charts with an origin URL.
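For illustration only, here is a minimal sketch of how a topic could be derived from an origin URL. It is not the code in this PR: `TopicsBySlug`, `originUrlToSlug`, `deriveTopicId`, the example topic ids and the `https://ourworldindata.org` base are all assumptions made for the example. It also reproduces the 87% figure from the counts above.

```typescript
// Hypothetical lookup table: topic page slug -> topic id.
type TopicsBySlug = Map<string, number>

// Return the first path segment of an origin URL. Relative paths
// ("/fertility") and protocol-relative URLs ("//host/fertility") are
// resolved against an assumed base, which is never used for matching.
function originUrlToSlug(originUrl: string): string | undefined {
    const trimmed = originUrl.trim()
    if (!trimmed) return undefined
    let pathname: string
    try {
        pathname = new URL(trimmed, "https://ourworldindata.org").pathname
    } catch {
        return undefined
    }
    const [slug] = pathname.split("/").filter(Boolean)
    return slug
}

// A chart gets a topic only when the slug matches a known topic page.
function deriveTopicId(
    originUrl: string,
    topics: TopicsBySlug
): number | undefined {
    const slug = originUrlToSlug(originUrl)
    return slug === undefined ? undefined : topics.get(slug)
}

// Made-up topic ids, for the example only.
const exampleTopics: TopicsBySlug = new Map([
    ["fertility", 1],
    ["hygiene", 2],
])

console.log(deriveTopicId("http://localhost:3030/fertility", exampleTopics))
console.log(deriveTopicId("ourworldinidata.org/hygiene", exampleTopics))

// Share of charts with an origin URL for which a topic was derived.
const chartsWithOriginUrl = 2292
const chartsWithoutTopic = 298
const matchedPercent = Math.round(
    ((chartsWithOriginUrl - chartsWithoutTopic) / chartsWithOriginUrl) * 100
)
console.log(`${matchedPercent}%`)
```

In this sketch a scheme-less, misspelled domain such as `ourworldinidata.org/hygiene` is treated as a relative path, so its first segment is the domain itself and no topic is found, which is consistent with such entries showing up in the unresolved list.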
- [x] Mock WP API in tests
Next steps
- deploy faceted Algolia demo on an unlisted URL (with topics filled from the migration but read-only, and Algolia indexing turned on and running every week).
- evaluate whether grapher tags should be used as a source for initial topics settings, in combination with origin URL. This would mean reversing the migration to clear `config.topicsId`, and reapplying a new one with an updated topics matching algorithm.
Migration log
https://tinyco.re URLs have been omitted from the list but not from the total count.
{
'http://localhost:3030/health-meta',
'/incomes-across-distribution',
'/what-are-ppps',
'http://localhost:3030/air-pollution-post',
'http://localhost:3030/fertility',
'http://localhost:3030/bonheur-et-satisfaction',
'http://localhost:3030/data-debunking-trumps-paris-agreement-claims',
'/fish-and-overfishing',
'http://localhost:3030/chinese-turbulence-how-periods-of-political-reform-affect-the-carbon-intensity-of-economies',
'http://localhost:3030/structural-transformation-and-deindustrialization-evidence-from-todays-rich-countries',
'http://localhost:3030/',
'http://localhost:3030/global-renewables-are-growing-but-are-only-managing-to-offset-a-decline-in-nuclear-production',
'http://localhost:3030/the-link-between-life-expectancy-and-health-spending-us-focus',
'http://localhost:3030/london-air-pollution',
'/safest-sources-of-energy',
'http://localhost:3030/women-in-the-labor-force-determinants',
'/selection-of-gh-indicators',
'http://localhost:3030/world-after-capital-data-and-viz',
'http://localhost:3030/how-many-deaths-make-a-natural-disaster-newsworthy',
'http://localhost:3030/what-was-the-death-toll-from-chernobyl-and-fukushima',
'http://localhost:3030/female-labor-force-participation-key-facts',
'//localhost:3030/crop-yields',
'http://localhost:3030/growth-and-structural-transformation-are-emerging-economies-industrializing-too-quickly',
'http://localhost:3030/agricultural-land-by-global-diets',
'http://localhost:3030/yields-vs-land-use-how-has-the-world-produced-enough-food-for-a-growing-population',
'http://localhost:3030/world-population-future-eductation-now',
'http://localhost:3030/feed',
'http://localhost:3030/women-in-the-labor-force',
'http://localhost:3030/how-many-people-does-synthetic-fertilizer-feed',
'http://localhost:3030/genuine-savings-other-measures-of-savinginvestment',
'http://localhost:3030/antibiotic-resistance-from-livestock',
'https://sdg-tracker.org/',
'http://localhost:3030/mispy/sdgs/gender-equality',
'To allow comparisons between countries and over time this metric is age-standardized.',
'http://localhost:3030/how-and-why-econ-complexity',
'https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3073059',
'http://localhost:3030/millennium-development-goals',
'http://localhost:3030/why-do-women-live-longer-than-men',
'http://localhost:3030/faq-on-plastics',
'http://localhost:3030/democracy-health',
'http://localhost:3030/mammals',
'ourworldndata.org/biodiversity',
'http://localhost:3030/heart-disease',
'http://localhost:3030/suicides',
'http://localhost:3030/ethnographic-and-archaeological-evidence-on-violent-deaths',
'http://localhost:3030/food-waste',
'http://localhost:3030/co2-and-other-greenhouse-gas-emissionshttps://ourworldindata.org/co2-and-other-greenhouse-gas-emissions',
'http://localhost:3030/covid-models',
'http://localhost:3030/social-spending',
'ourworldinata.org/transport',
'ourworldindata.orrg/vaccination',
'ourworldinidata.org/hygiene',
'http://localhost:3030/covid-sweden-death-reporting',
'http://localhost:3030/palm-oil',
'http://localhost:3030/soil-lifespans',
'http://localhost:3030/us-states-vaccinations',
'http://localhost:3030/covid-vaccinations',
'ourworlddindata.org/co2-and-other-greenhouse-gas-emissions',
'http://localhost:3030/bioodiversity',
'http://localhost:3030/electricity-mix',
'http://localhost:3030/energy-mix',
'http://localhost:3030/habitat-loss',
'http://localhost:3030/hygiene',
'ourworldidata.org/fertilizers',
'http://localhost:3030/primary-secondary-education',
'/explorers/climate-change',
'ourwworldindata.org/land-use',
'http://localhost:3030/natural-resources'
}
A topic could not be derived from the origin URL for 298/2292 charts
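The unresolved entries above fall into a few recognisable shapes (relative paths, site URLs pointing at posts, external links, scheme-less and often misspelled domains, and free text). A rough sketch of how they could be bucketed for review follows; `OriginUrlKind`, `SITE_HOSTS`, `classifyOriginUrl` and `countByKind` are hypothetical names, and the list of site hosts is an assumption based on the log.

```typescript
type OriginUrlKind =
    | "relative-path"
    | "site-url"
    | "external-url"
    | "schemeless"
    | "free-text"

// Assumption: hosts that count as "our own site" in the migration log.
const SITE_HOSTS = new Set(["localhost:3030", "ourworldindata.org"])

function classifyOriginUrl(originUrl: string): OriginUrlKind {
    const value = originUrl.trim()
    // Anything empty or containing whitespace is treated as stray text.
    if (value === "" || /\s/.test(value)) return "free-text"
    if (value.startsWith("/") && !value.startsWith("//")) return "relative-path"
    if (/^(https?:)?\/\//i.test(value)) {
        // Give protocol-relative URLs a scheme so they can be parsed.
        const absolute = value.startsWith("//") ? `https:${value}` : value
        try {
            return SITE_HOSTS.has(new URL(absolute).host)
                ? "site-url"
                : "external-url"
        } catch {
            return "free-text"
        }
    }
    // No scheme and no leading slash, e.g. a bare (possibly misspelled) domain.
    return "schemeless"
}

function countByKind(originUrls: string[]): Map<OriginUrlKind, number> {
    const counts = new Map<OriginUrlKind, number>()
    for (const originUrl of originUrls) {
        const kind = classifyOriginUrl(originUrl)
        counts.set(kind, (counts.get(kind) ?? 0) + 1)
    }
    return counts
}

// A few entries taken from the list above.
console.log(
    countByKind([
        "/what-are-ppps",
        "http://localhost:3030/feed",
        "https://sdg-tracker.org/",
        "ourworldinata.org/transport",
        "To allow comparisons between countries and over time this metric is age-standardized.",
    ])
)
```

Bucketing like this would only help triage: entries classified as `site-url` still need a manual check to tell posts apart from topics with categorization issues.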
Back to draft, some more thinking required.
One year anniversary! Closing 🥲