almanac.httparchive.org

Update the sample data

Open mgifford opened this issue 1 year ago • 1 comments

The methodology page (https://almanac.httparchive.org/en/2022/methodology#dataset) uses this simple example:

#standardSQL
# Sum of JS request bytes per page (2022)
SELECT
  percentile,
  _TABLE_SUFFIX AS client,
  APPROX_QUANTILES(bytesJs / 1024, 1000)[OFFSET(percentile * 10)] AS js_kilobytes
FROM
  `httparchive.summary_pages.2022_06_01_*`,
  UNNEST([10, 25, 50, 75, 90, 100]) AS percentile
GROUP BY
  percentile,
  client
ORDER BY
  client,
  percentile

But in 2024 it doesn't seem that we're still using httparchive.summary_pages, so which dataset do we want to use in the sample data to illustrate the methodology?

https://github.com/HTTPArchive/almanac.httparchive.org/blob/4504893f25b547467e84e13e8da8d5686307d482/src/templates/en/2022/methodology.html

Also, it looks like we need to create all the 2024 pages.

mgifford avatar Aug 28 '24 02:08 mgifford

Yeah, seems the methodology page was last updated in 2022.

Starting from 2023 we are using the all dataset, mainly the all.pages and all.requests tables in HTTP Archive. It would be cool if you could create a PR with the suggested updates for this page.
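
For reference, the 2022 sample query could be ported to all.pages roughly as follows. This is only a sketch under assumptions, not necessarily the query the methodology page should use: it assumes all.pages exposes date, client and is_root_page columns plus a summary column holding JSON with a bytesJS key, and it picks 2024-06-01 as the crawl date to mirror the 2022 example. Please check those names against the actual table schema before using it.

```sql
#standardSQL
# Sum of JS request bytes per page (2024)
# Sketch only: the column names and the bytesJS key are assumptions, see above.
SELECT
  percentile,
  client,
  # 1000 buckets, so OFFSET(percentile * 10) selects the requested percentile
  APPROX_QUANTILES(CAST(JSON_VALUE(summary, '$.bytesJS') AS INT64) / 1024, 1000)[OFFSET(percentile * 10)] AS js_kilobytes
FROM
  `httparchive.all.pages`,
  UNNEST([10, 25, 50, 75, 90, 100]) AS percentile
WHERE
  # A single crawl date replaces the old 2022_06_01_* table wildcard
  date = '2024-06-01' AND
  # Assumed flag restricting the results to home pages
  is_root_page
GROUP BY
  percentile,
  client
ORDER BY
  client,
  percentile
```

The main differences from the 2022 version are that client is a regular column rather than a table suffix, and the crawl is selected with a date filter instead of a wildcard table name.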

P.S. For the pages to be created, the editors need to complete the markdown PRs (example).

max-ostapenko avatar Oct 01 '24 18:10 max-ostapenko

Fixed in #3863

tunetheweb avatar Nov 12 '24 13:11 tunetheweb