Add information about local cache performance

Open bmaurer opened this issue 8 years ago • 6 comments

Data collected by Facebook here:

https://groups.google.com/a/chromium.org/d/msg/loading-dev/QUaDWQKPvZ0/BwqkHfkxBQAJ

The data suggests that the performance of the local cache is a substantial contributor to overall page load time. Resource Timing should explicitly document cache performance (e.g., disk access time for hits and misses) so that more specific and actionable measurements can be made.

bmaurer, Nov 09 '16 03:11

While I agree that this data would be fascinating, I suspect that its privacy-breaching aspects may not be well received. This is not my area of expertise, but it seems to me that this would significantly increase the ability to fingerprint users.

yoavweiss, Nov 09 '16 15:11

Ben, can you enumerate more specifically what kind of data you're looking for and the motivation for exposing it? I.e., enumerate the use cases. I agree with Yoav that this is a sensitive area, but before we say "yay or nay", let's understand what we're trying to solve.

igrigorik, Nov 09 '16 15:11

At the most basic level, I'd like to get timing information from the local cache. For example, if a request was a cache miss, how long did it take to determine that the request was a miss? Measuring this will help us attribute performance problems to local disk performance and help us work with browser vendors.

It could also be useful to add metadata about the cache entry itself. A very basic field would be "time at which the entry was cached". I don't think this is privacy sensitive, since it could already be determined if the origin injected a timestamp into the file, or by using CORS to read the Date header.

bmaurer, Nov 14 '16 19:11

> At the most basic level, I'd like to get timing information from the local cache. For example, if a request was a cache miss, how long did it take to determine that the request was a miss? Measuring this will help us attribute performance problems to local disk performance and help us work with browser vendors.

This seems very architecture- and implementation-specific. For example, one browser could maintain an index in memory and determine very quickly that a particular request will be a miss, whereas another browser might have to page in some data to answer that query, etc. I don't think we'd want to expose that level of detail; this seems like a noise generator, as vendors are free to change how the cache is implemented -- e.g. we've already done so a few times in Chrome. FWIW, I understand where you're coming from here and why you want it, but for the purpose of JS-visible RUM APIs, I think looking at aggregate data across many sessions is the right approach here; once you detect an anomaly in the aggregate data, ping the browser in question to investigate deeper.

> It could also be useful to add metadata about the cache entry itself. A very basic field would be "time at which the entry was cached". I don't think this is privacy sensitive, since it could already be determined if the origin injected a timestamp into the file, or by using CORS to read the Date header.

Well, it is privacy sensitive, but I agree: this is possible to obtain today if you control the resource, or if the origin gives you the right CORS bits to read its headers. What would you do with this data?

igrigorik, Nov 14 '16 22:11

> This seems very architecture- and implementation-specific. For example, one browser could maintain an index in memory and determine very quickly that a particular request will be a miss, whereas another browser might have to page in some data to answer that query, etc. I don't think we'd want to expose that level of detail; this seems like a noise generator, as vendors are free to change how the cache is implemented -- e.g. we've already done so a few times in Chrome. FWIW, I understand where you're coming from here and why you want it, but for the purpose of JS-visible RUM APIs, I think looking at aggregate data across many sessions is the right approach here; once you detect an anomaly in the aggregate data, ping the browser in question to investigate deeper.

Fundamentally, though, one still has to check a cache before the request goes out. This check has to happen before requestStart (because you must determine whether you are making a request at all). There are other browser-specific issues (e.g., Chrome puts cache writes on the critical path of getting a response to the renderer), but those are too specific to standardize.

> Well, it is privacy sensitive, but I agree: this is possible to obtain today if you control the resource, or if the origin gives you the right CORS bits to read its headers. What would you do with this data?

This can help answer questions like "what would the impact be if I start changing resources more frequently" or "are people using browsers that tend to have a long cache history".

bmaurer, Nov 14 '16 22:11

> Fundamentally, though, one still has to check a cache before the request goes out. This check has to happen before requestStart (because you must determine whether you are making a request at all). There are other browser-specific issues (e.g., Chrome puts cache writes on the critical path of getting a response to the renderer), but those are too specific to standardize.

/me nods. And that time is already captured between startTime ~> requestStart ~> responseStart, right? The only special case today is the "served from memory cache" scenario, which is unspecified, but hopefully will be specified once we explain it in Fetch.
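As a rough illustration of the startTime ~> requestStart ~> responseStart split mentioned above, here is a minimal TypeScript sketch. The names `TimingFields`, `RequestSplit`, and `splitAroundRequestStart` are hypothetical and not part of any API; only the three timestamp fields come from Resource Timing. It assumes that entries whose detailed timestamps are reported as zero (for example, cross-origin resources without Timing-Allow-Origin) cannot be split and should be skipped.

```typescript
// Subset of PerformanceResourceTiming fields used by this sketch (milliseconds).
interface TimingFields {
  startTime: number;
  requestStart: number;
  responseStart: number;
}

// Hypothetical result shape, for illustration only.
interface RequestSplit {
  beforeRequestMs: number;   // startTime -> requestStart (includes any cache check)
  untilFirstByteMs: number;  // requestStart -> responseStart
}

function splitAroundRequestStart(t: TimingFields): RequestSplit | null {
  // Assumption: zeroed timestamps mean the detailed timing is not exposed,
  // so no meaningful split can be computed for this entry.
  if (t.requestStart === 0 || t.responseStart === 0) {
    return null;
  }
  return {
    beforeRequestMs: t.requestStart - t.startTime,
    untilFirstByteMs: t.responseStart - t.requestStart,
  };
}
```

In a browser, the entries returned by `performance.getEntriesByType("resource")` carry these fields and could be passed in directly. Note that `beforeRequestMs` also covers redirects, DNS, and connection setup when they occur, so it is an upper bound on the cache check rather than a direct measurement of it.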

> > Well, it is privacy sensitive, but I agree.. possible to obtain today if you control the resource, or if the origin gives you the right CORS bits to read its headers. What would you do with this data?
>
> This can help answer questions like "what would the impact be if I start changing resources more frequently" or "are people using browsers that tend to have a long cache history".

Can't you already infer this via *size attributes? E.g., if you start changing resources more frequently, one would expect to see a bump in cumulative transfer sizes vs. revalidations vs. cache hits.

igrigorik, Nov 15 '16 00:11
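The suggestion above to infer cache behavior from the size attributes could be sketched roughly as follows. This is a heuristic under stated assumptions, not a specified algorithm: it assumes that a local cache hit reports a `transferSize` of zero with a non-zero `encodedBodySize`, and that a revalidation transfers only headers, so `transferSize` is non-zero but smaller than `encodedBodySize`. The names `SizeFields`, `FetchKind`, and `classifyBySize` are hypothetical.

```typescript
// Subset of the Resource Timing size attributes used by this sketch (bytes).
interface SizeFields {
  transferSize: number;
  encodedBodySize: number;
}

type FetchKind = "cache-hit" | "revalidated" | "network" | "unknown";

function classifyBySize(s: SizeFields): FetchKind {
  // Both zero: sizes are not exposed (or the body is empty), so we cannot tell.
  if (s.transferSize === 0 && s.encodedBodySize === 0) {
    return "unknown";
  }
  // Nothing went over the network, but a body exists: assume a local cache hit.
  if (s.transferSize === 0) {
    return "cache-hit";
  }
  // Fewer bytes transferred than the body size: assume headers only (revalidation).
  if (s.transferSize < s.encodedBodySize) {
    return "revalidated";
  }
  // Headers plus body were transferred.
  return "network";
}
```

Counting these categories per resource across many sessions would give the kind of aggregate view discussed in the thread, for example a shift from cache hits toward network fetches after resources start changing more often.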