got icon indicating copy to clipboard operation
got copied to clipboard

Fetching request from cache fails on certain websites

Open Aericio opened this issue 1 year ago • 6 comments

Describe the bug

  • Node.js version: 18.12.1
  • OS & version: Windows 10 @ 10.0.19044 Build 19044
  • Docker v4.14.0 on Redis v7.0.8

When sending a GET request to certain websites with caching enabled on got,

  • the first request will be successful, and the response is added to the cache.
  • subsequent requests are not successful, and instead causes node to freeze.

I noticed that this is not reproducible for all websites, but found that two websites, danbooru and tatsu, using Cloudflare that have their cf-cache-status set to DYNAMIC appear to consistently result in freezes after attempting subsequent requests.

Other websites that don't use CloudFlare, like GitHub on Varnish, or have cf-cache-status of HIT or REVALIDATED, like jsonplaceholder and oneskyapp, appear to be perfectly fine, with the response sent from cache.

Actual behavior

  • Send a GET request and saves response to cache.
  • A second request to something that is in cache will result in a freeze.

Expected behavior

  • Send a GET request and saves response to cache.
  • A second request should successfully return the cached response.

Code to reproduce

  • Start a Redis container using Docker -- https://hub.docker.com/_/redis
  • docker run --name some-redis -d -p 6379:6379 redis

See: https://github.com/Aericio/got-cache-issue

  • please look at the source code for extra comments.
  • yarn/npm install
  • yarn/npm run dev

You will notice that, on the first run of the code:

  • It will pass the first three tests.
  • It will run the first iteration of the fourth test, but hangs on the second iteration.
  • requesting is the final message, no still alive which should print after request has been completed.
What data is being logged?
{
    url: res.url,
    cache: res.isFromCache,
    "cf-cache-status": res.headers["cf-cache-status"], // cloudflare cache
    "x-cache": res.headers["x-cache"] // GitHub varnish cache
}
request log
requesting
{
  url: 'https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  cache: true,
  'cf-cache-status': undefined,
  'x-cache': 'HIT'
}
still alive
requesting
{
  url: 'https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  cache: true,
  'cf-cache-status': undefined,
  'x-cache': 'HIT'
}
still alive
requesting
{
  url: 'https://jsonplaceholder.typicode.com/todos/1',
  cache: true,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
requesting
{
  url: 'https://jsonplaceholder.typicode.com/todos/1',
  cache: true,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
requesting
{
  url: 'https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json',
  cache: true,
  'cf-cache-status': 'REVALIDATED',
  'x-cache': undefined
}
still alive
requesting
{
  url: 'https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json',
  cache: true,
  'cf-cache-status': 'REVALIDATED',
  'x-cache': undefined
}
still alive
requesting
{
  url: 'https://danbooru.donmai.us/posts.json',
  cache: false,
  'cf-cache-status': 'DYNAMIC',
  'x-cache': undefined
}
still alive
requesting
```</details>

<details>
<summary>redis keys</summary>
<br>

127.0.0.1:6379> flushall OK 127.0.0.1:6379> keys *

  1. "cacheable-request:GET:https://danbooru.donmai.us/posts.json"
  2. "namespace:cacheable-request"
  3. "cacheable-request:GET:https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json"
  4. "cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json"
  5. "cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1"

- A potential cause could be that `cf-cache-status` is `DYNAMIC` here, compared to `HIT` or `REVALIDATED`.
- On the fifth test, I found another website with `cf-cache-status: DYNAMIC` header.
- After commenting out the fourth test so that I could get to the fifth test, I was met with the same behavior.
- First request goes through when cache doesn't exist, but hangs after the request has been added to the cache.

<details>
<summary>request log</summary>
<br>
```json5
... (omitted) 
requesting
{
  url: 'https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json',
  cache: true,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
requesting
{
  url: 'https://tatsu.gg/lotties/tatsugotchi.json',
  cache: false,
  'cf-cache-status': 'DYNAMIC',
  'x-cache': undefined
}
still alive
requesting
redis keys
127.0.0.1:6379> flushall
OK
127.0.0.1:6379> keys *
1) "namespace:cacheable-request"
2) "cacheable-request:GET:https://tatsu.gg/lotties/tatsugotchi.json"
3) "cacheable-request:GET:https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json"
4) "cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json"
5) "cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1"
  • This issue is also present when using new Map() as the cache.
request log + map keys
requesting
{                                                                                                                                           
  url: 'https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  cache: false,                                                                                                                             
  'cf-cache-status': undefined,                                                                                                             
  'x-cache': 'HIT'                                                                                                                          
}                                                                                                                                           
still alive                                                                                                                                                 
[Map Iterator] {                                                                                                                                            
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json'
}                                                                                                                                                           
requesting                                                                                                                                                  
{                                                                                                                                                           
  url: 'https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',                
  cache: true,                                                                                                                                              
  'cf-cache-status': undefined,                                                                                                                             
  'x-cache': 'HIT'                                                                                                                                          
}                                                                                                                                                           
still alive                                                                                                                                                 
[Map Iterator] {
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json'
}
requesting
{
  url: 'https://jsonplaceholder.typicode.com/todos/1',
  cache: false,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
[Map Iterator] {
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  'cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1'
}
requesting
{
  url: 'https://jsonplaceholder.typicode.com/todos/1',
  cache: true,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
[Map Iterator] {
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  'cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1'
}
requesting
{
  url: 'https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json',
  cache: false,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
[Map Iterator] {
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  'cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1',
  'cacheable-request:GET:https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json'
}
requesting
{
  url: 'https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json',
  cache: true,
  'cf-cache-status': 'HIT',
  'x-cache': undefined
}
still alive
[Map Iterator] {
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  'cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1',
  'cacheable-request:GET:https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json'
}
requesting
{
  url: 'https://danbooru.donmai.us/posts.json',
  cache: false,
  'cf-cache-status': 'DYNAMIC',
  'x-cache': undefined
}
still alive
[Map Iterator] {
  'cacheable-request:GET:https://gist.githubusercontent.com/Aericio/3f20a5b4447b1ca6a7348f13987dc706/raw/d1a1001f17065a88acb71ff448de4a1a76db7077/data.json',
  'cacheable-request:GET:https://jsonplaceholder.typicode.com/todos/1',
  'cacheable-request:GET:https://support.oneskyapp.com/hc/en-us/article_attachments/202761727/example_2.json',
  'cacheable-request:GET:https://danbooru.donmai.us/posts.json'
}
requesting

Checklist

  • [x] I have read the documentation.
  • [x] I have tried my code with the latest version of Node.js and Got.

Aericio avatar Jan 30 '23 02:01 Aericio

cf-cache-status responses are noted here: https://developers.cloudflare.com/cache/about/default-cache-behavior/

HIT - The resource was found in Cloudflare’s cache. REVALIDATED - The resource is served from Cloudflare’s cache but is stale. The resource was revalidated by either an If-Modified-Since header or an If-None-Match header. DYNAMIC - Cloudflare does not consider the asset eligible to cache and your Cloudflare settings do not explicitly instruct Cloudflare to cache the asset. Instead, the asset was requested from the origin web server.

Is got doing something weird when CF is requesting from origin server instead of their edge?

Aericio avatar Jan 30 '23 02:01 Aericio

I'm having a similar issue but my remote does not use Cloudflare.

From some debugging, it seems that everything hangs in createCacheableRequest, more specifically: image

from what I see, when hangs happen, response.req is undefined, so response.complete is never set.

Any insights on how to debug this further?

marcoreni avatar May 05 '23 11:05 marcoreni

Hi @marcoreni and @Aericio - can you get me an example unit test or code snippet that fails with this so I can validate it and get this fixed?

jaredwray avatar Jun 07 '23 16:06 jaredwray

Hey @jaredwray, I've got a sample project that you can look at here: https://github.com/Aericio/got-cache-issue/.

Redis is not needed to reproduce this bug. You can uncomment the new Map() to switch.

Refer to the comments: the first three websites should work fine, the last two will freeze.

Aericio avatar Jun 10 '23 10:06 Aericio

Hi @Aericio and thanks for the example. I am going to be working on moving this case to unit tests in July and figuring out how to help on this. Will update here with progress in the next couple weeks.

jaredwray avatar Jun 23 '23 18:06 jaredwray

@Aericio - I have a friend who works on Keyv sometimes looking into this. Thanks for the test example!

jaredwray avatar Jul 15 '23 14:07 jaredwray