node-redis
How does `executeIsolated` impact watched keys?
Description
In your main README, the Transactions (Multi/Exec) section gives an example without calling `.watch()`. Below the example, there is a little blurb saying that you can first call `.watch()`, in which case, if any of the watched keys change before the transaction executes, Redis will abort it. Digging a bit further, this blurb also links to the Isolated Execution Guide, which differs from the main README in that it calls `.watch()`. However, before calling `.watch()`, an `isolatedClient` is created with the `executeIsolated` method.
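For reference, the pattern in the guide looks roughly like this (my paraphrase, not the guide's exact snippet; the key and value are placeholders):

```typescript
import { createClient } from 'redis';

const client = createClient();
await client.connect();

// executeIsolated runs the callback on a dedicated connection from the
// isolation pool; WATCH state lives on that connection only.
await client.executeIsolated(async (isolatedClient) => {
  await isolatedClient.watch('key');
  await isolatedClient
    .multi()
    .set('key', 'value')
    .exec(); // EXEC (or DISCARD) clears the WATCH state on this connection
});
```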
What exactly are the implications of using an `isolatedClient`? Do we need to use one if we are calling `.watch()`, and if so, why? Will a `WatchError` be thrown if a different connection is used between `.watch()` and `.exec()`?
For some background, I'm trying to debug some `WatchError`s in my project and have noticed that we aren't using an `isolatedClient`, and I'm wondering if that is the cause. I don't see any logs (or code paths) that would indicate that the watched keys are being changed after being watched:
```typescript
try {
  await this.client.watch(key);
  await this.client
    .multi()
    .hSet(key, some_value)
    .expire(key, some_expiration)
    .exec();
} catch (e) {
  if (e instanceof WatchError) {
    // seeing this block get hit often
  }
  throw e;
}
```
Thanks for any help you can provide here!
Redis (the server) stores the `WATCH` state at the connection/client level, so if you use the client for concurrent tasks (for example, with an HTTP server serving concurrent HTTP requests), you'll need to make sure to use an "isolated client" so that the `WATCH` state won't "leak" between concurrent requests.
Hope that makes sense. :)
Hmm it sort of makes sense.
We moved towards using `executeIsolated`, but we are still seeing transient `WatchError`s. It's extremely strange, since our logs show no competing writes for these keys. It's almost like the keys aren't being unwatched after previous `watch`/`multi`/`exec` calls, but my understanding is that `exec` should reset the watched keys and that I shouldn't call `unwatch`. Should I call `unwatch`?
We even have retries on the `watch`/`multi`/`exec` block that is transiently failing, but if it fails the first time, it fails for all the retries. So even if there were some competing call, I don't see how each retry would also fail. It seems more likely that the key's watch state was never cleared.
"but my understanding is that exec
should reset the watched keys and that I shouldn't call unwatch
" - AFAIK that's true.
```typescript
import { createClient } from '@redis/client';

const client = await createClient()
  .on('error', err => console.error('client error', err))
  .connect();

// this should fail
{
  const [, value] = await Promise.all([
    client.WATCH('key'),
    client.GET('key'),
    client.SET('key', '1'),
  ]);

  try {
    await client.MULTI()
      .SET('key', (Number(value) + 1).toString())
      .EXEC();
  } catch (err) {
    // should throw WatchError
    console.error(err);
  }
}

// this should work
{
  const [, value] = await Promise.all([
    client.WATCH('key'),
    client.GET('key')
  ]);

  console.log(
    await client.MULTI()
      .SET('key', (Number(value) + 1).toString())
      .EXEC()
  );
}

await client.disconnect();
```
Can you please share some code?
Here's a somewhat-cleaned version of what we are doing. And in case it's relevant, it looks like we're on [email protected].
Thanks for the fast response and for looking at this!
```typescript
import { createClient, WatchError } from 'redis';
import type { RedisClientType } from 'redis';
import sleep from './aSleepImplementation';

// REDIS_* constants, MAX_CONNECTION_RETRIES, RETRY_WAIT_TIME, maxRetryAttempts,
// delay, maxDelay, some_value, and some_expiration are redacted config values.

export const buildRedisClient = (): RedisClientType =>
  createClient({
    socket: {
      port: REDIS_PORT,
      host: REDIS_HOST,
      reconnectStrategy: (retries: number) => {
        if (retries > MAX_CONNECTION_RETRIES) {
          return undefined;
        }
        return RETRY_WAIT_TIME;
      },
    },
    database: REDIS_DB,
  });

export class OurRedisClient {
  protected readonly client: RedisClientType;
  private connecting: boolean;

  constructor() {
    this.connecting = false;
    this.client = buildRedisClient();
    this.client.on('connect', () => {
      this.connecting = true;
    });
    this.client.on('ready', () => {
      this.connecting = false;
    });
    this.client.on('end', () => {
      this.connecting = false;
    });
    this.client.on('error', (err) => {
      this.connecting = false;
      // some error handling
    });
    this.client.on('reconnecting', () => {
      this.connecting = true;
    });
  }

  async connect() {
    return this.client.connect();
  }

  async quit(quitAttempts = 0) {
    if (this.connecting && quitAttempts < 5) {
      await sleep(500 * quitAttempts);
      await this.quit(quitAttempts + 1);
      return;
    }
    return this.client.quit();
  }

  async setSomeStuff(key: string, value: number, expiration: number) {
    for (let retryAttempts = 0; retryAttempts < maxRetryAttempts; retryAttempts++) {
      try {
        await this.client.executeIsolated(async (isolatedClient) => {
          await isolatedClient.watch(key);
          const multi = isolatedClient.multi();
          await multi
            .hSet(key, value)
            .expire(key, expiration)
            .exec();
        });
        break;
      } catch (e) {
        if (e instanceof WatchError) {
          // This randomly gets hit. When it does, each subsequent retry hits this too
          if (retryAttempts + 1 === maxRetryAttempts) {
            throw e;
          }
          const backoff = Math.min(Math.pow(2, retryAttempts) * delay, maxDelay);
          const jitter = Math.random() * 200;
          await sleep(backoff + jitter);
        } else {
          throw e;
        }
      }
    }
  }
}

class ClassThatUsesOurRedisClient {
  redisClient: OurRedisClient;
  keys: string[];

  constructor(keys: string[]) {
    this.keys = keys;
  }

  async methodThatDoesSomeStuff() {
    this.redisClient = new OurRedisClient();
    await this.redisClient.connect();
    const promises = this.keys.map((key) =>
      this.redisClient.setSomeStuff(key, some_value, some_expiration)
    );
    await Promise.all(promises);
  }

  async aCleanupMethodCalledLater() {
    await this.redisClient.quit();
  }
}
```
@leibale Any ideas here by chance?
I actually had a similar issue, and it turned out that in one location I forgot to call `multi()` on the `isolatedClient` (I was calling it on the main client). So it did indeed look like the "watch leaked" into the next call, as you describe. I guess I was lucky and it happened every time.
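The mistake looked roughly like this (a hypothetical reconstruction, not my actual code; the key is a placeholder):

```typescript
import { createClient } from 'redis';

const client = createClient();
await client.connect();
const key = 'some:key';

await client.executeIsolated(async (isolatedClient) => {
  await isolatedClient.watch(key);
  await client           // BUG: should be isolatedClient
    .multi()
    .set(key, 'value')
    .exec();
});
// The WATCH sits on the isolated connection, but MULTI/EXEC run on the shared
// main connection, so the transaction isn't guarded by the intended WATCH, and
// any WATCH state already on the main connection applies to it instead.
```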
@emovla Thanks for the response! I wish that were it, but we're actually always calling `multi()` on the `isolatedClient`. My code snippet is a pretty good skeleton/redacted version of what we're doing.
@jedkass I'll try to see if I have time to debug it tomorrow...