kvrocks icon indicating copy to clipboard operation
kvrocks copied to clipboard

SCAN MATCH may incorrectly skip certain keys in some case

Open Genuineh opened this issue 6 months ago • 4 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

Version

2.12.1

Minimal reproduce step

1.Run a kvrocks in cluster mode 2.Add test code with new folder like scan_bug under test/gocase/unit and run test

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package scan_bug

import (
	"context"
	"fmt"
	"strings"
	"sync"
	"testing"

	"github.com/apache/kvrocks/tests/gocase/util"
	"github.com/redis/go-redis/v9"
	"github.com/stretchr/testify/require"
)

func TestScanBugWithNumberCursor(t *testing.T) {
	srv := util.StartServer(t, map[string]string{})
	defer srv.Close()
	ctx := context.Background()
	// rdb := srv.NewClient()
	rdb := redis.NewUniversalClient(&redis.UniversalOptions{
		Addrs: []string{
			yourhostport,
			yourhostport
		},
	})
	defer func() { require.NoError(t, rdb.Close()) }()
	require.NoError(t, rdb.ConfigSet(ctx, "redis-cursor-compatible", "yes").Err())
	ScanTest(t, rdb, ctx)
}

func ScanTest(t *testing.T, rdb redis.UniversalClient, ctx context.Context) {

	t.Run("SCAN MATCH BUG", func(t *testing.T) {
		require.NoError(t, rdb.FlushAll(ctx).Err())
		universalPopulate(t, rdb, "test:B:{1}:info", 3, 10)
		universalPopulate(t, rdb, "test:A:{1}:info", 3, 10)
		keys, _ := scanAllNodes(ctx, rdb, "test:B:*")
		keys2, _ := scanAllNodes(ctx, rdb, "test:*")
		require.Len(t, keys2, 6)
		require.Len(t, keys, 3)
	})
}

func scanAllNodes(ctx context.Context, rdb redis.UniversalClient, pattern string) ([]string, error) {
	if clusterClient, ok := rdb.(*redis.ClusterClient); ok {
		var keys []string
		var mu sync.Mutex
		var count int64 = 100

		// Scan keys in the cluster using ForEachMaster
		err := clusterClient.ForEachMaster(ctx, func(ctx context.Context, client *redis.Client) error {
			var cursor uint64
			for {
				var tempKeys []string
				var err error
				tempKeys, cursor, err = client.Scan(ctx, cursor, pattern, count).Result()
				if err != nil {
					return err
				}
				mu.Lock()
				for _, key := range tempKeys {
					fmt.Println("Key:", key)
					keys = append(keys, key)
				}
				mu.Unlock()
				if cursor == 0 {
					break
				}
			}
			return nil
		})
		if err != nil {
			panic(err)
		}

		return keys, nil
	}

	singleNodeClient, ok := rdb.(*redis.Client)
	if !ok {
		return nil, fmt.Errorf("NOT SUPPORT CLIENT TYPE")
	}

	return scanNodeSingle(ctx, singleNodeClient, pattern)
}

func scanNodeSingle(ctx context.Context, node *redis.Client, pattern string) ([]string, error) {
	var keys []string
	var cursor uint64
	var count int64 = 100

	for {
		var scanResult []string
		var err error
		scanResult, cursor, err = node.Scan(ctx, cursor, pattern, count).Result()
		if err != nil {
			return nil, fmt.Errorf("EXEC SCAN FAIL: %v", err)
		}

		keys = append(keys, scanResult...)

		if cursor == 0 {
			break
		}
	}

	return keys, nil
}

func universalPopulate(t testing.TB, rdb redis.UniversalClient, prefix string, n, size int) {
	ctx := context.Background()
	p := rdb.Pipeline()

	for i := 0; i < n; i++ {
		p.Do(ctx, "SET", fmt.Sprintf("%s%d", prefix, i), strings.Repeat("A", size))
	}

	_, err := p.Exec(ctx)
	require.NoError(t, err)
}

What did you expect to see?

the test pass

What did you see instead?

the test fail

Anything Else?

may be get a error cursor when the test running but pass in single mode

if (!cursor.empty()) {
    iter->Seek(ns_cursor);
    if (iter->Valid()) {
      iter->Next();
    }
  } else if (ns_prefix.empty()) {
    iter->SeekToFirst();
  } else {
    iter->Seek(ns_prefix);
  }

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

Genuineh avatar Jun 10 '25 08:06 Genuineh

github copilot used claude sonnet 4

基于对SCAN命令相关代码的分析,我现在来详细解释这个bug的原因和修复方案。

问题分析

1. 问题复现情况

从issue描述可以看出:

  • 在集群模式下运行测试时失败
  • 单机模式下测试通过
  • 测试代码中使用了SCAN MATCH来查找特定模式的键

2. 关键代码分析

让我查看数据库的SCAN实现:现在让我查看具体的Database::Scan实现代码:基于我对代码的分析,我现在可以为你详细分析Apache Kvrocks中issue #3023的SCAN MATCH bug问题:

问题分析

1. 问题描述

在集群模式下,使用SCAN MATCH可能会错误地跳过某些匹配的key。测试用例显示:

  • 写入6个key:test:B:{1}:info0-2test:A:{1}:info0-2
  • 使用 SCAN MATCH test:B:* 应该返回3个key,但实际可能返回不完整的结果
  • 使用 SCAN MATCH test:* 应该返回6个key

2. 根本原因分析

通过查看 src/storage/redis_db.cc 中的 Database::Scan 方法(第318-417行),发现问题出现在集群模式下的cursor处理逻辑:

问题代码段:

if (!cursor.empty()) {
    iter->Seek(ns_cursor);
    if (iter->Valid()) {
      iter->Next();  // 这里直接跳过了cursor位置的key
    }
  } else if (ns_prefix.empty()) {
    iter->SeekToFirst();
  } else {
    iter->Seek(ns_prefix);
  }

具体问题:

  1. Cursor定位后立即跳过:当cursor不为空时,代码先定位到cursor位置,然后立即调用 iter->Next() 跳过这个位置。这是为了避免重复返回上次扫描的最后一个key。

  2. 集群模式下的Slot切换:在集群模式下,当扫描到达slot边界时,代码会切换到下一个slot(第390-415行):

if (++slot_id >= HASH_SLOTS_SIZE) {
  break;
}
// ... 
ns_prefix = ComposeNamespaceKey(namespace_, "", false);
PutFixed16(&ns_prefix, slot_id);
ns_prefix.append(prefix);
iter->Seek(ns_prefix);  // 重新定位到新slot,但没有考虑cursor
  1. Cursor在Slot切换时失效:当从一个slot切换到另一个slot时,原来的cursor可能指向前一个slot的最后一个key,但新的slot开始扫描时没有正确处理这个cursor,可能导致跳过新slot中应该匹配的key。

3. 造成问题的场景

  1. 多Slot分布:在集群模式下,key分布在不同的hash slot中
  2. Pattern匹配:使用MATCH参数时,某些匹配的key可能恰好在slot边界附近
  3. Cursor传递:在slot切换过程中,cursor的传递和处理存在逻辑缺陷

4. 修复方案

方案1:改进cursor处理逻辑

// 在slot切换时,需要重新评估cursor的有效性
if (!cursor.empty()) {
    iter->Seek(ns_cursor);
    if (iter->Valid()) {
      // 检查当前key是否已经被处理过
      // 只有在确实需要跳过时才调用Next()
      std::tie(std::ignore, user_key) = ExtractNamespaceKey<std::string>(iter->key(), storage_->IsSlotIdEncoded());
      if (user_key == cursor) {
        iter->Next();
      }
    }
}

方案2:优化slot切换逻辑

// 在切换slot时,保持cursor状态的连续性
if (slot_id > slot_start) {
    // 当切换到新slot时,重置cursor状态
    ns_prefix = ComposeNamespaceKey(namespace_, "", false);
    PutFixed16(&ns_prefix, slot_id);
    ns_prefix.append(prefix);
    iter->Seek(ns_prefix);
    // 不要立即跳过,让正常的匹配逻辑处理
}

方案3:完整的修复(推荐) 需要重构cursor和slot切换的协调机制:

  1. 分离cursor逻辑:将cursor处理逻辑与slot切换逻辑分离
  2. 状态跟踪:在slot切换时维护扫描状态的连续性
  3. 边界处理:正确处理slot边界处的key扫描

5. 测试验证

建议的测试场景:

  1. 在集群模式下创建分布在不同slot的key
  2. 使用不同的MATCH pattern进行扫描
  3. 验证所有匹配的key都能被正确返回
  4. 特别关注slot边界附近的key

这个bug是一个典型的分布式系统中状态管理问题,需要仔细处理cursor在不同slot之间的传递和状态维护。基于对issue #3023的深入分析,我已经完成了对SCAN MATCH bug的技术分析并起草了修复方案的issue。

问题根本原因

这个bug的核心问题在于cluster模式下cursor处理逻辑不当,具体表现在:

  1. 不当的Next()调用:在src/storage/redis_db.cc的341-345行,当有cursor时无条件调用iter->Next(),这会跳过应该匹配的key
  2. slot切换时的cursor状态丢失:在390-415行的slot切换逻辑中,没有正确处理cursor状态,导致跨slot扫描时遗漏keys
  3. cursor验证缺失:缺少对cursor是否真正对应上次返回key的验证

可能的修复方案

  1. 改进cursor验证:只有当当前位置确实是cursor指向的key时才跳过
  2. 修复slot切换逻辑:确保cursor状态在slot间转换时得到正确维护
  3. 全面重构:分离cursor逻辑和slot切换逻辑,确保状态一致性

Genuineh avatar Jun 11 '25 01:06 Genuineh

@Genuineh Would you like to fix this?

git-hulk avatar Jun 11 '25 01:06 git-hulk

@git-hulk I can give it a try, but I'll first fix it in my own version before considering porting it to the official repository. This might take some time.

Genuineh avatar Jun 11 '25 01:06 Genuineh

@Genuineh Great, take your time.

git-hulk avatar Jun 11 '25 02:06 git-hulk

Hmmm it seems solved via #3146 ?

PragmaTwice avatar Aug 27 '25 12:08 PragmaTwice

Yes, it's the same issue.

git-hulk avatar Aug 27 '25 13:08 git-hulk

Closed. Feel free to reopen if it is not solved.

PragmaTwice avatar Aug 27 '25 13:08 PragmaTwice