[CRASH] Valkey 7/8/9 crashes when unloading a module if a command from that module was added to an ACL user.
Crash report
84190:M 03 Nov 2025 09:24:54.458 * Module helloworld unloaded
=== REDIS BUG REPORT START: Cut & paste starting from here ===
84190:M 03 Nov 2025 09:24:54.458 # === ASSERTION FAILED ===
84190:M 03 Nov 2025 09:24:54.458 # ==> acl.c:672 'res == C_OK' is not true
------ STACK TRACE ------
Backtrace:
0 valkey-server 0x000000010646d20a ACLRecomputeCommandBitsFromCommandRulesAllUsers + 618
1 valkey-server 0x000000010644a2d1 moduleUnload + 897
2 valkey-server 0x000000010644bf87 moduleCommand + 615
3 valkey-server 0x00000001063645ba call + 362
4 valkey-server 0x000000010636654c processCommand + 4524
5 valkey-server 0x0000000106387a30 processCommandAndResetClient + 64
6 valkey-server 0x0000000106387d44 processInputBuffer + 516
7 valkey-server 0x000000010637ddd1 readQueryFromClient + 1185
8 valkey-server 0x000000010647701e callHandler + 46
9 valkey-server 0x00000001064764cd connSocketEventHandler + 365
10 valkey-server 0x000000010635300a aeProcessEvents + 554
11 valkey-server 0x00000001063537df aeMain + 63
12 valkey-server 0x000000010636f485 main + 3397
13 dyld 0x00007ff80f6e1530 start + 3056
------ INFO OUTPUT ------
# Server
redis_version:7.2.4
server_name:valkey
valkey_version:7.2.11
redis_git_sha1:97b6663c
redis_git_dirty:0
redis_build_id:81bab35838f643ec
redis_mode:standalone
os:Darwin 24.6.0 x86_64
arch_bits:64
monotonic_clock:POSIX clock_gettime
multiplexing_api:kqueue
atomicvar_api:c11-builtin
gcc_version:4.2.1
process_id:84190
process_supervised:no
run_id:422222010c33a4f130475c90742f6b879c771625
tcp_port:6379
server_time_usec:1762190694458110
uptime_in_seconds:82
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:583014
executable:/Users/dpolyako/github/valkey-io/valkey7/src/valkey-server
config_file:
io_threads_active:0
listener0:name=tcp,bind=*,bind=-::*,port=6379
# Clients
connected_clients:1
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:16
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
total_blocking_keys:0
total_blocking_keys_on_nokey:0
# Memory
used_memory:5411424
used_memory_human:5.16M
used_memory_rss:8048640
used_memory_rss_human:7.68M
used_memory_peak:5612592
used_memory_peak_human:5.35M
used_memory_peak_perc:96.42%
used_memory_overhead:1151552
used_memory_startup:1149216
used_memory_dataset:4259872
used_memory_dataset_perc:99.95%
allocator_allocated:5401824
allocator_active:8015872
allocator_resident:8015872
total_system_memory:34359738368
total_system_memory_human:32.00G
used_memory_lua:32768
used_memory_vm_eval:32768
used_memory_lua_human:32.00K
used_memory_scripts_eval:0
number_of_cached_scripts:0
number_of_functions:0
number_of_libraries:0
used_memory_vm_functions:33792
used_memory_vm_total:66560
used_memory_vm_total_human:65.00K
used_memory_functions:216
used_memory_scripts:216
used_memory_scripts_human:216B
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.48
allocator_frag_bytes:2614048
allocator_rss_ratio:1.00
allocator_rss_bytes:0
rss_overhead_ratio:1.00
rss_overhead_bytes:32768
mem_fragmentation_ratio:1.49
mem_fragmentation_bytes:2646816
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_total_replication_buffers:0
mem_clients_slaves:0
mem_clients_normal:1824
mem_cluster_links:0
mem_aof_buffer:0
mem_allocator:libc
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
# Persistence
loading:0
async_loading:0
current_cow_peak:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1762190612
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_saves:0
rdb_last_cow_size:0
rdb_last_load_keys_expired:0
rdb_last_load_keys_loaded:5
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_rewrites:0
aof_rewrites_consecutive_failures:0
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
# Stats
total_connections_received:1
total_commands_processed:2
instantaneous_ops_per_sec:0
total_net_input_bytes:131
total_net_output_bytes:206460
total_net_repl_input_bytes:0
total_net_repl_output_bytes:0
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
instantaneous_input_repl_kbps:0.00
instantaneous_output_repl_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:1
evicted_keys:0
evicted_clients:0
total_eviction_exceeded_time:0
current_eviction_exceeded_time:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
pubsubshard_channels:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
total_active_defrag_time:0
current_active_defrag_time:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:3
total_writes_processed:5
io_threaded_reads_processed:0
io_threaded_writes_processed:0
reply_buffer_shrinks:1
reply_buffer_expands:0
eventloop_cycles:821
eventloop_duration_sum:119226
eventloop_duration_cmd_sum:1917
instantaneous_eventloop_cycles_per_sec:9
instantaneous_eventloop_duration_usec:150
acl_access_denied_auth:0
acl_access_denied_cmd:0
acl_access_denied_key:0
acl_access_denied_channel:0
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:abd2951b0471beaf5b55b6d68bf4a0ffd5b8fd13
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
# CPU
used_cpu_sys:0.092975
used_cpu_user:0.049648
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
# Modules
# Commandstats
cmdstat_command|docs:calls=1,usec=1878,usec_per_call=1878.00,rejected_calls=0,failed_calls=0
cmdstat_acl|setuser:calls=1,usec=39,usec_per_call=39.00,rejected_calls=0,failed_calls=0
# Errorstats
# Latencystats
latency_percentiles_usec_command|docs:p50=1884.159,p99=1884.159,p99.9=1884.159
latency_percentiles_usec_acl|setuser:p50=39.167,p99=39.167,p99.9=39.167
# Cluster
cluster_enabled:0
# Keyspace
db0:keys=5,expires=0,avg_ttl=0
------ CLIENT LIST OUTPUT ------
id=4 addr=127.0.0.1:59466 laddr=127.0.0.1:6379 fd=8 name= age=23 idle=0 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=45 qbuf-free=16845 argv-mem=22 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=18750 events=r cmd=module|unload user=default redir=-1 resp=2 lib-name= lib-ver=
------ CURRENT CLIENT INFO ------
id=4 addr=127.0.0.1:59466 laddr=127.0.0.1:6379 fd=8 name= age=23 idle=0 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=45 qbuf-free=16845 argv-mem=22 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=18750 events=r cmd=module|unload user=default redir=-1 resp=2 lib-name= lib-ver=
argc: '3'
argv[0]: '"module"'
argv[1]: '"unload"'
argv[2]: '"helloworld"'
------ EXECUTING CLIENT INFO ------
id=4 addr=127.0.0.1:59466 laddr=127.0.0.1:6379 fd=8 name= age=23 idle=0 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=45 qbuf-free=16845 argv-mem=22 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=18750 events=r cmd=module|unload user=default redir=-1 resp=2 lib-name= lib-ver=
argc: '3'
argv[0]: '"module"'
argv[1]: '"unload"'
argv[2]: '"helloworld"'
------ MODULES INFO OUTPUT ------
------ CONFIG DEBUG OUTPUT ------
sanitize-dump-payload no
lazyfree-lazy-eviction no
list-compress-depth 0
repl-diskless-load disabled
lazyfree-lazy-user-del no
lazyfree-lazy-user-flush no
io-threads-do-reads no
slave-read-only yes
io-threads 1
proto-max-bulk-len 512mb
activedefrag no
lazyfree-lazy-expire no
replica-read-only yes
client-query-buffer-limit 1gb
lazyfree-lazy-server-del no
repl-diskless-sync yes
=== REDIS BUG REPORT END. Make sure to include from START to END. ===
Additional information
compile sample modules:
cd src/modules/
make
run valkey with a sample module:
src/valkey-server --loadmodule src/modules/helloworld.so --enable-module-command yes
connect via valkey-cli:
acl setuser default +hello.simple
module unload helloworld
The code crashes in acl.c, in ACLRecomputeCommandBitsFromCommandRulesAllUsers:
int res = ACLSetSelector(selector, argv[i], sdslen(argv[i]));
serverAssert(res == C_OK);
Reproduced on macOS and Linux.
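To illustrate why the replay fails after the unload, here is a minimal, self-contained toy model. It is not Valkey code: `toy_apply_rule`, `toy_replay`, and the `commands` table are hypothetical stand-ins, and the real ACLSetSelector handles many more rule types than plain `+cmd` / `-cmd`. The sketch only shows that a stored rule stops being applicable once its command disappears from the command table.

```c
#include <assert.h>
#include <string.h>

#define C_OK 0
#define C_ERR -1

/* Toy command table standing in for the server's command dictionary. */
static const char *commands[8];
static int num_commands = 0;

static int command_exists(const char *name) {
    for (int i = 0; i < num_commands; i++)
        if (strcmp(commands[i], name) == 0) return 1;
    return 0;
}

/* Hypothetical stand-in for applying a single "+cmd" or "-cmd" rule:
 * it fails when the named command is no longer registered. */
static int toy_apply_rule(const char *rule) {
    if (rule[0] != '+' && rule[0] != '-') return C_ERR;
    return command_exists(rule + 1) ? C_OK : C_ERR;
}

/* Replays the stored rules and returns how many could not be applied.
 * The real code asserts on the first failure instead of counting. */
static int toy_replay(const char **rules, int n) {
    assert(rules != NULL || n == 0);
    int failed = 0;
    for (int i = 0; i < n; i++)
        if (toy_apply_rule(rules[i]) != C_OK) failed++;
    return failed;
}
```

In this model, shrinking `num_commands` plays the role of unloading the module: the same rule list that replayed cleanly before now contains one rule that cannot be applied.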
@roshkhatri - thank you for your time helping investigate this
So we're trying to apply argv[i] that is "+hello.simple", right?
Maybe the valid behaviour in this case is to try to apply the rule and skip it if we can't. That would be possible by replacing serverAssert(res == C_OK); after int res = ACLSetSelector(selector, argv[i], sdslen(argv[i])); with something like:
if (res != C_OK) {
    serverLog(LL_WARNING,
              "Skipping invalid ACL rule '%s' for user '%s': command not found",
              argv[i], u->name);
}
This would work but would have some issues, since we recompute the command bits from selector->command_rules.
So IMO we can do one of the following:
- Clean up selector->command_rules from moduleUnregisterCleanup, removing all of the module's categories and [sub]commands, because:
  - otherwise command_rules will keep all the stale command rules from every unloaded module.
  - also, if you reload the same module, users who previously had access will silently regain access to it, which doesn't seem safe.
- Or it will be the users'/admins' responsibility to remove the permissions for these module commands from all users before unloading the module.
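As a rough sketch of the first option, the helper below strips every rule that targets a given command (or one of its subcommands) from a rule string. It assumes the rules are held as a single space-separated string such as "+@all -hello.simple"; `drop_rules_for_command` and `rule_targets_command` are hypothetical names, not existing Valkey functions, and the comparison is case-sensitive for simplicity. The actual fix may look quite different.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the rule token [tok, tok+len) is "+cmd", "-cmd",
 * "+cmd|sub" or "-cmd|sub" for the given command name. */
static int rule_targets_command(const char *tok, size_t len, const char *cmd) {
    if (len < 2 || (tok[0] != '+' && tok[0] != '-')) return 0;
    size_t clen = strlen(cmd);
    tok++;
    len--;
    if (len < clen || strncmp(tok, cmd, clen) != 0) return 0;
    return len == clen || tok[clen] == '|';
}

/* Copies `rules` into `out`, skipping every rule that targets `cmd`.
 * Returns the number of rules dropped, or -1 if `out` is too small. */
static int drop_rules_for_command(const char *rules, const char *cmd,
                                  char *out, size_t outsz) {
    assert(rules != NULL && cmd != NULL && out != NULL);
    if (outsz == 0) return -1;
    int dropped = 0;
    size_t used = 0;
    out[0] = '\0';
    const char *p = rules;
    while (*p) {
        while (*p == ' ') p++; /* skip separators */
        if (!*p) break;
        size_t len = strcspn(p, " ");
        if (rule_targets_command(p, len, cmd)) {
            dropped++;
        } else {
            size_t need = len + (used ? 1 : 0);
            if (used + need + 1 > outsz) return -1;
            if (used) out[used++] = ' ';
            memcpy(out + used, p, len);
            used += len;
            out[used] = '\0';
        }
        p += len;
    }
    return dropped;
}
```

Dropping both the `+` and the `-` forms matters here, because either one is enough to trigger the assertion, and removing them also means a reloaded module does not silently inherit old permissions.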
@dmitrypol does it also crash if we do it in the following order?
make
run valkey with a sample module:
src/valkey-server --loadmodule src/modules/helloworld.so --enable-module-command yes
connect via valkey-cli:
acl setuser default +hello.simple
acl setuser default -hello.simple
module unload helloworld
Cleanup the selector->command_rules from moduleUnregisterCleanup
Seems to be the most thorough solution to the problem
@roshkhatri it does crash with "+hello.simple" followed by "-hello.simple". The "Applying -hello.simple" line is a debug print I added.
7451:M 04 Nov 2025 19:40:35.938 * Module 'helloworld' loaded from src/modules/helloworld.so
7451:M 04 Nov 2025 19:40:35.938 * Server initialized
7451:M 04 Nov 2025 19:40:35.938 * Ready to accept connections tcp
7451:M 04 Nov 2025 19:41:16.097 * Module helloworld unloaded
Applying -hello.simple
=== VALKEY BUG REPORT START: Cut & paste starting from here ===
7451:M 04 Nov 2025 19:41:16.097 # === ASSERTION FAILED ===
7451:M 04 Nov 2025 19:41:16.097 # ==> acl.c:708 'res == C_OK' is not true
------ STACK TRACE ------
Backtrace:
0 valkey-server 0x0000000104e62c24 ACLRecomputeCommandBitsFromCommandRulesAllUsers + 492
1 valkey-server 0x0000000104e46668 moduleUnload + 748
2 valkey-server 0x0000000104e47a98 moduleCommand + 220
3 valkey-server 0x0000000104d44168 call + 428
4 valkey-server 0x0000000104d462d0 processCommand + 3384
5 valkey-server 0x0000000104d615ec processInputBuffer + 712
6 valkey-server 0x0000000104d6085c readQueryFromClient + 160
7 valkey-server 0x0000000104e6da30 connSocketEventHandler + 180
8 valkey-server 0x0000000104d2a6e4 aeProcessEvents + 356
9 valkey-server 0x0000000104d54c18 main + 26880
10 dyld 0x0000000182d4eb98 start + 6076
------ STACK TRACE DONE ------
@roshkhatri - yes, it will crash with acl setuser default -hello.simple. As long as the ACL contains a module command with either + or -, it will crash.
Or it will be the users/admins responsibility to remove the permissions from all the users for these module commands before unloading the module
I would rather the system validate that no user has an ACL that includes any added commands or categories (I assume it would also crash on the categories?)
It would not crash on categories, since we don't unregister the categories when a module is unloaded. So a solution that resolves everything would need to clean up the command rules while unloading the module.
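For comparison, the validation idea raised above (refuse the unload while any user still references a module command) could be sketched roughly as follows. `toy_user`, `rules_mention_command`, and `first_user_blocking_unload` are hypothetical names for illustration only, and the rules are again assumed to be a space-separated string compared case-sensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy user record: a name plus its space-separated command rules. */
typedef struct {
    const char *name;
    const char *command_rules;
} toy_user;

/* Returns 1 if any rule in `rules` is "+cmd", "-cmd" or a subcommand form. */
static int rules_mention_command(const char *rules, const char *cmd) {
    size_t clen = strlen(cmd);
    const char *p = rules;
    while (*p) {
        while (*p == ' ') p++; /* skip separators */
        if (!*p) break;
        size_t len = strcspn(p, " ");
        if (len > clen && (p[0] == '+' || p[0] == '-') &&
            strncmp(p + 1, cmd, clen) == 0 &&
            (len - 1 == clen || p[1 + clen] == '|'))
            return 1;
        p += len;
    }
    return 0;
}

/* Returns the first user whose rules reference `cmd`, or NULL if none do.
 * A caller could reject the unload and report this user's name. */
static const toy_user *first_user_blocking_unload(const toy_user *users,
                                                  size_t n, const char *cmd) {
    assert(users != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (rules_mention_command(users[i].command_rules, cmd))
            return &users[i];
    return NULL;
}
```

This keeps the ACL untouched and makes the conflict visible to the admin, at the cost of requiring manual cleanup before every unload.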
PR for fix - https://github.com/valkey-io/valkey/pull/2923
cc: @roshkhatri