caoyutingtjpu
The log is as follows:

```
E0615 18:40:12.742552 22832 cluster_manager.cpp:1530] Manual failover user request accepted.
E0615 18:40:12.805608 22857 cluster_manager.cpp:5005] Received replication offset for paused master manual failover: 57021795 57021795
E0615 18:40:12.833029 22858 cluster_manager.cpp:2194] All master...
```
## `info all` output

```
# Server
redis_version:2.3.4-rocksdb-v5.13.4
redis_git_sha1:552a4365
redis_git_dirty:23
redis_build_id:4869811118804139172
redis_mode:cluster
TENDIS_DEBUG:OFF
os:Linux 3.10.0-514.21.1.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:asio
gcc_version:5:5:0
process_id:16293
tcp_port:10005
uptime_in_seconds:582532
uptime_in_days:6
config_file:/home/deploy/tendis/tendis_10005/tendis.conf

# Clients
connected_clients:23

# Memory
used_memory:-1
used_memory_human:-1
...
```
> You can reduce the batch size to lower GPU memory usage.

By batch size, do you mean the number of input tokens? In our tests, with batch_size at 128, if too many requests hit the serving process concurrently, the memory is still never released unless the serving process is restarted.
> import torch
>
> torch.cuda.empty_cache()  # release GPU memory after each inference run
>
> Hope this helps.

Thanks, we tried this approach before, but GPU memory usage still climbed very high as runtime increased; for now we limit the total GPU memory the process may use.
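For reference, `empty_cache()` only returns cached allocator blocks to the driver; it cannot reclaim memory still referenced by live tensors, which is one reason usage can keep growing across requests. The "cap the total GPU memory" workaround mentioned above might be sketched as follows (a minimal sketch, assuming PyTorch >= 1.8; `torch.cuda.set_per_process_memory_fraction` is the standard API for per-process caps, and the import guard lets the snippet run on machines without torch or a GPU):

```python
# Minimal sketch: cap per-process GPU memory and release cached blocks
# after each inference run. Assumes PyTorch >= 1.8 with CUDA available;
# the guard makes the snippet a no-op on machines without torch or a GPU.
try:
    import torch
    HAVE_CUDA = torch.cuda.is_available()
except ImportError:
    HAVE_CUDA = False

def setup_memory_cap(fraction=0.8):
    """Cap this process's GPU allocations at `fraction` of total device memory.
    Allocations beyond the cap raise an out-of-memory error instead of
    silently growing."""
    if HAVE_CUDA:
        torch.cuda.set_per_process_memory_fraction(fraction, device=0)

def release_cached_memory():
    """Return cached allocator blocks to the driver after an inference run.
    Note: this does NOT free memory still held by live tensors, so usage
    can keep rising if tensor references survive across requests."""
    if HAVE_CUDA:
        torch.cuda.empty_cache()

setup_memory_cap(0.8)
release_cached_memory()
```

Dropping all references to intermediate tensors (e.g. running inference under `torch.no_grad()` and not storing outputs) before calling `empty_cache()` is usually what actually lets the cache shrink.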