nano-vllm
RowParallelLayer with bias crash
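If a RowParallelLayer has a bias, its weight_loader can crash. weight_loader uses tp_dim to compute shard_size, and for this layer tp_dim = 1. The bias is a 1-D tensor, so it has no dimension 1 and the size lookup fails. Using -1 as the dimension when computing shard_size avoids the crash: it still selects dimension 1 for the 2-D weight, and it selects the only dimension of the 1-D bias.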
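The indexing problem can be reproduced without the library. The following is a minimal sketch that models parameter shapes as plain tuples. `shard_bounds` and the example shapes are hypothetical and only mirror the pattern described above (shard_size taken from the parameter's size along tp_dim); this is not nano-vllm's actual code.

```python
def shard_bounds(shape, tp_dim, tp_rank):
    """Return (dim, start, length) of the slice a rank would load.

    `shape` is the shape of the rank-local parameter as a tuple.
    Hypothetical helper: it only models shard_size = param.size(tp_dim).
    """
    shard_size = shape[tp_dim]  # IndexError if tp_dim is out of range for shape
    return tp_dim, tp_rank * shard_size, shard_size


weight_shape = (4096, 1024)  # assumed rank-local weight shape (2-D)
bias_shape = (4096,)         # bias is 1-D

# 2-D weight: dimension 1 exists, so tp_dim = 1 works.
print(shard_bounds(weight_shape, 1, 0))

# 1-D bias: dimension 1 does not exist, so tp_dim = 1 fails.
try:
    shard_bounds(bias_shape, 1, 0)
except IndexError as exc:
    print("bias with tp_dim=1 failed:", exc)

# With -1 the lookup resolves to the last dimension for both shapes.
print(shard_bounds(weight_shape, -1, 0))
print(shard_bounds(bias_shape, -1, 0))
```

In this sketch, -1 only fixes the size lookup. The start offset for the bias is still tp_rank * shard_size, so whether the bias should be sliced at all on ranks other than 0 is a separate question that would need checking against the real loader.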