Check failed: operand_dim < ins->operand(0)->shape().rank() (2 vs. 2) Does not support this kind of Gather.

Open caixiiaoyang opened this issue 1 year ago • 2 comments

Please describe the bug

Aborted (core dumped)

Please describe the expected behavior

I have two A100 GPUs. With alpa.PipeshardParallel() the model runs fine, but with alpa.ShardParallel() the process aborts with a core dump. The error occurs during the auto_sharding process:

Check failed: operand_dim < ins->operand(0)->shape().rank() (2 vs. 2) Does not support this kind of Gather.

I would like to know under what circumstances this error occurs. Can you provide some troubleshooting ideas? The specific error is shown in the screenshot below.

Screenshots

image

Code snippet to reproduce the problem

Additional information Add any other context about the problem here or include any logs that would be helpful to diagnose the problem.

caixiiaoyang avatar Sep 28 '23 03:09 caixiiaoyang

I also ran into this issue when trying to use alpa.ShardParallel() or alpa.PipeshardParallel() to auto-parallelize my llama model.

image

zigzagcai avatar Nov 22 '23 07:11 zigzagcai

I also ran into this problem while parallelizing llama. Has your problem been solved?

caixiiaoyang avatar Dec 07 '23 01:12 caixiiaoyang