ColossalAI
[BUG]: With Gemini, the number of GPUs must be a power of 2, otherwise assert chunk_size % self.pg_size == 0 fails
🐛 Describe the bug
When using Gemini, the number of GPUs must be a power of 2; otherwise the assertion assert chunk_size % self.pg_size == 0 fails.
The printed chunk_size is 40MB.
Environment
Multiple nodes, each with 8x 80G A100 GPUs, running the latest code.
@ver217 and @1SAA, can you take a look at this issue? I thought Gemini had implemented padding for chunks, so that 8 elements over 3 devices would be divided as (3, 3, 2), with the last shard of 2 padded to 3.
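For illustration, here is a minimal sketch of the padding idea described above. It is not ColossalAI's actual implementation: the helper names `pad_to_multiple` and `shard_sizes` are hypothetical, and treating the reported 40MB as `40 * 1024**2` units split over 24 ranks is only an assumption, since the issue does not say how `chunk_size` is measured or how many GPUs were used.

```python
def pad_to_multiple(chunk_size: int, pg_size: int) -> int:
    """Round chunk_size up to the nearest multiple of pg_size (hypothetical helper)."""
    remainder = chunk_size % pg_size
    if remainder == 0:
        return chunk_size
    return chunk_size + (pg_size - remainder)


def shard_sizes(num_elements: int, pg_size: int):
    """Return (padded per-rank size, real element count held by each rank)."""
    padded = pad_to_multiple(num_elements, pg_size)
    per_rank = padded // pg_size
    sizes = []
    remaining = num_elements
    for _ in range(pg_size):
        real = min(per_rank, remaining)  # the last rank may hold fewer real elements
        sizes.append(real)
        remaining -= real
    return per_rank, sizes


if __name__ == "__main__":
    # The example from the comment: 8 elements over 3 devices.
    per_rank, sizes = shard_sizes(8, 3)
    print(per_rank, sizes)

    # Assumed reading of the report: 40 * 1024**2 units over 24 ranks (3 nodes x 8 GPUs).
    chunk_size = 40 * 1024**2
    pg_size = 24
    print(chunk_size % pg_size == 0)  # the condition the assert checks
    print(pad_to_multiple(chunk_size, pg_size) % pg_size == 0)
```

Since 40 * 1024**2 has only 2 and 5 as prime factors, the unpadded check can only pass when the process-group size has no other prime factor, which would explain why a GPU count such as 24 trips the assertion. Rounding the chunk size up to a multiple of the group size removes that restriction.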
I'll fix this soon.