LLMZoo
Challenge on the training details of Phoenix
As reported in the technical report, a batch size of 256 seems quite large given the 2048 max sequence length. What hardware environment was used for fine-tuning Phoenix-7B? Specifically, how many A100-80GB GPUs were used?
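For context, here is a minimal sketch of how I am reasoning about it: the global batch size of 256 has to factor into per-device micro-batch size × gradient-accumulation steps × number of GPUs. The per-device batch size, accumulation steps, and resulting GPU count below are my own assumptions for illustration, not Phoenix's actual configuration.

```python
# Illustrative arithmetic only: the values for per-device batch size and
# gradient accumulation are assumptions, not the authors' reported setup.

global_batch_size = 256        # from the technical report
per_device_batch_size = 4      # assumed micro-batch per A100-80GB at seq len 2048
grad_accum_steps = 8           # assumed gradient-accumulation steps

# Number of GPUs implied by these assumptions.
num_gpus = global_batch_size // (per_device_batch_size * grad_accum_steps)
print(f"GPUs needed under these assumptions: {num_gpus}")  # -> 8
```

Knowing the actual per-device batch size and accumulation steps you used would pin down the GPU count, which is why I'm asking.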
Thanks!