
Questions about the paper

Open hiroshinoji opened this issue 11 months ago • 2 comments

First, great work! I read the paper and had a few questions.

  • On p. 5, the paper says that the minimal sequence length is s = 6c, but where does the 6 come from? Is it related to the 6bch of memory for the blocks?
  • About the memory requirement: if I understand correctly, the total memory for 6 blocks would be 12bch bytes (rather than 6bch), because each element is bfloat16 (2 bytes)?
  • Could the interconnect bandwidth given for TPUs be wrong? According to the table in https://cloud.google.com/blog/products/ai-machine-learning/introducing-cloud-tpu-v5p-and-ai-hypercomputer?hl=en, the ICI bandwidth per chip is 2,400 Gbps. My understanding is that this is the total across 6 links (which form a 3D torus), so each link is 400 Gbps, or 50 GB/s. Let me know if this interpretation is wrong.
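
To make the arithmetic behind the last two questions concrete, here is a minimal sketch in Python. It assumes that b, c, and h are the batch size, block size, and hidden size, that 6bch counts elements rather than bytes, and that the 2,400 Gbps figure is split evenly over 6 links. The function names and the example sizes are made up for illustration and do not come from the paper or the repo.

```python
# Hypothetical helpers for checking the numbers in the questions above.
BYTES_PER_BF16 = 2  # bfloat16 uses 16 bits per element


def block_memory_bytes(b, c, h, num_blocks=6, bytes_per_elem=BYTES_PER_BF16):
    """Bytes needed to hold `num_blocks` blocks of shape (b, c, h)."""
    return num_blocks * b * c * h * bytes_per_elem


def per_link_bandwidth_gbytes(total_gbps, num_links=6):
    """Per-link bandwidth in GB/s, assuming the total is split evenly."""
    per_link_gbps = total_gbps / num_links
    return per_link_gbps / 8  # 8 bits per byte


# Example sizes are assumptions chosen only to exercise the formulas.
b, c, h = 1, 4096, 4096
print("elements in 6 blocks:", 6 * b * c * h)
print("bytes in 6 blocks (bf16):", block_memory_bytes(b, c, h))
print("per-link bandwidth (GB/s):", per_link_bandwidth_gbytes(2400))
```

Under these assumptions the byte count is exactly twice the element count, which is where 12bch would come from, and the per-link figure is the per-chip total divided by 6 and then by 8.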

hiroshinoji, Mar 13 '24 10:03