llm.c

Enhance gradient norm calc in gpt2_update: reuse variables, clarify first pass logic, improve condition handling

Open · bgorlick opened this issue 8 months ago · 0 comments

This change improves the gradient norm calculation in gpt2_update by:

  • Reusing variables (ShardInfo tensor and ShardInfo shard) to reduce redundancy and improve readability.
  • Introducing an is_first_pass flag to make the first loop iteration explicit.
  • Slightly refining the condition handling (see the sketch after this list).
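A rough, hypothetical sketch of the intended structure, assuming a simplified single-process layout. The function name gradient_norm, the ShardInfo fields, and the grads_memory / param_tensors parameters are illustrative stand-ins, not the actual llm.c code, which lives in train_gpt2.cu and differs in detail:

```c
#include <math.h>
#include <stddef.h>

/* Illustrative stand-in for llm.c's shard bookkeeping; the real ShardInfo
 * may have different fields. */
typedef struct { size_t offset; size_t size; } ShardInfo;

/* Global L2 norm of the gradients, reusing `tensor`/`shard` across
 * iterations and using an explicit is_first_pass flag, as proposed above. */
float gradient_norm(const float* grads_memory,
                    const ShardInfo* param_tensors, int num_tensors) {
    float grad_norm_squared = 0.0f;
    int is_first_pass = 1;   /* true only on the first loop iteration */
    ShardInfo tensor;        /* reused instead of redeclaring per branch */
    ShardInfo shard;
    for (int i = 0; i < num_tensors; i++) {
        tensor = param_tensors[i];
        shard = tensor;      /* single-process case: shard == full tensor */
        if (is_first_pass) {
            /* In the multi-GPU path this is where the accumulation buffer
             * would be reset rather than accumulated into. */
            grad_norm_squared = 0.0f;
            is_first_pass = 0;
        }
        for (size_t j = 0; j < shard.size; j++) {
            float g = grads_memory[shard.offset + j];
            grad_norm_squared += g * g;   /* accumulate squared gradients */
        }
    }
    return sqrtf(grad_norm_squared);
}
```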

bgorlick · Jun 18 '24 01:06