Add Habana Gaudi (HPU) Support

Open · BartoszBLL opened this issue 7 months ago · 2 comments

This pull request adds support for running inference on Habana Gaudi (HPU) processors in a new, Gaudi-specific directory. It includes setup instructions, scripts for downloading GPT-2 models, a Jupyter notebook for running inference, and supporting files.

Changes Introduced

  • New directory: setup/05_accelerator_processors/01_habana_processing_unit/
  • Documentation:
    • README.md: Instructions for setting up and running GPT-2 inference on Habana Gaudi.
  • Notebook:
    • inference_on_gaudi.ipynb: Jupyter notebook demonstrating how to run inference on Gaudi, including performance comparisons against CPU.

Key Features

  • Provides setup instructions for installing necessary drivers and libraries.
  • Links to Habana documentation for further reading.
  • Implements an inference workflow optimized for Habana Gaudi.
  • Includes performance monitoring tools for CPU vs. HPU comparisons.
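
As a rough illustration of the CPU vs. HPU comparison mentioned above, the sketch below picks a device name and times a callable. It is not the code from the notebook: `pick_device` and `time_call` are hypothetical helper names, and the check only assumes that the Habana PyTorch bridge is installed as the `habana_frameworks` package. Finding the package does not guarantee that a Gaudi device is actually present.

```python
import importlib.util
import time


def pick_device() -> str:
    """Return "hpu" if the habana_frameworks package is importable, else "cpu".

    Assumption: a discoverable package is treated as a hint that an HPU may
    be usable. It does not prove that a device is attached.
    """
    try:
        found = importlib.util.find_spec("habana_frameworks") is not None
    except (ImportError, ValueError):
        found = False
    return "hpu" if found else "cpu"


def time_call(fn, repeats: int = 5, warmup: int = 1) -> float:
    """Return the average wall-clock seconds per call of fn.

    fn should block until its result is ready (for example by copying the
    output back to the host); otherwise asynchronous device execution can
    make the measurement look faster than it really is.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    device = pick_device()
    # Stand-in workload; in practice this would be one model forward pass.
    seconds = time_call(lambda: sum(range(100_000)))
    print(f"device={device} avg_seconds={seconds:.6f}")
```

To compare devices, one would call `time_call` once with the model on the CPU and once with it on the HPU, using the same inputs and the same number of repeats.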

Testing

  • Verified that inference runs successfully on a Gaudi HPU.

BartoszBLL · Mar 17 '25 22:03