Add MI300 details to docs

Open peterjunpark opened this issue 1 year ago • 0 comments

This PR updates the documentation with information about the MI300 series.

Performance model

L1

  • Update L1 cache line size to 128B for MI300 (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/vector-l1-cache.html#l1-cache-line-size) (64B for MI200).

UTCL1

  • Update UTCL1 hit-on-miss note to specify MI200 because it doesn't apply to MI300 (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/vector-l1-cache.html#l1-unified-translation-cache-utcl1).

L2

  • Update L2 cache line size text to cover MI300 (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/l2-cache.html#l2-cache-line-size) (128B on both MI300 and MI200).

  • Update channel count in text for MI300 (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/l2-cache.html#l2-cache-tcc)

  • Atomic requests (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/l2-cache.html#request-flow)

  • Add 128B read request metric to table (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/l2-cache.html#detailed-transaction-metrics)
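
To illustrate why the line-size notes above matter, here is a minimal sketch using only the sizes stated in this issue (L1: 64B on MI200, 128B on MI300; L2: 128B on both). The table name `CACHE_LINE_BYTES` and the helper `lines_touched` are hypothetical and are not part of the tool or its docs; the sketch just counts how many cache lines a contiguous byte range spans.

```python
# Cache line sizes in bytes, taken from the notes in this issue.
# The dictionary and helper below are illustrative names only.
CACHE_LINE_BYTES = {
    ("MI200", "L1"): 64,
    ("MI300", "L1"): 128,
    ("MI200", "L2"): 128,
    ("MI300", "L2"): 128,
}


def lines_touched(arch, level, start_byte, num_bytes):
    """Count the cache lines spanned by a contiguous byte range."""
    if num_bytes <= 0:
        return 0
    line = CACHE_LINE_BYTES[(arch, level)]
    first = start_byte // line                      # index of first line
    last = (start_byte + num_bytes - 1) // line     # index of last line
    return last - first + 1


# Example: 64 lanes each reading a 4-byte value from consecutive
# addresses, i.e. a contiguous 256-byte range starting at offset 0.
print(lines_touched("MI200", "L1", 0, 256))
print(lines_touched("MI300", "L1", 0, 256))
```

Under these assumptions the same 256-byte access spans twice as many L1 lines on MI200 as on MI300, which is the kind of difference the updated L1 text needs to reflect.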

VALU

  • Add MI300 to list of products with MFMA units (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/pipeline-descriptions.html#vector-arithmetic-logic-unit-valu)

AGPRs

  • Add MI300 to list here (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/pipeline-descriptions.html#accumulation-vector-general-purpose-registers-agprs) (512 KB based on https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html)

Scalar / Instruction cache

  • 64 KB, shared between 2 CUs on MI300 (https://advanced-micro-devices-demo--446.com.readthedocs.build/projects/rocprofiler-compute/en/446/conceptual/shader-engine.html#scalar-l1-data-cache-sl1d)

peterjunpark, Oct 09 '24 17:10