Pratyush Patel
The `HTTP_MCR` and `HTTP_RT` metrics appear to have the exact same values, but for different indices. Here's an example for `MSRTQps_0.csv` using Python `pandas`. Data file used: ``` >>> import...
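One way to check a claim like this (same values, different indices) is to compare the two metrics as multisets of values and, separately, compare the row positions they occupy. The snippet below is a minimal, dependency-free sketch on made-up data. The long format with `row`, `metric`, and `value` columns is an assumption for illustration and is not the confirmed schema of `MSRTQps_0.csv`.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in an assumed long format (one row per metric reading).
# The column names `row`, `metric`, and `value` are assumptions for illustration.
SAMPLE = """\
row,metric,value
0,HTTP_MCR,1.5
1,HTTP_RT,2.0
2,HTTP_MCR,2.0
3,HTTP_RT,1.5
"""


def values_by_metric(rows, metric):
    """Return (row index, value) pairs for the rows of one metric."""
    return [(int(r["row"]), float(r["value"])) for r in rows if r["metric"] == metric]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mcr = values_by_metric(rows, "HTTP_MCR")
rt = values_by_metric(rows, "HTTP_RT")

# Same values regardless of position: compare the values as multisets.
same_values = Counter(v for _, v in mcr) == Counter(v for _, v in rt)

# Same positions: compare the sets of row indices each metric occupies.
same_indices = {i for i, _ in mcr} == {i for i, _ in rt}

print(f"same values: {same_values}, same indices: {same_indices}")
```

If the values match while the indices do not, the two metrics carry the same numbers in different rows, which is the pattern described above.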
There are two CPU utilization numbers in the microservices traces:

1. node

| columns | Example Entry |
| ---------- | :-----------: |
| timestamp | 1000 |
| nodeid |...
**Describe the bug** Inference using the 20B pretrained model from README with [slim weights](https://the-eye.eu/public/AI/models/GPT-NeoX-20B/slim_weights/) and 20B.yml config runs out of memory on 8xA100 40GB GPUs. I tried varying `pipe-parallel-size` and...
Hello, I'm trying to benchmark inference performance of various LLMs using MII. I load models using:

```Python
import mii

mii_configs = {"tensor_parallel": 2, "dtype": "fp16", "max_tokens": 1500, "load_with_sys_mem": True}
mii.deploy(task="text-generation",...
```
I'm attempting to measure NVSwitch power usage using DCGM on a DGX-A100 machine:

```
❯ dcgmi group -l
+-------------------+----------------------------------------------------------+
| GROUPS                                                                       |
| 2 groups found.                                                              |
+===================+==========================================================+
| Groups...
```