Paweł (roy) Rościszewski
@Dubrzr could you please provide the output of the following command on your gpu2 server:

```
awk '{u=$2+$4; t=$2+$4+$5; if (NR==1){u1=u; t1=t;} else print ($2+$4-u1) * 100 / (t-t1); }'
```
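For context: that awk one-liner presumably compares two samples of the aggregate `cpu` line from `/proc/stat` and prints the CPU utilization between them. A rough Python equivalent of the same arithmetic (just a sketch for illustration, not TensorHive's actual code):

```python
import time

def cpu_sample():
    """Read the aggregate 'cpu' line from /proc/stat and return (busy, total)."""
    with open('/proc/stat') as f:
        fields = f.readline().split()
    # fields: ['cpu', user, nice, system, idle, ...]
    user, system, idle = int(fields[1]), int(fields[3]), int(fields[4])
    busy = user + system          # same as u = $2 + $4 in the awk script
    total = user + system + idle  # same as t = $2 + $4 + $5
    return busy, total

busy1, total1 = cpu_sample()
time.sleep(1)
busy2, total2 = cpu_sample()
print((busy2 - busy1) * 100 / (total2 - total1))  # CPU utilization in %
```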
Thanks... wrong intuition then... This is indeed the right endpoint. My sample output:

```json
{
  "ai": {
    "CPU": {
      "CPU_ai": {
        "index": 0,
        "metrics": {
          "mem_free": {
            "unit": "MiB",
            "value": ...
```
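Judging from the shape of that sample, the payload nests hostname → resource type → device → metrics. A quick way to flatten it for inspection (a sketch based only on the structure visible above; the metric value is made up):

```python
# Shape assumed from the sample above: hostname -> resource type -> device -> metrics
sample = {
    "ai": {
        "CPU": {
            "CPU_ai": {
                "index": 0,
                "metrics": {
                    "mem_free": {"unit": "MiB", "value": 12345}  # value invented for illustration
                }
            }
        }
    }
}

for hostname, resources in sample.items():
    for resource_type, devices in resources.items():
        for device_name, device in devices.items():
            for metric_name, metric in device["metrics"].items():
                print(f"{hostname}/{resource_type}/{device_name} "
                      f"{metric_name} = {metric['value']} {metric['unit']}")
```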
@Dubrzr: and how about this command:

```
nvidia-smi --query-gpu=name,fan.speed,utilization.gpu --format=csv,nounits
```

I see that you have a newer version of the NVIDIA driver (the newest version that we've tested is 418.116), ...
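In case it helps to see how that CSV output gets consumed, here is a minimal parsing sketch (not the actual TensorHive monitor; the `[N/A]` handling in particular is an assumption about GPUs that do not report fan speed):

```python
import subprocess

QUERY = ["nvidia-smi",
         "--query-gpu=name,fan.speed,utilization.gpu",
         "--format=csv,nounits,noheader"]

def parse_value(raw):
    """nvidia-smi prints '[N/A]' for fields a GPU does not expose (e.g. fan speed)."""
    raw = raw.strip()
    return None if raw == "[N/A]" else raw

output = subprocess.check_output(QUERY, encoding="utf-8")
for line in output.strip().splitlines():
    name, fan_speed, gpu_util = (parse_value(v) for v in line.split(", "))
    print(f"{name}: fan={fan_speed}, util={gpu_util}%")
```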
Everything looks fine here... Could you try modifying line 73 in tensorhive/core/managers/TensorHiveManager.py and set:

```python
monitors = []
```

and see if it helps?
@Dubrzr do you have any new observations or hints? If the data were missing for gpu3, we would at least have an idea that the differing Fan speed "[N/A]" notation...
I suppose there should be a cache in the backend and an API parameter defining how many recent states should be returned...
If so, I'll leave it open with nice-to-have label. Maybe one day someone picks it up
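Roughly what I have in mind, as a sketch only (the class name, parameter name, and endpoint below are placeholders, not the actual TensorHive API):

```python
from collections import deque

class RecentStatesCache:
    """Keep the N most recent infrastructure snapshots in memory."""

    def __init__(self, max_states=100):
        self._states = deque(maxlen=max_states)  # oldest entries drop off automatically

    def push(self, state):
        """Called by the monitoring loop after each polling cycle."""
        self._states.append(state)

    def recent(self, count=1):
        """What a hypothetical GET /infrastructure?recent=<count> could return."""
        return list(self._states)[-count:]
```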
Thanks for the report! TensorHive reads GPU utilization from nvidia-smi, so the inconsistency may be related to the polling frequency. TensorHive itself does not differentiate between processes when monitoring GPU utilization...
> tensorhive:
>
> > Average GPU utilization: 87%
> > Average GPU memory utilization: 18%
> > Start: Thursday, March 19th, 12:00
> > End: Friday, March 20th, 15:00

...
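For reference, an average like the one quoted above would presumably be derived from the periodic utilization samples collected between the start and end timestamps. A simple mean over the samples in that window (an illustration of the idea with made-up numbers, not TensorHive's reporting code):

```python
from datetime import datetime

def average_utilization(samples, start, end):
    """samples: list of (timestamp, gpu_util_percent) collected by the polling loop."""
    in_window = [util for ts, util in samples if start <= ts <= end]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)

# Example with invented samples:
samples = [(datetime(2020, 3, 19, 12, 0), 90.0),
           (datetime(2020, 3, 19, 18, 0), 85.0),
           (datetime(2020, 3, 20, 15, 0), 86.0)]
print(average_utilization(samples,
                          datetime(2020, 3, 19, 12, 0),
                          datetime(2020, 3, 20, 15, 0)))  # -> 87.0
```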