tacc_stats
TACC Stats is an automated resource-usage monitoring and analysis package.
Candidate library (NVIDIA DCGM): https://docs.nvidia.com/datacenter/dcgm/latest/user-guide/feature-overview.html
Find and integrate a library to collect data from AMD GPUs.
Find and integrate a library to collect data from Intel GPUs.
This includes: moving the install to use a system user; no crontabs, only systemd; better documentation about the install process and requirements.
We would like to find a framework that supports LDAP and also provides hooks for SSO at other sites.
For our systems, each node type has a maximum available memory limit. We should be able to plot this limit as a line on the Memory Usage plot, where the value is dependent...
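As a rough illustration of the memory-limit line described above, the following minimal sketch assumes the limit is looked up per node type. The table `MEMORY_LIMIT_GIB`, its node type names and values, and the function `memory_limit_line` are all hypothetical placeholders, not part of tacc_stats.

```python
# Hypothetical per-node-type memory limits in GiB. The names and values
# are illustrative placeholders, not real system specifications.
MEMORY_LIMIT_GIB = {
    "standard": 192.0,
    "large-memory": 2048.0,
}


def memory_limit_line(node_type, timestamps):
    """Return (xs, ys) endpoints of a horizontal limit line for a plot.

    The line spans from the earliest to the latest timestamp at the
    height of the node type's memory limit.
    """
    limit = MEMORY_LIMIT_GIB[node_type]  # raises KeyError for unknown types
    if not timestamps:
        return [], []
    xs = [min(timestamps), max(timestamps)]
    return xs, [limit, limit]
```

The two returned points could then be handed to whichever plotting library draws the Memory Usage plot, so the limit appears as a flat line across the full time range.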