node-exporter-textfile-collector-scripts
Add NVME device collector
In comparison to the existing nvme_metrics.sh collector, this one:
- Adds firmware, serial, model and sector_size labels.
- Adds useful TYPE and HELP messages for every series.
- Exposes nearly all data returned by nvme smartlog as series.
- Runs in a fraction of the time.
I needed more data from nvme_metrics.sh and ended up rewriting it.
Redacted example output:
# HELP nvme_temperature_celsius Composite temperature of the controller and namespaces associated with controller.
# TYPE nvme_temperature_celsius gauge
nvme_temperature_celsius{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 32
nvme_temperature_celsius{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 32
nvme_temperature_celsius{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 32
nvme_temperature_celsius{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 31
nvme_temperature_celsius{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 32
nvme_temperature_celsius{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 30
nvme_temperature_celsius{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 34
nvme_temperature_celsius{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 42
nvme_temperature_celsius{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 32
nvme_temperature_celsius{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 32
# HELP nvme_critical_warning_value Controller state warnings - each bit is a critical warning type and multiple bits may be set.
# TYPE nvme_critical_warning_value gauge
nvme_critical_warning_value{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_warning_value{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
# HELP nvme_avail_spare_percent Contains a normalized percentage (0.0 to 1.0) of the remaining spare capacity available.
# TYPE nvme_avail_spare_percent gauge
nvme_avail_spare_percent{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
nvme_avail_spare_percent{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1.0
# HELP nvme_spare_thresh_percent When avail_spare_percent falls below this threshold, an asynchronous event completion may occur.
# TYPE nvme_spare_thresh_percent gauge
nvme_spare_thresh_percent{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
nvme_spare_thresh_percent{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0.1
# HELP nvme_percent_used Vendor specific estimate of the drive's life, based on the actual usage and the vendor's prediction. A value of 100 indicates that the estimated endurance the drive has been consumed, but may not indicate a failure. The value is allowed to exceed 100. Percentages greater than 254 shall be represented as 255. This value shall be updated once per power-on hour (when the controller is not in a sleep state.
# TYPE nvme_percent_used gauge
nvme_percent_used{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_percent_used{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
# HELP nvme_data_units_read Number of 512 byte data units the host has read from the controller; this value does not include metadata. This value is reported in thousands (i.e. a value of 1 corresponds to 1000 units of 512 bytes read) and is rounded up. When the LBA size is a value other than 512 bytes, the controller shall convert the amount of data read to 512 byte units.
# TYPE nvme_data_units_read gauge
nvme_data_units_read{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1242904
nvme_data_units_read{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1252920
nvme_data_units_read{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1234340
nvme_data_units_read{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1227895
nvme_data_units_read{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1233909
nvme_data_units_read{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1222966
nvme_data_units_read{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1247890
nvme_data_units_read{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 571324
nvme_data_units_read{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1226829
nvme_data_units_read{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1223495
# HELP nvme_data_units_written Number of 512 byte data units the host has written to the controller; this value does not include metadata. This value is reported in thousands (i.e. a value of 1 corresponds to 1000 units of 512 bytes written) and is rounded up. When the LBA size is a value other than 512 bytes, the controller shall convert the amount of data written to 512 byte units.
# TYPE nvme_data_units_written gauge
nvme_data_units_written{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1870289
nvme_data_units_written{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1880142
nvme_data_units_written{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1860283
nvme_data_units_written{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1854802
nvme_data_units_written{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1851538
nvme_data_units_written{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1850677
nvme_data_units_written{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1875069
nvme_data_units_written{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 698592
nvme_data_units_written{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1853931
nvme_data_units_written{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 1849712
# HELP nvme_host_read_commands Number of read commands completed by the controller.
# TYPE nvme_host_read_commands gauge
nvme_host_read_commands{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 145283568
nvme_host_read_commands{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 146520682
nvme_host_read_commands{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 144227407
nvme_host_read_commands{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 143388170
nvme_host_read_commands{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 144134520
nvme_host_read_commands{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 142784296
nvme_host_read_commands{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 145895288
nvme_host_read_commands{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 65895804
nvme_host_read_commands{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 143277589
nvme_host_read_commands{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 142860648
# HELP nvme_host_write_commands N
# TYPE nvme_host_write_commands u
nvme_host_write_commands{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 145901528
nvme_host_write_commands{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 147139532
nvme_host_write_commands{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 144830383
nvme_host_write_commands{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 143984454
nvme_host_write_commands{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 144697530
nvme_host_write_commands{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 143401689
nvme_host_write_commands{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 146505881
nvme_host_write_commands{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 66023594
nvme_host_write_commands{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 143883237
nvme_host_write_commands{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 143457722
# HELP nvme_controller_busy_time_minutes Time in minutes the controller is busy with I/O commands. The controller is busy when there is a command outstanding to an I/O queue.
# TYPE nvme_controller_busy_time_minutes gauge
nvme_controller_busy_time_minutes{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
nvme_controller_busy_time_minutes{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 55
nvme_controller_busy_time_minutes{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
nvme_controller_busy_time_minutes{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
nvme_controller_busy_time_minutes{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
nvme_controller_busy_time_minutes{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
nvme_controller_busy_time_minutes{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 55
nvme_controller_busy_time_minutes{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 59
nvme_controller_busy_time_minutes{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
nvme_controller_busy_time_minutes{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 54
# HELP nvme_power_cycles_count Number of power cycles.
# TYPE nvme_power_cycles_count counter
nvme_power_cycles_count{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 9
nvme_power_cycles_count{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 9
nvme_power_cycles_count{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
nvme_power_cycles_count{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 8
# HELP nvme_power_on_hours Number of power-on hours. This may not include time that the controller was powered and in a non-operational power state.
# TYPE nvme_power_on_hours counter
nvme_power_on_hours{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
nvme_power_on_hours{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 96
# HELP nvme_unsafe_shutdowns Number of unsafe shutdowns. This count is incremented when a shutdown notification (CC.SHN) is not received prior to loss of power.
# TYPE nvme_unsafe_shutdowns counter
nvme_unsafe_shutdowns{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 5
nvme_unsafe_shutdowns{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
nvme_unsafe_shutdowns{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
# HELP nvme_media_errors Number of occurrences where the controller detected an unrecoverable data integrity error. Errors such as uncorrectable ECC, CRC checksum failure, or LBA tag mismatch are included in this field.",
# TYPE nvme_media_errors counter
nvme_media_errors{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_media_errors{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
# HELP nvme_num_err_log_entries Number of Error Information log entries over the life of the controller
# TYPE nvme_num_err_log_entries counter
nvme_num_err_log_entries{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_num_err_log_entries{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3
nvme_num_err_log_entries{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 4
# HELP nvme_warning_temp_time Time in minutes that the controller is operational and the temperature_celsius field is greater than or equal to the Warning Composite Temperature Threshold (WCTEMP) field and less than the Critical Composite Temperature Threshold (CCTEMP) field.
# TYPE nvme_warning_temp_time gauge
nvme_warning_temp_time{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_warning_temp_time{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
# HELP nvme_critical_comp_time Time in minutes that the controller is operational and the temperature_celsius field is greater the Critical Composite Temperature Threshold (CCTEMP) field.
# TYPE nvme_critical_comp_time gauge
nvme_critical_comp_time{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
nvme_critical_comp_time{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 0
# HELP nvme_physical_size_bytes Drive size in bytes
# TYPE nvme_physical_size_bytes gauge
nvme_physical_size_bytes{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 256060514304
nvme_physical_size_bytes{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
nvme_physical_size_bytes{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 3840755982336
# HELP nvme_used_bytes Used space in bytes
# TYPE nvme_used_bytes gauge
nvme_used_bytes{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 855803842560
nvme_used_bytes{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 859996622848
nvme_used_bytes{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 851925848064
nvme_used_bytes{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 849301798912
nvme_used_bytes{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 847833223168
nvme_used_bytes{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 847884361728
nvme_used_bytes{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 857825640448
nvme_used_bytes{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 256060514304
nvme_used_bytes{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 848857726976
nvme_used_bytes{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 847196110848
# HELP nvme_maximum_lba_count Maximum number of Logical Block Units
# TYPE nvme_maximum_lba_count gauge
nvme_maximum_lba_count{device="/dev/nvme0n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme1n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme2n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme3n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme4n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme5n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme6n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme7n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 500118192
nvme_maximum_lba_count{device="/dev/nvme8n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
nvme_maximum_lba_count{device="/dev/nvme9n1",firmware="XXX",serial="XXX",model="XXX",sector_size="512"} 7501476528
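The sample output above reports normalized values: spare capacity as 0.0 to 1.0, temperature in Celsius, and data units in thousands of 512-byte blocks. The following is only a minimal sketch of how such renaming and normalizing might be done with a lookup table of lambdas. It is not the script from this PR. The raw key names, the Kelvin input unit, and the helper names `normalize` and `data_units_to_bytes` are assumptions made for illustration.

```python
# Hypothetical mapping: raw smart-log key -> (exported metric name, converter).
# The raw key names and their units are assumptions, not taken from the PR.
NORMALIZERS = {
    # Assumes the raw temperature is an integer in Kelvin.
    "temperature": ("nvme_temperature_celsius", lambda kelvin: kelvin - 273),
    # Assumes raw spare values are whole percentages (0 to 100).
    "avail_spare": ("nvme_avail_spare_percent", lambda pct: pct / 100.0),
    "spare_thresh": ("nvme_spare_thresh_percent", lambda pct: pct / 100.0),
    # Passed through unchanged.
    "data_units_read": ("nvme_data_units_read", lambda units: units),
}


def normalize(raw):
    """Rename and convert the raw fields that have an entry in NORMALIZERS."""
    result = {}
    for key, value in raw.items():
        if key in NORMALIZERS:
            name, convert = NORMALIZERS[key]
            result[name] = convert(value)
    return result


def data_units_to_bytes(units):
    """Per the HELP text, one data unit is 1000 blocks of 512 bytes."""
    return units * 1000 * 512


if __name__ == "__main__":
    print(normalize({"temperature": 305, "avail_spare": 100, "spare_thresh": 10}))
    print(data_units_to_bytes(1242904))
```

Keeping the conversions in one table means a new field needs only one new entry, and the byte conversion makes the "thousands of 512-byte units" rule from the HELP text explicit.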
@gvalkov Thank you. That's a nice collector. I created a similar script based on another nvme exporter. I have added some more normalized metrics (e.g. the individual warnings extracted from the critical_warnings bit field): https://github.com/teamapps-org/ansible-collection-teamapps-general/blob/main/roles/nvme_metrics/files/nvme_metrics.py
Your coding style is much better. Nice solution for renaming and normalizing with the lambda.
I also designed a dashboard for my metrics:
https://github.com/teamapps-org/ansible-collection-teamapps-general/blob/main/roles/grafana/files/dashboards/default/nvme.json
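To illustrate the idea of extracting individual warnings from the critical warning bit field mentioned above, here is a minimal sketch. It is not taken from either script. The bit positions follow the NVMe SMART / Health Information log as I understand it, and the short names and the `decode_critical_warning` helper are hypothetical.

```python
# Bit position -> short warning name. The names are illustrative only;
# the bit meanings follow the NVMe SMART / Health Information log page.
CRITICAL_WARNING_BITS = {
    0: "available_spare",                # spare capacity below threshold
    1: "temperature",                    # temperature outside thresholds
    2: "reliability_degraded",           # NVM subsystem reliability degraded
    3: "read_only",                      # media placed in read-only mode
    4: "volatile_memory_backup_failed",  # volatile memory backup device failed
}


def decode_critical_warning(value):
    """Split the bit field into one 0/1 flag per warning type."""
    return {
        name: (value >> bit) & 1
        for bit, name in CRITICAL_WARNING_BITS.items()
    }


if __name__ == "__main__":
    # Bits 1 and 3 set: temperature and read-only warnings.
    print(decode_critical_warning(0b01010))
```

Exporting each flag as its own series lets an alert rule match one warning type directly instead of doing bit arithmetic in the query.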

firmware and sector_size should not be labels, because they can change over time. It is better to put them into an _info metric.
Recently the node_exporter gained a node_nvme_info metric that provides the serial, model, and firmware revision: https://github.com/prometheus/node_exporter/issues/1891
@gvalkov Are you still interested in having this new collector merged? If so, please respond to the outstanding questions in this PR.
Thank you for your contribution; however, this script has a few shortcomings:
- We strongly prefer that Python scripts use the official Prometheus client_python library, rather than implement their own metrics print functionality.
- Including all the device info labels in every metric as you have done is an anti-pattern. If just one of those labels changes (e.g. after a firmware upgrade), this will cause massive unnecessary churn in the database.
- Including large amounts of text from the NVMe specification as help text is probably not going to help the average user.
- Many of the metrics violate the Prometheus naming best practices.
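As a rough sketch of what the points above could look like in practice, the following uses the prometheus_client (client_python) library. It keeps only a device label on the value series, moves the descriptive attributes into a separate _info series, and uses short help strings. The metric names, the sample values, and the `sample` dictionary are placeholders chosen for illustration. This is not the collector that was eventually merged.

```python
from prometheus_client import (
    CollectorRegistry,
    Counter,
    Gauge,
    Info,
    generate_latest,
)

# A dedicated registry so that only the NVMe series are exported.
registry = CollectorRegistry()

# Descriptive attributes live in one *_info series per device.
device_info = Info(
    "nvme_device", "Static NVMe device attributes.", ["device"], registry=registry
)
temperature = Gauge(
    "nvme_temperature_celsius",
    "Composite temperature in degrees Celsius.",
    ["device"],
    registry=registry,
)
# The client appends the _total suffix to counters when exposing them.
power_cycles = Counter(
    "nvme_power_cycles", "Number of power cycles.", ["device"], registry=registry
)

# Placeholder values standing in for data read from a real device.
sample = {
    "device": "nvme0",
    "firmware": "XXX",
    "serial": "XXX",
    "model": "XXX",
    "temperature_celsius": 32,
    "power_cycles": 8,
}

device_info.labels(sample["device"]).info(
    {key: sample[key] for key in ("firmware", "serial", "model")}
)
temperature.labels(sample["device"]).set(sample["temperature_celsius"])
# The registry is created fresh on every run, so inc() sets the absolute value.
power_cycles.labels(sample["device"]).inc(sample["power_cycles"])

output = generate_latest(registry).decode()
print(output)
```

With this layout a firmware upgrade changes only the single nvme_device_info series rather than every metric for that device. For a textfile collector, the library's write_to_textfile function can write the same registry to a .prom file instead of printing it.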
Closing in favour of #155.