
Add a DEVICE_LOCATOR attribute for more readable physical paths

Open wak-google opened this issue 8 years ago • 4 comments

On rhesus we added a DEVICE_LOCATOR XML attribute to optionally provide more descriptive location strings for components of the platform. It would be nice to have a similar string attribute in open-power hostboot where we could store this data.

<attribute>
    <id>DEVICE_LOCATOR</id>
    <description>Board-specific ID/locator name of a target</description>
    <simpleType>
        <string>
            <sizeInclNull>128</sizeInclNull>
        </string>
    </simpleType>
    <persistency>non-volatile</persistency>
    <readable/>
</attribute>

Example PHYS_PATH to DEVICE_LOCATOR mappings:

physical:sys-0/node-0/proc-1/mcs-5 -> /phys/rhesus/CPU0/murano/die1/mcs5
physical:sys-0/node-0/membuf-2 -> /phys/rhesus/HAMMOCK3_CPU0/hammock/centaur
physical:sys-0/node-0/membuf-2/mba-0 -> /phys/rhesus/HAMMOCK3_CPU0/hammock/centaur/mba0
physical:sys-0/node-0/dimm-21 -> /phys/rhesus/HAMMOCK3_CPU0/hammock/DIMM1
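
To illustrate how such a free-form attribute might be consumed, here is a minimal Python sketch, not hostboot code. It treats the four example mappings above as a lookup table and falls back to the physical path when no locator is set, since the attribute is optional. The names `DEVICE_LOCATORS` and `locator_for` are hypothetical; the 128-byte limit comes from `sizeInclNull` in the attribute definition.

```python
# Size limit from the attribute definition: 128 bytes including the NUL terminator.
MAX_LOCATOR_SIZE_INCL_NULL = 128

# The four example mappings from this issue (PHYS_PATH -> DEVICE_LOCATOR).
DEVICE_LOCATORS = {
    "physical:sys-0/node-0/proc-1/mcs-5": "/phys/rhesus/CPU0/murano/die1/mcs5",
    "physical:sys-0/node-0/membuf-2": "/phys/rhesus/HAMMOCK3_CPU0/hammock/centaur",
    "physical:sys-0/node-0/membuf-2/mba-0": "/phys/rhesus/HAMMOCK3_CPU0/hammock/centaur/mba0",
    "physical:sys-0/node-0/dimm-21": "/phys/rhesus/HAMMOCK3_CPU0/hammock/DIMM1",
}


def locator_for(phys_path, table=DEVICE_LOCATORS):
    """Return the locator for a target, or the physical path if none is set.

    Hypothetical helper: the fallback models the attribute being optional.
    """
    locator = table.get(phys_path)
    if locator is None:
        return phys_path
    # The string plus its NUL terminator must fit in the attribute.
    if len(locator.encode("ascii")) + 1 > MAX_LOCATOR_SIZE_INCL_NULL:
        raise ValueError("locator too long for DEVICE_LOCATOR: %r" % locator)
    return locator
```

A target without an entry, such as `physical:sys-0/node-0/proc-0`, simply comes back as its physical path.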

wak-google, Feb 16 '17 23:02

"William A. Kennington III" writes:

> On rhesus we added a DEVICE_LOCATOR xml attribute to optionally provide more descriptive location strings for components of the platform. It would be nice if we had a similar string in open-power hostboot where we could store this data.

Is the end goal to have error messages that map a specific part back to a specific location?

We partly have that already in OPAL/Linux, with location codes for some things (like PCI slots/devices) stored as entries in the device tree.

One thing we run into a bit is that the device tree describes the devices in the system from a software perspective, while the physical layout may be quite different.

I'd like to have a more cohesive approach to this for POWER9, so that we can give sensible error messages back relating to any part, and indeed have a standard approach across all OpenPOWER systems.

The reason I mention device-tree is that we'd want these available to OPAL and Linux too, including userspace.

-- Stewart Smith OPAL Architect, IBM.

ghost, Feb 17 '17 00:02

There are a couple of different aspects to this request to consider.

  1. Every vendor seems to have their own idea about how to identify parts. IBM has location codes, we have sensor/entity IDs for IPMI-managed machines, and Linux/OPAL like devtree-ish strings. Things will get out of hand quickly if we try to satisfy everyone uniquely.

  2. At the nuts-and-bolts level I'm curious how the device locator string would get specified. Would we do it algorithmically based on existing data (plus a few new specific name attributes)? Would the system owner just specify it directly for every target?

  3. How do we even define what parts deserve a locator? Anything a user can physically remove (a FRU)? Anything with a unique presence detection? The example above mentions mba, but that isn't really a physical part either. Hostboot doesn't even know about a bunch of the FRUs on the system, so we'd probably need to add new target types, which is another slippery slope...
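
The two options in point 2 can be combined. Below is a minimal Python sketch, assuming hypothetical per-type name templates (`NAME_TEMPLATES`) and a hypothetical table of owner-supplied strings (`OVERRIDES`); none of these names exist in hostboot. The generated names are illustrative only and would not reproduce the rhesus scheme, where for example `proc-1` maps to `CPU0`.

```python
import re

# Hypothetical per-type templates; target types not listed here (sys, node)
# are skipped when building the locator.
NAME_TEMPLATES = {
    "proc": "CPU{}",
    "mcs": "mcs{}",
    "membuf": "membuf{}",
    "mba": "mba{}",
    "dimm": "DIMM{}",
}

# Hypothetical owner-supplied strings that take priority over generated names.
OVERRIDES = {
    "physical:sys-0/node-0/dimm-21": "/phys/rhesus/HAMMOCK3_CPU0/hammock/DIMM1",
}


def parse_phys_path(path):
    """Split 'physical:sys-0/node-0/...' into (type, index) pairs."""
    prefix, _, rest = path.partition(":")
    if prefix != "physical" or not rest:
        raise ValueError("not a physical path: %r" % path)
    segments = []
    for seg in rest.split("/"):
        match = re.fullmatch(r"([a-z_]+)-(\d+)", seg)
        if match is None:
            raise ValueError("bad path segment: %r" % seg)
        segments.append((match.group(1), int(match.group(2))))
    return segments


def derive_locator(path, root="/phys"):
    """Use the owner's string if present, else build one from the templates."""
    if path in OVERRIDES:
        return OVERRIDES[path]
    parts = [
        NAME_TEMPLATES[kind].format(index)
        for kind, index in parse_phys_path(path)
        if kind in NAME_TEMPLATES
    ]
    return "/".join([root] + parts)
```

With these assumed templates, `derive_locator("physical:sys-0/node-0/proc-1/mcs-5")` yields `/phys/CPU1/mcs5`, while the DIMM path returns the owner's string unchanged.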

I do agree that we're probably overdue to solve this inside the OP space though. The simplest solution would just be an attribute like you described on every target, holding free-form text for the system owner to fill in. @stewart-ibm - we might be able to squeeze the values into HDAT so OPAL gets them, since there are location codes all over it. We need to be careful though, since we need to work with other payloads/OSes too that might expect a traditional location code...

dcrowell77, Feb 17 '17 04:02

Yes, in the past it was to give devices, particularly removable ones, an easier mapping to physical parts. We applied it to more than just physically removable devices: every device entry was given a DEVICE_LOCATOR based on the physical device it was part of. For us this was meant as a free-form field, but we did have an internally consistent naming scheme for the parts on rhesus. We also used this field from userspace to identify components in the device tree.

wak-google, Feb 23 '17 23:02

This feels extremely similar to a set of late patches to P8 (missing from P9 at the moment). It seems like we should definitely do something along these lines. This is the top-level patch (there are others below it): https://github.com/open-power/hostboot/commit/3707a3c6562bb54f43099f15836b79ea13659aa7

sannerd, Mar 22 '17 20:03