
Add support for little-endian byte order representation.

Open sunsingerus opened this issue 3 years ago • 7 comments

Although RFC4122 recommends network byte order for all fields, the PC industry (including the ACPI, UEFI, SMBIOS and Microsoft specifications) has consistently used little-endian byte encoding for the first three fields: time_low, time_mid, time_hi_and_version.

Example from the SMBIOS spec, section 7.2.1 "System - UUID": the UUID {00112233-4455-6677-8899-AABBCCDDEEFF} would thus be represented as:

33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF

https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf

sunsingerus avatar Feb 06 '21 20:02 sunsingerus

So this means the version cannot be determined since they moved the version field. Sounds like Microsoft. This isn't the first standard that they have messed up for everyone else.

I am not sure how the unmarshaling will get called in this change. I could see a function that swaps between a compliant UUID and a non-compliant one. To make this work with any of the encoding packages that use the Unmarshaler interface, I think you would need a whole new type that modifies both the marshal/unmarshal methods and the byte-parsing functions.

This will deserve some more thought and discussion. It would also deserve some tests :-)

pborman avatar Feb 06 '21 21:02 pborman

So this means the version cannot be determined since they moved the version field.

Yes, as far as I understand the situation, there is no way to determine what byte order is used in the UUID binary representation. One "must know in advance" what byte order is used for encoding.
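To make the point about the version field concrete, here is a minimal sketch using only the standard library. The helper names versionRFC4122 and versionLittleEndian are made up for illustration and are not part of the uuid package. It reads the version nibble from the SMBIOS example bytes under both assumptions:

```go
package main

import "fmt"

// versionRFC4122 reads the version from the high nibble of byte 6, which is
// where it sits when time_hi_and_version is stored in network byte order.
func versionRFC4122(b [16]byte) byte { return b[6] >> 4 }

// versionLittleEndian reads the version from the high nibble of byte 7, which
// is where it sits when time_hi_and_version is stored little-endian.
func versionLittleEndian(b [16]byte) byte { return b[7] >> 4 }

func main() {
	// SMBIOS-style encoding of 00112233-4455-6677-8899-AABBCCDDEEFF.
	le := [16]byte{0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x66,
		0x88, 0x99, 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF}

	fmt.Println("read as RFC 4122:", versionRFC4122(le))           // 7
	fmt.Println("read as little-endian:", versionLittleEndian(le)) // 6
}
```

The same 16 bytes give two different answers, and nothing in the bytes themselves says which reading is the right one.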

I am not sure how the unmarshaling will get called int this change.

Let me explain the situation as I see it. The UUID with a string representation of, say, "{00112233-4455-6677-8899-aabbccddeeff}" can be represented differently in wire/physical format:

As bytes 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff using big-endian encoding (as RFC4122 recommends).
As bytes 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff using little-endian encoding (as specified in the ACPI spec, the SMBIOS spec, or any other spec that uses little-endian encoding).

We must know in advance which byte order is used and call the appropriate FromBytes() function.

Let's take a look at an example. Imagine a device produced by a hardware vendor. Each vendor has its own UUID, embedded in the devices it produces. Let's say we'd like to print the "human-readable name" of the vendor. We have a buf with raw data fetched from the device, and the UUID is located in bytes buf[0x10:0x20]. So we'd like to be able to do:

var vendorX = uuid.MustParse("00112233-4455-6677-8899-aabbccddeeff")
vendorUUID, err := uuid.FromBytes(buf[0x10:0x20])
if err != nil {
  // handle the error
}
if vendorUUID == vendorX {
  fmt.Println("This device is produced by Vendor X")
}

In our case the UUID inside buf is represented in little-endian encoding, while uuid.FromBytes() expects big-endian encoding, so the result of uuid.FromBytes() would be a UUID with the string representation "{33221100-5544-7766-8899-aabbccddeeff}". The check would therefore fail because of the physical byte positions.

Let's introduce another function, FromBytesLittleEndian() which will treat bytes as little-endian encoded.

var vendorX = uuid.MustParse("00112233-4455-6677-8899-aabbccddeeff")
vendorUUID, err := uuid.FromBytesLittleEndian(buf[0x10:0x20])
if err != nil {
  // handle the error
}
if vendorUUID == vendorX {
  fmt.Println("This device is produced by Vendor X")
}

and the whole piece of code will run correctly.
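For anyone who wants to try the idea without the PR applied, here is a rough standalone sketch. It assumes a local stand-in type UUID instead of uuid.UUID, and the lower-case fromBytes / fromBytesLittleEndian helpers are illustrative only, not the functions from the PR:

```go
package main

import (
	"errors"
	"fmt"
)

// UUID is a local stand-in for uuid.UUID, holding bytes in RFC 4122 order.
type UUID [16]byte

// fromBytes copies 16 bytes that are already in RFC 4122 (network) order.
func fromBytes(b []byte) (UUID, error) {
	var u UUID
	if len(b) != 16 {
		return u, errors.New("invalid UUID length")
	}
	copy(u[:], b)
	return u, nil
}

// fromBytesLittleEndian accepts bytes whose first three fields are
// little-endian and returns the UUID in RFC 4122 order.
func fromBytesLittleEndian(b []byte) (UUID, error) {
	u, err := fromBytes(b)
	if err != nil {
		return u, err
	}
	u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0] // time_low
	u[4], u[5] = u[5], u[4]                         // time_mid
	u[6], u[7] = u[7], u[6]                         // time_hi_and_version
	return u, nil
}

func main() {
	vendorX := UUID{0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}

	// Fake device buffer with the little-endian UUID at offset 0x10.
	buf := make([]byte, 0x20)
	copy(buf[0x10:], []byte{0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x66,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff})

	naive, _ := fromBytes(buf[0x10:0x20])
	fixed, _ := fromBytesLittleEndian(buf[0x10:0x20])
	fmt.Println(naive == vendorX) // false
	fmt.Println(fixed == vendorX) // true
}
```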

I could see a function that swaps between a compliant UUID and a non-compliant one.

Yes, I've introduced a top-level function

func FromBytesLittleEndian(b []byte) (uuid UUID, err error)

which is inspired by

func FromBytes(b []byte) (uuid UUID, err error)

and appropriate marshaling functions:

func (uuid UUID) MarshalBinaryLittleEndian() ([]byte, error)
func (uuid *UUID) UnmarshalBinaryLittleEndian(data []byte) error

which are inspired by

func (uuid UUID) MarshalBinary() ([]byte, error)
func (uuid *UUID) UnmarshalBinary(data []byte) error

respectively. These new functions understand little-endian byte encoding and how to Marshal/Unmarshal binary data.

To make this work with any of encoding packages that use the Unmarshaler interface I think you would need a whole new type that modifies both the marshal/unmarshal methods as well as parsing bytes.

Can you please elaborate on this item a little? I am not sure what exactly is meant here. Is it really so complicated? And how do the already existing functions, such as FromBytes(), MarshalBinary() and UnmarshalBinary(), fit into this picture?

Also, I am not sure about a whole new type, because it is still the same UUID with the same logic/idea. The only difference is the byte order in the UUID's binary representation. We also need to be able to directly compare UUIDs read from sources with different byte orders, like this:

uuidX, err := uuid.FromBytes(bufX[0x10:0x20])
if err != nil {
  // handle the error
}
uuidY, err := uuid.FromBytesLittleEndian(bufY[0x10:0x20])
if err != nil {
  // handle the error
}
if uuidX == uuidY {
  fmt.Println("UUIDs are equal")
}

This will deserve some more thought and discussion.

Yes, sure, that's why I am opening the discussion with an initial PR for little-endian encoding support. As an alternative approach, I can suggest adding an optional parameter to the FromBytes() function that specifies the encoding, something like:

func FromBytes(b []byte, order ...ByteOrder) (uuid UUID, err error)

Inspired by the encoding/binary package. If no order is provided, assume big-endian encoding. Otherwise, use the explicitly specified encoding. What do you think?
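A rough sketch of how that optional parameter could behave, using only the standard library. The ByteOrder type, its BigEndian / LittleEndian constants, and the local UUID stand-in are all hypothetical names for this illustration; none of them exist in the uuid package:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// UUID is a local stand-in for uuid.UUID (bytes in RFC 4122 order).
type UUID [16]byte

// ByteOrder is a hypothetical option type; it is not part of any real package.
type ByteOrder int

const (
	BigEndian    ByteOrder = iota // RFC 4122 network order (the default)
	LittleEndian                  // first three fields little-endian
)

// FromBytes decodes b, assuming big-endian unless an order is supplied.
func FromBytes(b []byte, order ...ByteOrder) (UUID, error) {
	var u UUID
	if len(b) != 16 {
		return u, errors.New("invalid UUID length")
	}
	var src binary.ByteOrder = binary.BigEndian
	if len(order) > 0 && order[0] == LittleEndian {
		src = binary.LittleEndian
	}
	// Read each of the first three fields in the source order and store it
	// in network order; the last 8 bytes are copied through unchanged.
	binary.BigEndian.PutUint32(u[0:4], src.Uint32(b[0:4]))
	binary.BigEndian.PutUint16(u[4:6], src.Uint16(b[4:6]))
	binary.BigEndian.PutUint16(u[6:8], src.Uint16(b[6:8]))
	copy(u[8:], b[8:])
	return u, nil
}

func main() {
	be := []byte{0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}
	le := []byte{0x33, 0x22, 0x11, 0x00, 0x55, 0x44, 0x77, 0x66,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff}

	x, _ := FromBytes(be)               // default: big-endian
	y, _ := FromBytes(le, LittleEndian) // explicit order
	fmt.Println(x == y)                 // true
}
```

Existing callers of FromBytes(b) would keep compiling, since the variadic argument is optional.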

It would also deserve some tests :-)

Yep :-)

sunsingerus avatar Feb 07 '21 18:02 sunsingerus

These really are two different types because they have different binary representations and they are not comparable to each other without converting from one to the other. Also, encoders such as encoding/json use the interfaces described in the encoding package to know how to encode/decode values. Since these are interfaces, the methods must have exactly those names. The only way to get JSON to unmarshal an SMBIOS UUID (as opposed to a regular UUID) is to have a new type. It does not need all the functions provided by the standard UUID package and can easily be written in terms of the regular uuid package:

package leuuid

import "github.com/google/uuid"

type LEUUID [16]byte

func FromUUID(u uuid.UUID) LEUUID {
        u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
        u[4], u[5] = u[5], u[4]
        u[6], u[7] = u[7], u[6]
        return LEUUID(u)
}

func Parse(s string) (LEUUID, error) {
        u, err := uuid.Parse(s)
        return FromUUID(u), err
}

func (u LEUUID) UUID() uuid.UUID {
        // The swap is its own inverse, so FromUUID also converts back.
        return uuid.UUID(FromUUID(uuid.UUID(u)))
}

func (u LEUUID) MarshalText() ([]byte, error) {
        return u.UUID().MarshalText()
}

...
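To illustrate why the method names matter, here is a self-contained sketch using only the standard library. It assumes a local LEUUID stand-in with simplified hex parsing instead of the uuid package, and the record struct and marshalRecord helper are hypothetical. encoding/json finds MarshalText and UnmarshalText through the encoding.TextMarshaler and encoding.TextUnmarshaler interfaces, so the canonical string is what appears in the JSON while the bytes in memory stay little-endian:

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// LEUUID is a local stand-in that stores its first three fields little-endian.
type LEUUID [16]byte

// swap converts between RFC 4122 and little-endian field order.
func swap(u [16]byte) [16]byte {
	u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
	u[4], u[5] = u[5], u[4]
	u[6], u[7] = u[7], u[6]
	return u
}

// MarshalText implements encoding.TextMarshaler.
func (u LEUUID) MarshalText() ([]byte, error) {
	b := swap(u)
	s := fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
	return []byte(s), nil
}

// UnmarshalText implements encoding.TextUnmarshaler. Simplified: it only
// strips dashes and does not validate their positions.
func (u *LEUUID) UnmarshalText(text []byte) error {
	raw, err := hex.DecodeString(strings.ReplaceAll(string(text), "-", ""))
	if err != nil {
		return err
	}
	if len(raw) != 16 {
		return errors.New("invalid UUID length")
	}
	var b [16]byte
	copy(b[:], raw)
	*u = swap(b)
	return nil
}

// record is a hypothetical structure carrying a little-endian UUID.
type record struct {
	ID LEUUID `json:"id"`
}

// marshalRecord encodes a record holding id as JSON.
func marshalRecord(id LEUUID) (string, error) {
	out, err := json.Marshal(record{ID: id})
	return string(out), err
}

func main() {
	var r record
	in := `{"id":"00112233-4455-6677-8899-aabbccddeeff"}`
	if err := json.Unmarshal([]byte(in), &r); err != nil {
		fmt.Println("unmarshal:", err)
		return
	}
	fmt.Printf("% x\n", r.ID[:4]) // 33 22 11 00
	out, _ := marshalRecord(r.ID)
	fmt.Println(out == in) // true
}
```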

pborman avatar Feb 08 '21 15:02 pborman

I wrote up a working example of what I suggested above, it is included below.

Would this concept work for you? In your structures that are serialized to the DSP0134 format, you would use the dsp0134.UUID type rather than the standard uuid.UUID type.

// Package dsp0134 is a wrapper around github.com/google/uuid.  It supports the
// non-standard UUID format described in the DMTF System Management BIOS
// (SMBIOS) Reference Specification document DSP0134.
// https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.4.0.pdf
// The DSP0134 standard reorders the first 8 bytes of the UUID.  The standard
// RFC4122 encoding for the UUID "00112233-4455-6677-8899-AABBCCDDEEFF" is
//	00 11 22 33 44 55 66 77 88 99 AA BB CC DD EE FF
// The encoding specified in DSP0134 is:
//	33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
package dsp0134

import "github.com/google/uuid"

// A UUID is a 128 bit (16 byte) Universal Unique IDentifier as defined by
// DSP0134.
//
// Most methods from github.com/google/uuid can be used by invoking the
// UUID method:
//
//	var u UUID
//	return u.UUID().Version()
type UUID [16]byte

// swapUUID returns u after converting it to/from RFC4122 from/to DSP0134
// ordering.
func swapUUID(u [16]byte) [16]byte {
        u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
        u[4], u[5] = u[5], u[4]
        u[6], u[7] = u[7], u[6]
        return u
}

// swapInPlace is like swapUUID but swaps u in place.
func swapInPlace(u []byte) {
        u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
        u[4], u[5] = u[5], u[4]
        u[6], u[7] = u[7], u[6]
}

// ToUUID returns u as a uuid.UUID.
func ToUUID(u UUID) uuid.UUID {
        return swapUUID(u)
}

// FromUUID returns u as a UUID.
func FromUUID(u uuid.UUID) UUID {
        return swapUUID(u)
}

// Parse is analogous to github.com/google/uuid.Parse.
func Parse(s string) (UUID, error) {
        u, err := uuid.Parse(s)
        return swapUUID(u), err
}

// FromBytes is analogous to github.com/google/uuid.FromBytes.
func FromBytes(b []byte) (UUID, error) {
        u, err := uuid.FromBytes(b)
        return swapUUID(u), err
}

// ParseBytes is analogous to github.com/google/uuid.ParseBytes.
func ParseBytes(b []byte) (UUID, error) {
        u, err := uuid.ParseBytes(b)
        return FromUUID(u), err
}

// UUID returns u as a github.com/google/uuid.UUID.
func (u UUID) UUID() uuid.UUID {
        return ToUUID(u)
}

// MarshalText implements encoding.TextMarshaler.
func (u UUID) MarshalText() ([]byte, error) {
        return uuid.UUID(swapUUID(u)).MarshalText()
}

// UnmarshalText implements encoding.TextUnmarshaler.
func (u *UUID) UnmarshalText(data []byte) error {
        if err := (*uuid.UUID)(u).UnmarshalText(data); err != nil {
                return err
        }
        swapInPlace(u[:])
        return nil
}

// MarshalBinary implements encoding.BinaryMarshaler
func (u UUID) MarshalBinary() ([]byte, error) {
        return uuid.UUID(u).MarshalBinary()
}

// UnmarshalBinary implements encoding.BinaryUnmarshaler
func (u *UUID) UnmarshalBinary(data []byte) error {
        return (*uuid.UUID)(u).UnmarshalBinary(data)
}
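A note on why ToUUID and FromUUID can share one helper: the swap is its own inverse, so applying it twice gives back the original bytes. A small standalone check, with swapUUID copied from the listing above and nothing else assumed:

```go
package main

import "fmt"

// swapUUID reverses the byte order of the first three UUID fields.
func swapUUID(u [16]byte) [16]byte {
	u[0], u[1], u[2], u[3] = u[3], u[2], u[1], u[0]
	u[4], u[5] = u[5], u[4]
	u[6], u[7] = u[7], u[6]
	return u
}

func main() {
	rfc := [16]byte{0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xAA, 0xBB, 0xCC, 0xDD, 0xEE, 0xFF}

	dsp := swapUUID(rfc)
	fmt.Printf("% X\n", dsp[:])       // 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
	fmt.Println(swapUUID(dsp) == rfc) // true
}
```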

pborman avatar Feb 08 '21 21:02 pborman

Hey, I made an example package: github.com/pborman/dsp0134. Will this package work for you? If so, I am thinking we can make it a sub-package of the uuid package.

pborman avatar Feb 09 '21 16:02 pborman

Thanks a lot for the detailed explanation. I understand now that the initial approach was incorrect. A little-endian UUID has to comply with the interfaces from the encoding package, so it has to be a separate type. The dsp0134 package looks good to me at first glance, but I need to try it.

Will this package work for you?

Thank you very much for the package, I'll try tomorrow, sorry for the delay.

sunsingerus avatar Feb 09 '21 23:02 sunsingerus

Did this work? I am wondering if it is worth making into a supported package or not. I am thinking a sub package of this UUID package.

pborman avatar Feb 16 '21 00:02 pborman