gouuid

Unit test - more strict format string

Open beatgammit opened this issue 12 years ago • 2 comments

Changes:

  • hex digit character class
  • stricter RFC 4122 compliance for the variant (see Wikipedia):

The variant covered by the UUID specification is indicated by the two most significant bits of N being 1 0 (i.e., the hexadecimal N will always be 8, 9, A, or B).

This depends on #6 to be merged in order to pass.

beatgammit, May 30 '13 00:05

This happens to fix #2 by accident. Let me know if you'd like me to rebase.

beatgammit, May 30 '13 00:05

Simple counterexample for v4 UUIDs (generated by your uuid package):

c73d8f33-f7f8-4b84-626c-022ea05dd742

Code:

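package main

import (
    "fmt"

    // The import path ends in "gouuid", but the package is named uuid.
    uuid "github.com/nu7hatch/gouuid"
)

func main() {
    // Generate a random (version 4) UUID.
    uid, err := uuid.NewV4()
    if err != nil {
        panic(err)
    }
    // Print the canonical 8-4-4-4-12 hex form.
    fmt.Println(uid.String())
}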

From Wikipedia:

Version 4 UUIDs have the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, a, or b

As you can see, the high nibble of byte 8 (the first hex digit of the fourth group) in the result above is 6, which is not one of 8, 9, a or b. Wikipedia is consistent with the spec:

The following table lists the contents of the variant field, where
the letter "x" indicates a "don't-care" value.

Msb0  Msb1  Msb2  Description
 1     0     x    The variant specified in this document.

So, a compliant byte 8 looks like this (x is a placeholder): 10xxxxxx

So, the top 4 bits would be one of:

  • 1000: 8
  • 1001: 9
  • 1010: a
  • 1011: b
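The same check can be written as a single mask comparison. This is only a sketch: isRFC4122Variant is a hypothetical helper, not part of gouuid, and 0x62 is byte 8 of the counterexample above.

```go
package main

import "fmt"

// isRFC4122Variant reports whether the two most significant bits of b
// are 1 0, i.e. whether b has the form 10xxxxxx.
// Hypothetical helper for illustration; gouuid does not export it.
func isRFC4122Variant(b byte) bool {
	return b&0xC0 == 0x80
}

func main() {
	// 0x62 is byte 8 of c73d8f33-f7f8-4b84-626c-022ea05dd742.
	for _, b := range []byte{0x62, 0x80, 0x9f, 0xa0, 0xbf, 0xc0} {
		fmt.Printf("%#x -> %t\n", b, isRFC4122Variant(b))
	}
}
```

Only 0x80 through 0xbf pass, which matches the four high nibbles listed above.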

The existing code does this:

case ReservedRFC4122:
    u[8] = (u[8] | ReservedRFC4122) & 0x7F

This sets the second msb and then does a bitwise AND with 0b01111111 (0x7F). Together, these guarantee that the msb is never set and the second msb is always set. Effectively, the code will produce a byte 8 like: 01xxxxxx, not 10xxxxxx, so the top 4 bits will always be one of:

  • 0100: 4
  • 0101: 5
  • 0110: 6
  • 0111: 7

If you run the code a few times, you should be able to see the pattern.
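Rather than sampling random UUIDs, the pattern can be seen by running the quoted expression over every possible byte. A rough sketch, assuming ReservedRFC4122 is 0x40 (which is what "set the second msb" implies); existingVariant and highNibbles are hypothetical names used only here.

```go
package main

import "fmt"

// Assumed value of gouuid's ReservedRFC4122 constant (second msb set).
const reservedRFC4122 byte = 0x40

// existingVariant mirrors the expression quoted from the existing code.
func existingVariant(b byte) byte {
	return (b | reservedRFC4122) & 0x7F
}

// highNibbles returns the distinct high nibbles that f produces over
// all 256 input bytes, in order of first appearance.
func highNibbles(f func(byte) byte) []byte {
	seen := make(map[byte]bool)
	var out []byte
	for i := 0; i < 256; i++ {
		n := f(byte(i)) >> 4
		if !seen[n] {
			seen[n] = true
			out = append(out, n)
		}
	}
	return out
}

func main() {
	fmt.Println(highNibbles(existingVariant)) // [4 5 6 7]
}
```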

I fixed the regex in the test file, and that may make things a bit more clear. The old regex was a bit too open in what it accepted.
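For illustration, a strict v4 pattern along these lines might look like the following. This is an assumption about the general shape, not necessarily the exact regex in the PR; strictV4 is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
)

// strictV4 requires lowercase hex digits, a literal 4 as the version
// digit, and one of 8, 9, a, b as the variant digit.
// Illustrative pattern only; the regex in the PR may differ.
var strictV4 = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)

func main() {
	// The counterexample from this thread: variant digit is 6.
	fmt.Println(strictV4.MatchString("c73d8f33-f7f8-4b84-626c-022ea05dd742")) // false
	// Same UUID with the variant digit changed to a.
	fmt.Println(strictV4.MatchString("c73d8f33-f7f8-4b84-a26c-022ea05dd742")) // true
}
```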

If you want me to revise my PR to exclude kisielk's commits, I can do that. I wasn't sure if you abandoned this repo since there hasn't been a commit for a year or so, so I went ahead and pulled in his changes.

I suppose you could also fix the code by doing something like this (for RFC4122):

u[8] = (u[8] | 0x80) & 0xBF

I just did a literal translation of the spec, hence the relatively clumsy syntax. I'd be happy to throw together another PR if you want the change.
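A quick exhaustive check of that expression, as a sketch: proposedVariant is a hypothetical wrapper around the one-liner above, and the loop confirms that every input byte ends up as 10xxxxxx with its low six bits untouched.

```go
package main

import "fmt"

// proposedVariant applies the suggested mask: force the msb on (| 0x80),
// then force the second msb off (& 0xBF).
func proposedVariant(b byte) byte {
	return (b | 0x80) & 0xBF
}

func main() {
	for i := 0; i < 256; i++ {
		in := byte(i)
		out := proposedVariant(in)
		if out&0xC0 != 0x80 {
			panic(fmt.Sprintf("%#x: variant bits are not 10", in))
		}
		if out&0x3F != in&0x3F {
			panic(fmt.Sprintf("%#x: low six bits changed", in))
		}
	}
	// Byte 8 of the counterexample becomes a compliant value.
	fmt.Printf("%#x -> %#x\n", byte(0x62), proposedVariant(0x62)) // 0x62 -> 0xa2
}
```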

beatgammit, May 31 '13 05:05