Unit test - more strict format string
Changes:
- hex digit character class
- stricter RFC 4122 compliance for the variant (see Wikipedia):
The variant covered by the UUID specification is indicated by the two most significant bits of N being 1 0 (i.e., the hexadecimal N will always be 8, 9, A, or B).
This depends on #6 being merged in order to pass.
This happens to fix #2 by accident. Let me know if you'd like me to rebase.
Simple counterexample for v4 UUIDs (generated from your uuid package):
c73d8f33-f7f8-4b84-626c-022ea05dd742
Code:
    package main

    import (
        "fmt"

        "github.com/nu7hatch/gouuid"
    )

    func main() {
        uid, err := uuid.NewV4()
        if err != nil {
            panic(err)
        }
        fmt.Println(uid.String())
    }
From Wikipedia:
Version 4 UUIDs have the form xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx where x is any hexadecimal digit and y is one of 8, 9, a, or b
As you can see, the first hex digit of byte 8 (the y position) in the result above is 6, which is not one of 8, 9, a or b. Wikipedia is consistent with the spec:
The following table lists the contents of the variant field, where
the letter "x" indicates a "don't-care" value.

Msb0  Msb1  Msb2  Description
1     0     x     The variant specified in this document.
So, a compliant byte 8 looks like this (x is placeholder): 10xxxxxx
So, the top 4 bits would be one of:
1000: 8
1001: 9
1010: a
1011: b
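To make the check concrete, here is a minimal sketch that pulls byte 8 out of the counterexample string and tests its two most significant bits. The helper names variantByte and isRFC4122Variant are hypothetical and not part of the gouuid package; the sketch also assumes the canonical 36-character string form.

```go
package main

import (
	"fmt"
	"strconv"
)

// variantByte extracts byte 8 (the first two hex digits of the fourth group)
// from a canonical 36-character UUID string. Hypothetical helper.
func variantByte(s string) (byte, error) {
	if len(s) != 36 {
		return 0, fmt.Errorf("unexpected UUID length %d", len(s))
	}
	v, err := strconv.ParseUint(s[19:21], 16, 8)
	if err != nil {
		return 0, err
	}
	return byte(v), nil
}

// isRFC4122Variant reports whether the two most significant bits are 1 0.
func isRFC4122Variant(b byte) bool {
	return b&0xC0 == 0x80
}

func main() {
	b, err := variantByte("c73d8f33-f7f8-4b84-626c-022ea05dd742")
	if err != nil {
		panic(err)
	}
	// prints "byte 8 = 01100010, RFC 4122 variant: false"
	fmt.Printf("byte 8 = %08b, RFC 4122 variant: %v\n", b, isRFC4122Variant(b))
}
```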
The existing code does this:
    case ReservedRFC4122:
        u[8] = (u[8] | ReservedRFC4122) & 0x7F
This will set the second msb and then do a bitwise AND with 0x7F (0b01111111). The OR guarantees that the second msb is always set, and the mask guarantees that the msb is never set. Effectively, the code will produce a byte 8 like 01xxxxxx, not 10xxxxxx, so the top 4 bits will always be one of:
0100: 4
0101: 5
0110: 6
0111: 7
If you run the code a few times, you should be able to see the pattern.
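Rather than sampling random UUIDs, the pattern can be shown exhaustively by running the expression over every possible byte value. This is only a sketch: it assumes ReservedRFC4122 is 0x40 (which is what "set the second msb" implies), and the names reservedRFC4122, oldVariant and reachableNibbles are made up for the illustration.

```go
package main

import "fmt"

// reservedRFC4122 is ASSUMED to be 0x40 (the second most significant bit);
// check the package source for the actual constant.
const reservedRFC4122 = 0x40

// oldVariant mirrors the existing expression for byte 8.
func oldVariant(b byte) byte {
	return (b | reservedRFC4122) & 0x7F
}

// reachableNibbles returns, in ascending order, every top hex digit that f
// can produce across all 256 input bytes.
func reachableNibbles(f func(byte) byte) string {
	var seen [16]bool
	for i := 0; i < 256; i++ {
		seen[f(byte(i))>>4] = true
	}
	out := ""
	for n, ok := range seen {
		if ok {
			out += fmt.Sprintf("%x", n)
		}
	}
	return out
}

func main() {
	// prints "4567"
	fmt.Println(reachableNibbles(oldVariant))
}
```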
I fixed the regex in the test file, and that may make things a bit more clear. The old regex was a bit too open in what it accepted.
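For reference, a strict v4 pattern along the lines of the Wikipedia form could look like the sketch below. This is not necessarily the exact expression in the test file; the variable name strictV4 is hypothetical, and the pattern assumes lowercase hex output.

```go
package main

import (
	"fmt"
	"regexp"
)

// strictV4 requires hex digits only, a literal 4 in the version position,
// and one of 8, 9, a, b in the variant position. Hypothetical name and pattern.
var strictV4 = regexp.MustCompile(
	`^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)

func main() {
	// The counterexample from above: variant digit is 6, so it is rejected.
	fmt.Println(strictV4.MatchString("c73d8f33-f7f8-4b84-626c-022ea05dd742")) // false
	// Same value with the variant digit changed to a.
	fmt.Println(strictV4.MatchString("c73d8f33-f7f8-4b84-a26c-022ea05dd742")) // true
}
```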
If you want me to revise my PR to exclude kisielk's commits, I can do that. I wasn't sure if you abandoned this repo since there hasn't been a commit for a year or so, so I went ahead and pulled in his changes.
I suppose you could also fix the code by doing something like this (for RFC4122):
u[8] = (u[8] | 0x80) & 0xBF
I just did a literal translation of the spec, hence the relatively clumsy syntax. I'd be happy to throw together another PR if you want the change.
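A quick sanity check of that expression, as a sketch: for every possible input byte, the result should have its top two bits equal to 1 0 and its lower six bits unchanged. The function name fixedVariant is hypothetical.

```go
package main

import "fmt"

// fixedVariant applies the suggested mask: set the msb, clear the second msb.
func fixedVariant(b byte) byte {
	return (b | 0x80) & 0xBF
}

func main() {
	for i := 0; i < 256; i++ {
		in := byte(i)
		out := fixedVariant(in)
		// Top two bits must be 1 0; the remaining six bits must be preserved.
		if out&0xC0 != 0x80 || out&0x3F != in&0x3F {
			panic(fmt.Sprintf("unexpected result %08b for input %08b", out, in))
		}
	}
	// The counterexample's byte 8: prints "01100010 -> 10100010"
	fmt.Printf("%08b -> %08b\n", 0x62, fixedVariant(0x62))
}
```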