tink icon indicating copy to clipboard operation
tink copied to clipboard

Vault Regexp does not allow for dashes (or leading underscore).

Open OriPekelman opened this issue 3 years ago • 1 comments

Help us help you

Describe the bug Using Tink with Hashicorp Vault (hcvault) would fail if the vault host name contains dashes.

To Reproduce

Given

VAULT_ADDR= hcvault://cuddly-fish-61.example.com

Expected behavior

Client works

Error messages, stack traces, etc.

keyset.Handle: encrypted failed: malformed keyURL

The problem is at https://github.com/google/tink/blob/dca70f855d483d4350e222e2754db2f9e63c9ecb/go/integration/hcvault/hcvault_aead.go#L118

with:

var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:]+)/(.*)$", vaultPrefix))

Which is too restrictive and does not allow for dash correcting it to:

var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:-]+)/(.*)$", vaultPrefix))

Should solve the issue. And for extra points in case someone wants to use an RFC compliant host name with a leading underscore...

var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/ ?_*([a-zA-Z0-9.:-]+)/(.*)$", vaultPrefix))

Should cover the case.

Version information

  • Language: Go
  • Version: go1.16.6 linux/amd64
  • Environment: Linux

Additional comment

I am not sure it is a good idea to have a custom scheme here ... but at any rate it would probably be better to import net/url and use that for validation rather than to rely on an ad-hoc regex. I'll be happy to open a PR for either option (just correcting the Regex, or changing the implementation to use net/url.

OriPekelman avatar Jul 19 '21 14:07 OriPekelman

Patch:

--- a/go/integration/hcvault/hcvault_aead.go
+++ b/go/integration/hcvault/hcvault_aead.go
@@ -115,7 +115,7 @@ func (a *vaultAEAD) getDecryptionPath(keyURL string) (string, error) {
 	return strings.Join(parts, "/"), nil
 }
 
-var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:]+)/(.*)$", vaultPrefix))
+var vaultKeyRegex = regexp.MustCompile(fmt.Sprintf("^%s/*([a-zA-Z0-9.:-]+)/(.*)$", vaultPrefix))
 
 func (a *vaultAEAD) extractKey(keyURL string) (string, error) {
 	m := vaultKeyRegex.FindAllStringSubmatch(keyURL, -1)

theory avatar Sep 19 '22 21:09 theory

Took some time to implement three different options, including tests. Which one should we use? I'd be happy to submit a PR with the consensus choice.

import (
	"errors"
	"net/url"
	"regexp"
	"testing"
)

const (
	vaultScheme = "hcvault"
	vaultPrefix = vaultScheme + "://"
)

// extractKeyURI uses the URI package to extract the path.
func extractKeyURI(keyURL string) (string, error) {
	u, err := url.Parse(keyURL)
	if err != nil || u.Scheme != "hcvault" || len(u.Path) == 0 {
		return "", errors.New("malformed keyURL")
	}
	return u.EscapedPath()[1:], nil
}

// extractKeyRegex uses a comprehensive host regular expression to extract the
// path. Regex from https://www.regextester.com/23, as found in
// https://stackoverflow.com/a/56351327/79202.
var vaultURLRegex = regexp.MustCompile("^" + vaultPrefix + `(?:(?:(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9])\.)*(?:[A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]))?/(.*)$`)

func extractKeyRegex(keyURL string) (string, error) {
	m := vaultURLRegex.FindAllStringSubmatch(keyURL, -1)
	if m == nil {
		return "", errors.New("malformed keyURL")
	}
	return m[0][1], nil
}

// extractKeyDumbRegex uses the simplest possible host regex to extract the key,
// based on the assumption that the host name is validated elsewhere (e.g., when
// the client actually makes a call to the vault service).
var vaultKeyDumbRegex = regexp.MustCompile("^" + vaultPrefix + `[^/]*/(.*)$`)

func extractKeyDumbRegex(keyURL string) (string, error) {
	m := vaultKeyDumbRegex.FindAllStringSubmatch(keyURL, -1)
	if m == nil {
		return "", errors.New("malformed keyURL")
	}
	return m[0][1], nil
}

func TestExtractKey(t *testing.T) {
	for _, tc := range []struct {
		desc string
		uri  string
		key  string
		err  string
	}{
		{
			desc: "simple",
			uri:  "hcvault://vault.example.com/hi",
			key:  "hi",
		},
		{
			desc: "path",
			uri:  "hcvault://vault.example.com/foo/bar/baz",
			key:  "foo/bar/baz",
		},
		{
			desc: "hyphen host",
			uri:  "hcvault://vault-prd.example.com/coyote",
			key:  "coyote",
		},
		{
			desc: "empty string",
			uri:  "hcvault://example.com/",
			key:  "",
		},
		{
			desc: "escaped",
			uri:  "hcvault://vault.example.com/this%2Band+that",
			key:  "this%2Band+that",
		},
		{
			desc: "no host",
			uri:  "hcvault:///hi",
			key:  "hi",
		},
		{
			desc: "http",
			uri:  "http://vault.com/hi",
			err:  "malformed keyURL",
		},
		{
			desc: "no path",
			uri:  "hcvault://vault.com",
			err:  "malformed keyURL",
		},
		{
			desc: "slash only",
			uri:  "hcvault://vault.com/",
			key:  "",
		},
		{
			desc: "",
			uri:  "hcvault://vault.com/",
			key:  "",
		},
	} {
		for name, code := range map[string]func(keyURL string) (string, error){
			"url parse":  extractKeyURI,
			"regex":      extractKeyRegex,
			"dumb regex": extractKeyDumbRegex,
		} {
			t.Run(name+"/"+tc.desc, func(t *testing.T) {
				key, err := code(tc.uri)
				if err == nil {
					if tc.err != "" {
						t.Fatalf("Missing error, want=%s", tc.err)
					}
				} else {
					if tc.err != err.Error() {
						t.Fatalf("Incorrect error, want=%s;got=%s", tc.err, err)
					}
				}
				if key != tc.key {
					t.Fatalf("Incorrect key, want=%s;got=%s", tc.key, key)
				}
			})
		}
	}
}

theory avatar Oct 07 '22 19:10 theory

I think it would be best to use "net/url" instead of a regex. We already use net/url in hcvault_client.go anyways.

juergw avatar Jan 31 '23 09:01 juergw

Could you update the pull request? Thanks.

juergw avatar Jan 31 '23 09:01 juergw

I submitted the regex check because a full URI parse seems redundant, since the URL will actually be used in an HTTP request and either be accepted by vault or not. You sure you want that instead? If so I will update the PR, yes.

theory avatar Jan 31 '23 16:01 theory

On top of being easier to read/maintain, using net/url seems to be a bit more performant. So we should definitely go with it.

$ go test -bench=.
goos: darwin
goarch: arm64
pkg: urlparsetest
BenchmarkExtractKeyURI-10          	 5756019	       188.7 ns/op
BenchmarkExtractKeyRegex-10        	 1845735	       649.7 ns/op
BenchmarkExtractKeyDumbRegex-10    	 2879150	       417.1 ns/op
PASS
ok  	urlparsetest	5.165s

This output is from adding the following benchmark functions alongside your code provided above:

const testURI = "hcvault://vault.example.com/hi"

func BenchmarkExtractKeyURI(b *testing.B) {
	for i := 0; i <= b.N; i++ {
		extractKeyURI(testURI)
	}
}

func BenchmarkExtractKeyRegex(b *testing.B) {
	for i := 0; i <= b.N; i++ {
		extractKeyRegex(testURI)
	}
}

func BenchmarkExtractKeyDumbRegex(b *testing.B) {
	for i := 0; i <= b.N; i++ {
		extractKeyDumbRegex(testURI)
	}
}

chuckx avatar Feb 01 '23 18:02 chuckx

Works for me, updated in 52515c2.

theory avatar Feb 07 '23 16:02 theory

PR submitted in #640.

theory avatar Feb 08 '23 00:02 theory