protobuf icon indicating copy to clipboard operation
protobuf copied to clipboard

testing/protocmp: cmp.EquateApproxTime equivalent for timestamppb.Timestamp

Open glenjamin opened this issue 3 years ago • 17 comments

Is your feature request related to a problem? Please describe. I've got a timestamppb.Timestamp field in my protobuf, and I'd like to compare it using something similar to https://pkg.go.dev/github.com/google/[email protected]/cmp/cmpopts#EquateApproxTime

Describe the solution you'd like Either an option to specify custom converters for protocmp types (so Timestamp would convert to time.Time), or a way to specify Approximate Time equality when applying the standard protocmp.Transform, or something else which achieves the same goal

Describe alternatives you've considered I've currently got a custom cmp.Option which applies a Filter when @type matches then a Transform, but it's quite awkward because I have to inspect the protocmp.Message map to find matching google.protobuf.Timestamp instances.

var cmpProtoWithTimes = cmp.Options{
	cmpopts.EquateApproxTime(time.Second),
	protocmp.Transform(),
	cmp.FilterPath(
		func(p cmp.Path) bool {
			if p.Last().Type() == reflect.TypeOf(protocmp.Message{}) {
				a, b := p.Last().Values()
				return msgIsTimestamp(a) && msgIsTimestamp(b)
			}
			return false
		},
		cmp.Transformer("timestamppb", func(t protocmp.Message) time.Time {
			return time.Unix(t["seconds"].(int64), int64(t["nanos"].(int32))).UTC()
		}),
	),
}

func msgIsTimestamp(x reflect.Value) bool {
	return x.Interface().(protocmp.Message).Descriptor().FullName() == "google.protobuf.Timestamp"
}

Additional context n/a

glenjamin avatar Aug 04 '21 20:08 glenjamin

Seems reasonable, but adding new API is a decision for the current maintainers of this module.

dsnet avatar Aug 04 '21 20:08 dsnet

Thanks - I'm happy to submit an appropriate PR if I get a rough steer on the direction and whether the maintainers would accept it.

glenjamin avatar Aug 04 '21 20:08 glenjamin

The general idea seems reasonable. I'm not sure what the right API is.

My first thought is that the best approach is one which allows easily transforming arbitrary message types (e.g., timestamppb.Timestamp) into other types (time.Time).

protocmp.MessageTransformer(func(p *timestamppb.Timestamp) time.Time {
  return p.AsTime()
})

But I have not thought through the details of what that would require.

neild avatar Aug 05 '21 00:08 neild

I think that sounds sensible - I expect it would need to be an option to Transform()? To be a standalone option it would need to transform the protocmp.Message back into the original type

glenjamin avatar Aug 05 '21 07:08 glenjamin

I think you're right and it would need to be an option to protocmp.Transform(), although @dsnet might know of an alternative approach that I'm not seeing.

neild avatar Aug 05 '21 17:08 neild

While we ensured flexibility of possibly adding options specifically for protocmp.Transform, I'm not sure if it will make it more or less confusing for user that some options are for cmp.Equal and others for protocmp.Transform.

Furthermore, the line between whether something should be a cmp.Option or an option for protocmp.Transform is somewhat blurred. For example, most of the existing options in protocmp could arguably have been an option on protocmp.Transform instead.

If we don't take the approach of making it be an option to protocmp.Transform, I imagine protocmp.MessageTransformer could be implemented as:

func [M proto.Message, V any] TransformMessage(f func(M) V) cmp.Option {
    var m M
    md := m.ProtoReflect().Descriptor()
    return cmp.FilterValues(func(mx, my Message) bool {
        return mx.Descriptor() == md && my.Descriptor() == md
    }, cmp.Transformer(m1 Message) V {
        var m2 M
        proto.Merge(m2, m1)
        return f(m2)
    })
}

Notes:

  • We write a filter to capture all protocmp.Message types that have a descriptor that matches the expected message type. Next, we write a transformer that converts the generic protocmp.Message into a concrete protobuf message type, which we then use to call the user provided f function. This isn't the most efficient as we will have effectively switch from a concrete type to protocmp.Message, back to a concrete type, and then to something else.
  • Since generics don't exist, the above would be implemented using Go reflection.
  • I'd call this TransformMessage since we seem to use verbs in protocmp.

With the API above, transforming timestamppb.Timestamp to a time.Time can be done with:

protocmp.TransformMessage((*timestamppb.Timestamp).AsTime)

It relies on using a lesser-known feature of Go called "method values".

dsnet avatar Aug 05 '21 18:08 dsnet

Actually, the above could probably slightly simplified as:

func [M proto.Message, V any] TransformMessage(f func(M) V) cmp.Option {
    var m M
    md := m.ProtoReflect().Descriptor()
    return FilterMessage(md, cmp.Transformer(m1 Message) V {
        var m2 M
        proto.Merge(m2, m1)
        return f(m2)
    })
}

by leveraging the existing protocmp.FilterMessage function.

dsnet avatar Aug 05 '21 18:08 dsnet

As an aside, we could probably add a protocmp.Message.Unwrap method that returns the original concrete message value. That would avoid a conversion.

EDIT(2021-08-06): Sent https://go-review.googlesource.com/c/protobuf/+/340489 to add protocmp.Message.Unwrap.

dsnet avatar Aug 05 '21 18:08 dsnet

I had a few minutes spare so I cobbled together something that basically works: https://github.com/glenjamin/protobuf-go/pull/1

I'll come back and add input type validation, expand the tests then submit via Gerrit when I get some time.

If someone wants to take over in the mean time then don't let me stop you.

glenjamin avatar Aug 08 '21 10:08 glenjamin

I'd be happy to review, but you'd need to submit it through Gerrit first.

If you're not aware, there's a contribution guide at: https://go.googlesource.com/protobuf/+/refs/heads/master/CONTRIBUTING.md

The protobuf module uses a very similar process as the Go project itself.

dsnet avatar Aug 08 '21 22:08 dsnet

Generics make this a bit more palatable to solve with a custom transformer. First, we create the transformer factory:

package prototest

// Transformer returns a cmp.Option that applies a transformation function that
// converts proto.Message of a certain type into another value.
func Transformer[M proto.Message, R any](name string, f func(M) R) cmp.Option {
	ft := reflect.TypeOf(f)
	msg := reflect.New(ft.In(0).Elem()).Interface().(proto.Message)
	pbName := msg.ProtoReflect().Descriptor().FullName()
	return cmp.FilterPath(
		func(p cmp.Path) bool {
			if p.Last().Type() != reflect.TypeOf(protocmp.Message{}) {
				return false
			}
			a, b := p.Last().Values()
			if a.Kind() == reflect.Invalid || b.Kind() == reflect.Invalid {
				return false
			}
			aMatch := a.Interface().(protocmp.Message).Descriptor().FullName() == pbName
			bMatch := b.Interface().(protocmp.Message).Descriptor().FullName() == pbName
			return aMatch && bMatch
		},
		cmp.Transformer(name, func(cmpMsg protocmp.Message) R {
			m := cmpMsg.Unwrap().(M)
			return f(m)
		}),
	)
}

My use-case was removing trailing zeros from a decimal number like 3.14000 to 3.14 but this technique also applies to EquateApproxTime.

// To use:
		prototest.Transformer("decimalpb_float64", func(d *decpb.Decimal) string {
			if d == nil || len(d.Value) == 0 {
				return ""
			}

			dotIdx := strings.LastIndexByte(d.Value, '.')
			if dotIdx == -1 || d.Value[len(d.Value)-1] != '0' {
				return d.Value
			}

			for i := len(d.Value) - 1; i >= dotIdx; i-- {
				if d.Value[i] != '0' {
					return d.Value[:i+1]
				}
			}
			return d.Value
		}),

One downside is the diff message starts to get pretty hairy:

    cmp_test.go:375: mismatch (-want +got)
          (*decimal.Decimal)(Inverse(protocmp.Transform, protocmp.Message(Inverse(decimalpb_float64, string(
        -       "12345.01",
        +       "12345.",
          )))))

jschaf avatar Nov 20 '22 08:11 jschaf

It seems weird here that you’re using both generics and periphrastic reflection:

func Transformer[M proto.Message, R any](name string, f func(M) R) cmp.Option {
	ft := reflect.TypeOf(f)
	msg := reflect.New(ft.In(0).Elem()).Interface().(proto.Message)
	pbName := msg.ProtoReflect().Descriptor().FullName()
	…
}

Instead you can just do this:

func Transform[M proto.Message, O any](f func(M) O) string {
	var msg M
	pbName := msg.ProtoReflect().Descriptor().FullName()
	…
}

And trimming zeros off a string has builtins:

func(d *decpb.Decimal) string {
	if d == nil || len(d.Value) == 0 {
		return ""
	}

	if strings.ContainsRune(d.Value, '.') {
		return strings.TrimRight(d.Value, "0")
	}

	return d.Value
}

puellanivis avatar Nov 20 '22 11:11 puellanivis

Instead you can just do this:

Neat, I forgot you can declare variables with the parameterized type. I also cleaned up the decimal zero code but I'll omit it here since it's not especially relevant to the issue. I'll post my revised code since it's perhaps useful but I think the bigger blocker is deciding how clients should invoke custom transforms, either through protocmp.Transform(...) or through a separate function call.

After some thought, I needed a comparer, not a transformer. Instead of trying to make a generic Transfomer, I made a filterPath wrapper that takes in a plain cmp.Option. I'm sure it's possible to create generic, type-safe Transformer, but I don't need that flexibility here.

// filterPath returns a cmp.Option that applies to every proto.Message matching
// fullName.
func filterPath(fullName protoreflect.FullName, opt cmp.Option) cmp.Option {
	return cmp.FilterPath(
		func(p cmp.Path) bool {
			if p.Last().Type() != reflect.TypeOf(protocmp.Message{}) {
				return false
			}
			a, b := p.Last().Values()
			if a.Kind() == reflect.Invalid || b.Kind() == reflect.Invalid {
				return false
			}
			aMatch := a.Interface().(protocmp.Message).Descriptor().FullName() == fullName
			bMatch := b.Interface().(protocmp.Message).Descriptor().FullName() == fullName
			return aMatch && bMatch
		},
		opt,
	)
}

// newComparer returns a cmp.Option that compares instances of the proto.Message
// M with f.
func newComparer[M proto.Message](f func(M, M) bool) cmp.Option {
	var msg M
	return filterPath(
		msg.ProtoReflect().Descriptor().FullName(),
		cmp.Comparer(func(a, b protocmp.Message) bool {
			return f(a.Unwrap().(M), b.Unwrap().(M))
		}))
}


func EquateDecimal(fraction, margin float64) cmp.Option {
	return newComparer(func(d1, d2 *decpb.Decimal) bool { /* omitted */})
}

jschaf avatar Nov 21 '22 00:11 jschaf

I finally got around to submitting this code via Gerrit - https://go-review.googlesource.com/c/protobuf/+/453215

glenjamin avatar Nov 23 '22 18:11 glenjamin

I wanted this feature too so opened a change before seeing the comment above https://github.com/golang/protobuf/issues/1347#issuecomment-1325501388.

https://go-review.googlesource.com/c/protobuf/+/453355

The implementation is pretty similar except some additional checks and tests, and using a descriptor fullname as a transformer name. I can rebase the changes after the one above gets merged. It looks like there's ongoing discussion in gerrit so I'll comment there too.

edit: fixed link

yuryu avatar Nov 25 '22 00:11 yuryu

⟨link to internal google code⟩

I don’t have a Google corp account anymore. 😆

puellanivis avatar Nov 25 '22 09:11 puellanivis

Oops, I didn't notice that. Sorry. It somehow replaced the domain automatically. https://go-review.googlesource.com/c/protobuf/+/453355 is the correct link (fixed the comment above too).

yuryu avatar Nov 25 '22 09:11 yuryu