gopter icon indicating copy to clipboard operation
gopter copied to clipboard

Shrinking interface confusion - how to write a custom shrinker?

Open ashb opened this issue 8 years ago • 8 comments

I'm just getting started with gopter (also being fairly new to property based testing in general) and I was trying to write a custom generator for a struct type of mine with private fields/initalizers.

A simplified example:


type Struct struct {
    name    string
    version string
}

func NewStruct(n, v string) *Struct {
    return &Struct{n, v}
}

func (s Struct) String() string {
    return s.name //+ ":" + s.version
}

func GenStruct() gopter.Gen {
    return gopter.CombineGens(
        gen.Identifier(),
        gen.NumString().SuchThat(nonZeroLen),
    ).Map(func(values []interface{}) Struct {
        name := values[0].(string)
        ver := values[1].(string)
        return NewStruct(name, ver)
    })
}

This works but as documented Map looses the shrinker from CombineGens. (Okay, slightly surprising but documented.)

My attempt at writing a version that adds a Shrinker ran into some problems - it seemed to loop and not shrink as well (or almost at all) as the CombineGen case. I was trying this:

func GenStructShrink() gopter.Gen {
    combines := gopter.CombineGens(
        gen.Identifier().WithLabel("name"),
        gen.NumString().SuchThat(nonZeroLen).WithLabel("ver"),
    )
    return func(genParams *gopter.GenParameters) *gopter.GenResult {
        input := combines(genParams)
        values, ok := input.Retrieve()

        var s *Struct
        if !ok {
            return gopter.NewEmptyResult(reflect.TypeOf(s))
        }
        mapper := func(strings []interface{}) *Struct {
            name := strings[0].(string)
            ver := strings[1].(string)
            return NewStruct(name, ver)
        }
        s = mapper(values.([]interface{}))

        myShrinker := func(value interface{}) gopter.Shrink {
            fmt.Printf("Creating chrinker for %#v\n", value)
            return input.Shrinker(values).Map(mapper)
        }

        return gopter.NewGenResult(s, myShrinker)
    }
}

I would have expected it to create the shrinker once and call it again, but it seems to create a new version of the shrinker many times which means it resets the shrink (because I'm ignoring the input value to try and get back at the underlying generators.

I guess this is a long way of saying: what is the exact interface of the Shrinkers? What is the Shrinker called with? When is it called? How often?

ashb avatar Jun 14 '16 10:06 ashb

Okay so slightly hacky (and i'm not sure that using the map as I have is either safe; or a good idea but this works and I get the same shrink result:

(This is the g.Map() with the addition of a custom Shrinker. I might be able to tidy this up to build on the existing Map function without all the duplication if this approach is sound)

func (g Gen) MapWithShrinker(f interface{}) Gen {
    mapperVal := reflect.ValueOf(f)
    mapperType := mapperVal.Type()

    if mapperVal.Kind() != reflect.Func {
        panic(fmt.Sprintf("Param of MapWithShrinker has to be a func, but is %v", mapperType.Kind()))
    }
    if mapperType.NumIn() != 1 {
        panic(fmt.Sprintf("Param of MapWithShrinker has to be a func with one param, but is %v", mapperType.NumIn()))
    } else {
        genResultType := g(gopter.DefaultGenParameters()).ResultType
        if !genResultType.AssignableTo(mapperType.In(0)) {
            panic(fmt.Sprintf("Param of Map WithShrinkerhas to be a func with one param assignable to %v, but is %v", genResultType, mapperType.In(0)))
        }
    }
    if mapperType.NumOut() != 1 {
        panic(fmt.Sprintf("Param of MapWithShrinker has to be a func with one return value, but is %v", mapperType.NumOut()))
    }
    outType := mapperType.Out(0)

    return func(genParams *GenParameters) *GenResult {
        history := make(map[interface{}]interface{})
        result := g(genParams)
        shrinker := func(value interface{}) gopter.Shrink {
            inputGens := history[value]

            s := result.Shrinker(inputGens).Filter(result.Sieve)
            return func() (interface{}, bool) {
                retrived, valid := s()
                if !valid {
                    return nil, false
                }
                new := mapperVal.Call([]reflect.Value{reflect.ValueOf(retrived)})[0].Interface()
                history[new] = retrived
                return new, true
            }
        }
        value, ok := result.RetrieveAsValue()
        if ok {
            mapped := mapperVal.Call([]reflect.Value{value})[0]
            history[mapped.Interface()] = value.Interface()
            return &GenResult{
                Shrinker:   shrinker,
                result:     mapped.Interface(),
                Labels:     result.Labels,
                ResultType: outType,
            }
        }
        return &GenResult{
            Shrinker:   NoShrinker,
            result:     nil,
            Labels:     result.Labels,
            ResultType: outType,
        }
    }
}

ashb avatar Jun 14 '16 12:06 ashb

Hi,

I don't think that this will really work in all cases.

Actually the main problem for mapping is that I distinguish between "Shrinker" and "Shrink". A Shrink is no more than an iterator of values and a Shrinker is a factory for Shrink's for a specific start-value. Mapping a Shrink can be done with the same mapping function as the mapping the generator (i.e. (string, string) -> struct in your case). Mapping the Shrinker though requires the reverse function struct -> (string, string) which might not be well-defined in all cases.

The reason why gen.Result contains a Srhinker instead of just a Shrink for the generated value (which would make mapping easier) is that prop.ForAll has to be able to start over a Shrink from intermediate points.

Simple example: Let's say you have a function that takes an integer parameter and fails for some values,. Internally the following happens:

  • The function is tested with random integer values. By chance it fails for lets say 1000000 (or even much bigger)
  • Now prop.ForAll tries to find the minimal value for the function to fail (which is the whole point of shrinking). Obviously it will take far too long to test all values linearly (i.e. with 1000000 function calls). So instead:
  • prop.ForAll uses the provided Shrinker with starting point 1000000 that creates a Srhink using nested intervals, i.e. creates a sequence of values: 0, 500000, 750000, 875000 ....
  • Let's say the function under test already fails for 500000, then prop.ForAll uses the Shrinker again with starting point 500000, which will create a sequence: 0, 250000, 375000, 437500, .... ... and so on ... this way one could find the (or a) minimal value with just about 20 function calls.

I hope this helps clarifying the problem a bit.

For your specific example. The best shrinker I could come up with looked like this:

func GenStruct() gopter.Gen {
    return gopter.CombineGens(
        gen.Identifier(),
        gen.NumString().SuchThat(nonZeroLen),
    ).Map(func(values []interface{}) *Struct {
        name := values[0].(string)
        ver := values[1].(string)
        return NewStruct(name, ver)
    }).WithShrinker(StructShrinker)
}

func StructShrinker(v interface{}) gopter.Shrink {
    a := v.(*Struct)
    shrinkName := gen.StringShrinker(a.name).Map(func(name string) *Struct {
        return NewStruct(name, a.version)
    })
    shrinkVersion := gen.StringShrinker(a.version).Filter(nonZeroLen).Map(func(version string) *Struct {
        return NewStruct(a.name, version)
    })
    return shrinkName.Interleave(shrinkVersion)
}

Unluckily I could not find a suitable example to demonstrate its usefullness

untoldwind avatar Jun 14 '16 16:06 untoldwind

Thanks - that's a very useful write up. I'll come back and read it a few more times and see if I can't come up with an addition to the docs.

I might have some more follow up questions too.

ashb avatar Jun 14 '16 19:06 ashb

FWIW, I'm also having trouble understanding the Shrink / Shrinker interface while trying to write a customer shrinker. Some docs and examples in this area would be really appreciated.

The comments above help somewhat. It would be great to get that information incorporated into the documentation somehow. I don't think my problems are resolved by this information, though. I have a shrinker which isn't as sophisticated as the int shrinker described above (fwiw I've previously looked at (and modified!) the implementation of the int shrinker but I never understood why it worked the way it works until I read the comment above).

My shrinker is similar to a slice shrinker but elements in the slice need to be adjusted to account for whatever element is removed. The shrinker makes a new slice and then replaces the original value with the modified value before returning it. This makes it easier for it to produce the next smallest value (as compared to re-shrinking the original by a greater amount on the next iteration). This seems to be a different approach than the built-in shrinkers take (they seem to keep the original forever and use extra smarts to jump straight to the next most-shrunk value during iteration). I could implement something more like them ... but I don't see why the approach I've taken fails (and the failure mode is really bizarre, my slice keeps getting bigger and bigger... as if Gopter is inserting new elements in between iterations?).

exarkun avatar Jun 24 '16 13:06 exarkun

FWIW, I tweaked my shrinker so it returns a copy of the internal state and this solved the problem.

exarkun avatar Jun 24 '16 14:06 exarkun

Yes, good point. Mutable structs are kind of a weakness in this case. First thing I did is to add a special warning in the docs. Next thing, I get your point(s): The current tooling for Shrinker's is a bit weak. Part of this weakness (and source of confusion) might be that they are part of the GenResult, while they are in fact entirely standalone constructs. I.e. a Shrinker (and its Shrink's) can and is supposed to operate without its generator.

I thought this over a bit and might have come up with a solution (yet to be fully implemented): The general idea is, to derive a complex generator using a bijective mapper (i.e. conversion in both directions). That way it should be possible to map the generator as well as its shrinker.

Roughly the interface would look like this:

DeriveGen(
  func(a int, b string) *MyStruct {
    return &MyStruct{a:a, b:b}
  },
  func(m *MyStruct) (int, string) {
    return m.a, m.b
  },
  gen.Integer(),
  gen.AnyString(),
)

(see the new branch for a bit more detail). Would something like this be helpful in your case(s)?

untoldwind avatar Jun 24 '16 18:06 untoldwind

I think it would be an improvement, yes!

Is the reverse operation needed? It's not possible to remember the generated values and go from there?

ashb avatar Jul 08 '16 09:07 ashb

As the interface of Shrinker and Shrink is right now, I fear the reverse operation is needed. The might be a way, if we consider a Shrink to be some kind of indexed sequence (i.e. were one can go back and forth and skip indices if necessary) ... but that would be a really major refactoring.

Anyhow the version in the branch should now work, I "just" have to complete the tests, then I'll merge it back to the master branch

untoldwind avatar Jul 09 '16 20:07 untoldwind