Bogus icon indicating copy to clipboard operation
Bogus copied to clipboard

How to register custom `DataSet`s for scoped/local seed propagation given by `UseSeed`

Open ChristoWolf opened this issue 2 years ago • 12 comments

Version Information

Software Version(s)
Bogus NuGet Package 33.1.1
.NET SDK 5.0.403
Windows OS Version 20H2
Visual Studio? Microsoft Visual Studio Enterprise 2022 (64-bit) Version 17.0.1

What locale are you using with Bogus?

en

What's the problem?

Hi! I am not really sure how to make a custom DataSet use a Faker<T> instance's configured scoped/local seed (configured via its UseSeed(int) option, added during this PR due to this issue) for generation. If I understand correctly, this is because its FakerHub (of course; or only if I am missing something) cannot (seed-) notify the custom DataSet. E.g. if we have a CustomDataSet (similarly to how it is suggested in this example), then using the this.Random for generating does not give seeded values, because of the randomizer never being set.

What possible solutions have you considered?

As I understand it, it should be possible for Faker derivates, because this.Notifier would be accessible, thus one could add

this.CustomDataSet = this.Notifier.Flow(new CustomDataSet());

in there. But this does not work for Faker<T>, because the Notifier of its FakerHub is not accessible.

I have also considered explicitly passing the Randomizer instance to the custom dataset, but that would disable the notifier mechanism there, if I understand correctly.

Do you have sample code to show what you're trying to do?

Not sure how to do this in a short amount of time in LinqPad. But simply using the Food dataset from a Faker<T> instance with configured constant local seed yields different results for each test/app execution because the randomizer is never explicitly set, thus a new Randomizer instance without the seed is created and attached to it.

ChristoWolf avatar Jan 11 '22 15:01 ChristoWolf

Hi @ChristoWolf, thank you for creating an issue.

  • Do you have a code sample that demonstrates what you are trying to ultimately achieve? A full code example would help.
  • Or code example that demonstrates the problem.

I have a hunch about what you are trying to do, but I can't be totally sure from the description above ... could you please post some sample code? I want to make sure I'm helping with the right answer to your question.

Also, sorry for the late reply, my workload is high right now and I don't get out of work until late :(

Thank you, Brian

bchavez avatar Jan 12 '22 04:01 bchavez

@ChristoWolf, I think this is what you're looking for; a way to reset the seed deterministically at the Faker<T> level with a custom data set. The basic idea is to execute ContextHelper.GetOrSet(faker, () => new Food()) as early as possible instead of at .RuleFor(__, f => f.Food()... ) rule invocation time.

The Faker<User>.AttachFood() method encapsulates this earlier invocation before .RuleFor(__, f => f.Food()... ) and the subsequent .UseSeed() flows the local seed randomizer to new Food() before .RuleFor invocations begin.

void Main()
{
   var f = new Faker<User>()
            .AttachFood()
            .UseSeed(1111)
            .RuleFor(x => x.Name, f => f.Name.FirstName())
            .RuleFor(x => x.Candy, f => f.Food().Candy());
            
            
   f.Generate().Dump();
   
   f.UseSeed(1111);
   
   f.Generate().Dump();
}

public class User
{
   public string Name;
   public string Candy;
}

public static class ExtensionsForFood
{
   public static Faker<T> AttachFood<T>(this Faker<T> fakerT) where T : class
   {
      var fakerInternal = fakerT as IFakerTInternal;
      var faker = fakerInternal.FakerHub;
      
      ContextHelper.GetOrSet(faker, () => new Food());
      
      return fakerT;
   }
   
   public static Food Food(this Faker faker)
   {
      return ContextHelper.GetOrSet(faker, () => new Food());
   }
}

public class Food : Bogus.DataSet
{
   private static readonly string[] Candies =
      {
            "Hard candy", "Taffy", "Chocolate bar", "Stick candy",
            "Jelly bean", "Mint", "Cotton candy", "Lollipop"
         };

   public string Candy()
   {
      return this.Random.ArrayElement(Candies);
   }

   private static readonly string[] Drinks = { "Soda", "Water", "Beer", "Wine", "Coffee", "Lemonade", "Milk" };
   public string Drink()
   {
      return this.Random.ArrayElement(Drinks);
   }
}

image

Let me know if that helps.

bchavez avatar Jan 12 '22 05:01 bchavez

Hi @bchavez!

First of all, thanks a ton for the incredibly fast feedback!

The solution that you provided is exactly what I was looking for, thanks! Now in hindsight it seems so obvious, this is great.

Would it make sense for you if this example was added to the Food dataset example and/or the README?

Thanks again!

ChristoWolf avatar Jan 12 '22 07:01 ChristoWolf

Yeah, I think that makes sense. We should probably add the extra code to the Food dataset example. I'll leave this ticket open until we get the work completed. Thanks again @ChristoWolf for writing up the issue.

bchavez avatar Jan 13 '22 03:01 bchavez

If you need help with that, I can also add this to the example, just let me know.

Thanks again for the great help and of course all the stuff that you are providing with Bogus, @bchavez!

ChristoWolf avatar Jan 13 '22 06:01 ChristoWolf

@bchavez: BTW, what I ended up doing on my side was to provide a generic extension method (generic w.r.t. T and the custom DataSet, based on your suggestion), so that I can simply call it always for any custom DataSet if needed.

Would it make sense to add such a (non-extension) generic method

public AttachCustomDataSet<TDataSet>() where TDataSet : DataSet, new()
   {
      // Generic implementation
   }

to Faker<T> as a new feature?

ChristoWolf avatar Jan 14 '22 10:01 ChristoWolf

Hi @ChristoWolf, that's a great way to generalize the solution.

Maybe, we could add it, but I would need to really think about it. Generally, I don't change the public APIs of Faker and Faker<T> very much but when a proposal like this is made, it does take some time for me to think about the situation.

I like to make sure our public APIs that 21M downloads see are generally consistent and well-thought-out before making any changes.

Usually, my thought process is:

  • Does it make sense for what we are trying to do? Or is it a hack for a deeper problem?
  • How many people are affected by an issue?
  • Is there any way it could be done better to make it easier on the user?
  • Is it easy for someone to use the API wrong? - And if they do, what would they expect to happen that isn't happening?

I just have to really think about it before moving on to a public API change. But I do like your generic solution for resolving the original issue you had :) :sunglasses:

bchavez avatar Jan 16 '22 02:01 bchavez

Great feature proposal thought process, makes a lot of sense of course!

Thanks again, and feel free to let me know if I can help, @bchavez!

ChristoWolf avatar Jan 16 '22 06:01 ChristoWolf

I just created my own dataset and discovered too that the seed is not respected in the dataset, even with that Attachfood approach provided.

public class ColorDataSet : DataSet
{
    public string Hex(bool includeAlpha = false)
    {
        var r = Random.Int(0, 255);
        var g = Random.Int(0, 255);
        var b = Random.Int(0, 255);

        var color = $"#{r:X2}{g:X2}{b:X2}";

        if (includeAlpha)
        {
            var a = Random.Int(0, 255);
            return color + r.ToString("X2");
        }

        return color;
    }
}

public static ColorDataSet Color(this Faker faker)
{
   return ContextHelper.GetOrSet(faker, () => new ColorDataSet());
}

public static Faker<T> AttachColor<T>(this Faker<T> faker) where T : class
{
   var fakerInternal = faker as IFakerTInternal;
   var hub = fakerInternal.FakerHub;

   ContextHelper.GetOrSet(hub, () => new ColorDataSet());

   return faker;
}

This results in teh hex color being generated with different results even when the Faker<T> is created with UseSeed. Did I miss something?

garcipat avatar Sep 19 '22 12:09 garcipat

@garcipat: Can you please show your code where you actually attach the data set?

ChristoWolf avatar Sep 19 '22 15:09 ChristoWolf

Oh my, I added the attach again and now it seams to work propertly...

I got it like that:

var faker = new Faker<T>()
                         .AttackColor()
                         .UseSeed(123456)
                         .RulesFor(x => x.Color, f => f.Color().Hex());

I'm not sure now if I maybe added the Attach after the UseSeed and becasue of that it didn't apply. I removed it in between since I had to present Bogus in my team and couldnt wait. but I'm happy I can use datasets now with that tequnique. It would be nice tho if that attach wouldnt be required in the first place.

Thank you for pointing to the probably error.

garcipat avatar Sep 22 '22 09:09 garcipat

Yep, the order is important, happy that it works for you now!

ChristoWolf avatar Sep 22 '22 15:09 ChristoWolf