GenFu
Distinct
It would be nice to be able to tell GenFu to give me a list in which a certain property is distinct across the whole collection. Something along the following lines:
class Foo
{
    public int ID { get; set; }
    public string Name { get; set; } // Is unique in the datastore
}
A.Configure<Foo>()
    .Fill(f => f.Name)
    .Distinct();
Agreed, this would be an extremely useful feature.
I have tried working with MoreLinq in the meantime, but I am still getting exceptions in my integration tests. Do you have a Gitter/Slack channel somewhere where I can chat for help?
Nothing yet, but we are working on getting something set up.
I kind of wonder if maybe the default behaviour should be to generate without duplicates and have an override to force a duplicate to appear. In most cases I would think not having duplicates would be desirable, or at the very least not harmful.
A few things to consider if we do it by default:
- Performance: Could be slow for large collections
- What do we do if someone asks for 10,000 items and we only have 1,000 unique names in our database?
- What should be unique? Consider the usual Person example. FirstName and LastName individually do not need to be unique, but maybe should be unique when combined. Even FirstName + LastName should not be unique in any large data set.
Huh, those are good things to consider.
- We might be able to figure out some sort of solution with a hashtable for constant-time lookups, but it would be difficult for entities which don't implement IComparable.
- Throwing an exception would be the most sensible thing: NotEnoughJunkToFillTheRequestedTrunkException
- I was thinking just for individual fields but multiple fields does make more sense.
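The hash-based lookup and the fail-fast behaviour discussed in the points above could look roughly like the sketch below. It is only an illustration outside GenFu: `DistinctSampler` and `TakeDistinct` are hypothetical names, and `InvalidOperationException` stands in for the jokingly named exception. One relevant detail is that `HashSet<T>` relies on `Equals`/`GetHashCode` (or a supplied `IEqualityComparer<T>`) rather than on `IComparable`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of GenFu.
public static class DistinctSampler
{
    // Returns `count` distinct values drawn in random order from `pool`.
    // Throws when the pool holds fewer distinct values than requested.
    public static List<T> TakeDistinct<T>(
        IEnumerable<T> pool, int count, Random random,
        IEqualityComparer<T> comparer = null)
    {
        // The HashSet drops duplicates from the pool using hash lookups.
        var unique = new HashSet<T>(pool, comparer).ToList();
        if (count > unique.Count)
        {
            throw new InvalidOperationException(
                $"Requested {count} distinct values but only {unique.Count} are available.");
        }

        // Partial Fisher-Yates shuffle: only the first `count` slots are needed.
        for (int i = 0; i < count; i++)
        {
            int j = random.Next(i, unique.Count);
            (unique[i], unique[j]) = (unique[j], unique[i]);
        }
        return unique.GetRange(0, count);
    }
}
```

Because the distinct pool is built up front, the "10,000 items from 1,000 names" case fails immediately instead of looping, and the cost stays linear in the pool size.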
If we are to throw exceptions I think that turning on distinct should be an option (not a default value/setting). This can be done from defaults or fill. Otherwise people's existing code base will start throwing exceptions.
Regards,
Garry Taylor
(Sent by email in reply to Simon Timms's comment of 26 Dec 2015, quoted above.)
@M-Zuber and @gpltaylor: What about something like the following?
GenFu.Configure<Person>()
.UniqueBy(p => p.Firstname)
.UniqueBy(p => p.Lastname);
var people = A.UniqueList<Person>(25);
This would let you build up the list of properties that you want to, in unison, represent a unique entity. The above, for example, would keep a list of hashes on Firstname and Lastname and throw out duplicates during generation.
If you hadn't set up configuration for the Person object, we'd have to resort to using the entire property set to generate a hash...so, we'd be adding a perf hit here.
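To make the "throw out duplicates during generation" idea concrete, here is a minimal sketch that does not depend on GenFu internals. `UniqueGenerator`, `Generate`, and the `maxAttempts` cap are hypothetical and are not the proposed `UniqueBy`/`UniqueList` API; the sketch only shows one way a composite key could be tracked in a hash set.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of GenFu.
public static class UniqueGenerator
{
    // Calls `factory` until `count` items with distinct keys are collected.
    // Gives up after `maxAttempts` calls so an exhausted value pool cannot loop forever.
    public static List<T> Generate<T, TKey>(
        Func<T> factory, Func<T, TKey> keySelector, int count, int maxAttempts = 100000)
    {
        var seen = new HashSet<TKey>();
        var result = new List<T>(count);
        for (int attempt = 0; attempt < maxAttempts && result.Count < count; attempt++)
        {
            T candidate = factory();
            // HashSet.Add returns false when the key was already present.
            if (seen.Add(keySelector(candidate)))
            {
                result.Add(candidate);
            }
        }
        if (result.Count < count)
        {
            throw new InvalidOperationException(
                $"Only {result.Count} of {count} unique items were generated in {maxAttempts} attempts.");
        }
        return result;
    }
}
```

With GenFu, the factory could plausibly be `() => A.New<Person>()` and the key `p => (p.Firstname, p.Lastname)`; a value tuple compares its members together, which matches the "in unison" semantics described above.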
cc/ @dpaquette @stimms
That looks perfect for me. Ensures that it is very clear in the setup what needs to be unique, and allows for a simple flow of different uniqueness for the same model in different scenarios.
This looks good @MisterJames. If we configure the UniqueBy properties, do we need a separate UniqueList<> method? Could we just do A.ListOf<Person>(25);
@MisterJames Looks good!
How is this feature going? Is it already available? What about the following:
A.Configure<Message>()
.Fill(x => x.Text)
.WithRandom(new String[] { "Hello", "How are you?", "Bye" })
.Distinct();
var messages = A.ListOf<Message>(4000);
If using a cross product between the String array and an Int array the result would be:
"Hello 1"
"How are you? 1"
"Bye 1"
"Hello 2"
"How are you? 2"
"Bye 2"
...
BTW, I am doing this manually as follows:
new String[] {
"Hello", "How are you?", "Bye"
}.SelectMany(x => Enumerable.Range(1, 200).Select(y => $"{x} {y}"));
This provides 600 unique messages. Of course the ideal would be to have 600 genuinely different messages, but for testing that is not feasible, and GenFu can ship the most common lists but not all of them.
If adding only the internal index the result would be:
"Hello 1"
"How are you? 2"
"Bye 3"
"Hello 4"
"How are you? 5"
"Bye 6"
...
It probably works in all situations where uniqueness is required and there is no limit for size ...
It could also be used like this:
.Distinct(x => $"{x.Text} - {index}");
Where index would be 1, 2, 3, 4, ...
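As a rough sketch of the running-index variant, written with plain LINQ rather than any GenFu API (`IndexedValues` and `WithIndex` are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of GenFu.
public static class IndexedValues
{
    // Cycles through `values` and appends a 1-based running index,
    // so every result is unique however many items are requested.
    public static List<string> WithIndex(IReadOnlyList<string> values, int count)
    {
        if (values.Count == 0)
        {
            throw new ArgumentException("At least one base value is required.", nameof(values));
        }
        return Enumerable.Range(1, count)
            .Select(i => $"{values[(i - 1) % values.Count]} {i}")
            .ToList();
    }
}
```

Calling `IndexedValues.WithIndex(new[] { "Hello", "How are you?", "Bye" }, 6)` yields the sequence listed above, from "Hello 1" through "Bye 6".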
This is something really useful, as the datastore often requires uniqueness.
What do you think?
We're in 2019 and still missing this feature. Is this package dead? Come on, it can't be that nobody thinks this is important.
Looks good!