synth
Probabilistically Distribute data in arrays
Required Functionality
I need data that "looks" a certain way. For my specific use case, that means a list of 100ish company names whose frequencies follow a normal or Poisson distribution.
Proposed Solution
Apologies if there's already a way to do this. I couldn't find one in the docs.
Add a "distribution" block to arrays with the parameters? It'd also be nice if there was some way to calculate an appropriate distribution from production data.
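One way to picture "calculating a distribution from production data" is to count how often each real value occurs and carry those counts over to replacement names. The following is only a rough sketch in plain Python, not anything synth provides: `anonymized_weights` and the sample names are hypothetical and exist purely for illustration.

```python
from collections import Counter


def anonymized_weights(real_names, fake_names):
    """Map each real name's frequency onto a replacement name.

    Hypothetical helper: the most frequent real name gets the first
    fake name, the second most frequent gets the second, and so on.
    """
    counts = Counter(real_names).most_common()  # sorted by count, descending
    if len(fake_names) < len(counts):
        raise ValueError("need at least one fake name per distinct real name")
    return {fake: n for fake, (_, n) in zip(fake_names, counts)}


# Made-up transaction data, for illustration only.
weights = anonymized_weights(
    ["Cafe A", "Cafe A", "Grocer B", "Cafe A", "Grocer B", "Gym C"],
    ["Acme Corp", "Globex", "Initech"],
)
```

The resulting name-to-count mapping keeps the shape of the original data without exposing the real business names.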
Use case
I'm writing a set of scripts to chart my bank transactions. I'd like to blog about it, but I don't feel comfortable posting the businesses I frequently visit. So I'd like to generate replacements with synth.
Hey @bbkane! Thanks for the request!
You're quite right, there's no way to do this at the moment. Best you can do is use categorical with preset values.
I think we should definitely support this. I think your suggestion is the right way to go: we could have a "distribution" parameter that takes in the distribution you want (Gaussian, Poisson), and the corresponding parameters. You'll also probably want to specify the number of unique values that are sampled from (i.e. number of unique companies).
We'll probably want to implement that as a parameter to add on number and string generators. I am not quite sure we can tweak the distribution of values picked by fake-rs, so we may need to refactor the string generators a bit.
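Until such a parameter exists, the categorical workaround could approximate a Poisson shape by precomputing the weights outside synth. The sketch below is an assumption-heavy illustration: `poisson_weights`, the `scale` factor, and the placeholder company names are all hypothetical, and the emitted JSON assumes a string generator of the form `{"type": "string", "categorical": {value: weight}}` with integer weights, which should be checked against the synth docs.

```python
import json
import math


def poisson_weights(names, lam, scale=1000):
    """Integer weights proportional to the Poisson pmf P(k; lam), k = 0..len(names)-1.

    Hypothetical helper. Each weight is floored at 1 so that every name
    can still be drawn, which slightly fattens the tail.
    """
    weights = {}
    for k, name in enumerate(names):
        pmf = math.exp(-lam) * lam**k / math.factorial(k)
        weights[name] = max(1, round(pmf * scale))
    return weights


# Placeholder names; a real list would come from elsewhere.
names = [f"Company {i}" for i in range(10)]

# Assumed schema shape for a weighted string generator.
schema = {"type": "string", "categorical": poisson_weights(names, lam=3.0)}
print(json.dumps(schema, indent=2))
```

The number of names plays the role of the "number of unique values" mentioned above, and `lam` controls where the most frequent names sit in the list.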
Ooh, categorical might be enough for me to hack it till it's implemented :)
Thanks! Ben
I still haven't done this, and I can't see myself doing it soon, so I'm closing this issue.
I'm just keeping this open as a future feature request