factory_boy
factory_boy copied to clipboard
The unreleased `Transformer` has a conflicting API with the overall design
Description
For most declarations, I can override the default mechanism by passing an explicit value:
>>> UserFactory().email
"[email protected]"
>>> UserFactory(email="[email protected]").email
"[email protected]"
If the factory uses a factory.Transformer
field, I can't force a value:
>>> UserFactory().password
"pbkdf2_sha256$216000$Gj2b9f0Ln1ms$1dBlLEYrtGiwEA219JU831VAdscD2f9nFY37PrNfCDU="
>>> UserFactory(
... password="pbkdf2_sha256$216000$Gj2b9f0Ln1ms$1dBlLEYrtGiwEA219JU831VAdscD2f9nFY37PrNfCDU="
... ).password
"pbkdf2_sha256$216000$EUKlFdNViyB5$vcfPaK6H7pDiQAa9ZIbFoj5oj55tUyHKHDUfxI4GIeY="
As a user, it is quite confusing to have very different behaviours for fields; if I have a specific password hash (in this example), I need to have a way to set it.
Next steps
I'm not sure we can officially release the factory.Transformer
(and related code) without addressing this topic first; how can a user bypass the transform?
@francoisfreitag As the author of the factory.Transformer
feature, I'm quite interested in your opinion here ;)
The use case for Transformer
is to apply the transform
function to the value. I don’t expect Transformer
declarations to be used in cases where the user sometimes want the value to be transformed, sometimes not. A separate helper function would be much clearer for that use case.
I agree an escape hatch would be useful when the vast majority of inputs need transformation, but some special cases are to be tested with a raw value. That’s what RawValue
from #888 provides.
Side note: it felt a bit weird to find it mentioned in the documentation, but then getting a missing import when trying to use it.
Any progress? If this blocking next release, shouldn't it have the highest priority?
I'd say more: I would expect being able to override the declaration with other arbitrary declarations (fakers, self-attributes, lazy-functions, subfactories... anything!), which now will most probably cause a crash because the function is applied to the declaration before it is evaled?
class F(factory.DictFactory):
name = factory.Transform(str.upper, "jane")
F.build(name=factory.Faker("first_name_female", locale="hi")) # crash
Would this be covered by the proposed #888 , @francoisfreitag ?
Side note: it felt a bit weird to find it mentioned in the documentation, but then getting a missing import when trying to use it.
@Kjir If you are reading the "latest" documentation page instead of the "stable" you should be expecting this all the time!
Thanks for the input. I did not have that use in mind when developing the transformers. A Transformer takes a value, applies a transform
function, then the factory uses the transformed value. The issue here is a value generator is being passed to the factory, not an actual value.
I see the following traceback with your example:
Traceback (most recent call last): File "", line 1, in File "/home/freitafr/dev/factory_boy/factory/base.py", line 511, in build return cls._generate(enums.BUILD_STRATEGY, kwargs) File "/home/freitafr/dev/factory_boy/factory/base.py", line 465, in _generate return step.build() File "/home/freitafr/dev/factory_boy/factory/builder.py", line 242, in build pre, post = parse_declarations( File "/home/freitafr/dev/factory_boy/factory/builder.py", line 162, in parse_declarations extra_maybenonpost[k] = pre_declarations[k].declaration.function(v) TypeError: descriptor 'upper' for 'str' objects doesn't apply to a 'Faker' object
Can you explain why the call is passing a Faker object rather that the value generated by Faker?
F.build(name=factory.Faker("first_name_female", locale="hi")) # crash
I would write:
fake = Faker(["hi"])
F.build(name=fake.first_name_female())
The original use case is for Django (or more generally ORMs) to deal with password hashing, see https://github.com/FactoryBoy/factory_boy/issues/366. That allows writing the cleartext password in the factory call: UserFactory.create(password="hunter2")
, yet store the hashed version in the database. See https://factoryboy.readthedocs.io/en/latest/orms.html#factory.django.Password for more.
I'd like to move this forward; here are a couple of additional considerations we should take into account ;)
Requirements
With the following code (mixing examples from @francoisfreitag and @n1ngu):
class UserFactory(factory.Factory):
fullname = factory.Transformer(str.upper, "Joan Deer")
password = factory.Transformer(make_password, "password123")
I believe we want the following options to be possible:
- Providing a value generator (i.e another declaration) and having the transformation apply to it;
- Forcing a specific value;
- Reduce the risk of users "losing" track of the raw value that was provided to the
transform
(in @n1ngu's example, can we keep track of the pre-upper()
name?).
Solving these would also address the topic discussed in #963.
Current behaviour
With the 3.2.1
code, the user can rely on assert MyFactory(field=value).field == value
, except whe the factory performs some mangling through Factory._meta.rename
.
In that situation, achieving the goals of Transformer
would require:
class UserFactory(factory.Factory):
class Meta:
name_raw = factory.Faker("fullname")
name = factory.LazyAttribute(lambda o: o.name_raw.upper())
The user would then use:
-
UserFactory(name_raw="value")
to force a specific pre-transformation value; -
UserFactory(name="value")
to force a specific final value
Alternative implementations
Other options, for instance suggested in #963, would be to declare the transformation as part of the LazyAttribute
or SelfAttribute
:
class UserFactory(factory.Factory):
class Meta:
name_raw = factory.Faker("fullname")
name = factory.SelfAttribute("name_raw", transform=str.upper)
However, that pattern would require the addition of a transform=
keyword argument to all declarations, which would be cumbersome and might conflict with named arguments expected by other declarations (e.g. SubFactory
).
Declarations with similar issues
The following declarations exhibit a similar issue to the proposed factory.Transformer
:
factory.Trait
Once the class declaration of a factory using a Trait
has been parsed by Python,
the Trait
is converted into a set of Maybe
;
callers cannot provide a new value for the trait afterwards, or "delete" the Maybe
portion.
factory.PostGeneration
Those declarations have specific handling: if the caller provides a non-declaration value, it is provided to the function. If, instead, a factory post-generation declaration is provided, it overrides the declaration.
factory.RelatedFactory
Similar to factory.PostGeneration
, a user can bypass a call to a related factory
by providing a nake value to the factory: UserFactory(main_group=SomeGroup)
.
If, instead, they provide another RelatedFactory
declaration, that value would override the existing
RelatedFactory
generation: UserFactory(main_group=factory.RelatedFactory(LimitedGroupFactory))
Possible options
Special declaration
As proposed in #888, and similarly to the implementation for factory.PostGeneration
, we could special-case a declaration:
>>> UserFactory(name="Joan Doer").name
"JOAN DOER"
>>> UserFactory(name=factory.Force("Joan Doer")).name
"Joan Doer"
Specific keyword argument
We could support an explicit keyword argument to force the raw value:
>>> UserFactory(name="Joan Doer").name
"JOAN DOER"
>>> UserFactory(name__="Joan Doer").name
"Joan Doer"
Discussion
The "special declaration" case requires adding a new kwarg, which is only useful for the factory.Transformer
case.
- It wouldn't make sense for
factory.Trait
— as theTrait
declaration has already been parsed at class declaration time. - It wouldn't make sense for
factory.PostGeneration
, as those declarations are supposed to alter the instance, not provide a field. - It would provide a duplicated API for bypassing
factory.RelatedFactory
, where one could call eitherUserFactory(main_group=SomeGroup)
orUserFactory(main_group=factory.Force(SomeGroup))
to bypass the related factory — although the latter might be clearer.
The "specific keyword" pattern is similar to the behaviour used by @post_generation
and PostGenerationMethodCall
:
- The user can call
UserFactory(main_group__=SomeGroup)
to bypass amain_group = factory.RelatedFactory(...)
declaration,; - They can call
UserFactory(make_password__="hunter2")
.
Additional considerations
What happens with the following factories?
class UserFactory(factory.Factory):
credentials = factory.Dict(
password=factory.Transformer(make_password, "hunter2),
)
class userFactory(factory.Factory):
class Params:
with_strong_password = factory.Trait(password="A Very Strong Password !!")
password = factory.Transformer(make_password, "hunter2")
class TenantFactory(factory.Factory):
name = factory.Faker("company_name")
register_main_dns = factory.PostGenerationMethodCall("add_dns_alias")
register_main_dns__ = factory.Transformer(slugify_dns, factory.SelfAttribute("name"))
Conclusion
I believe the best way forward is:
- Fix the
factory.Transformer
declaration to be able to accept either a pre- or post- declaration, asMaybe
andTrait
do; - Enrich it, where a user can provide a forced value to
name = factory.Transformer(...)
withname__ = x
Thanks for sharing the detailed analysis!
I agree that changing factory.Transformer
to accept a pre- declaration would be an improvement, happy to give that a shot.
However, I am against the post- declaration part: the Transformer
is meant to transform the value before to instantiate the object. This declaration was specifically introduced to deal with #366, which was a blocker for #316. My concern with a post-generation would be to keep encouraging the (bad) practice of saving the generated object twice with the CREATE
strategy. If a post- declaration is allowed, an additional save is needed for the object in database to match the Python instance, going against #316. Unless a clear use case is presented, where Transformer
would be a real gain, I’m convinced it’s better not to implement it.
Enrich it, where a user can provide a forced value to name = factory.Transformer(...) with name__ = x
This syntax is very unclear to me. It’s also not documented, not tested and does not appear a single time in the library repo. I would rather consider it an implementation detail of the library, and not start advertising it or even commit to supporting it. A GitHub search (although difficult because the =
sign is ignored) does not yield usage either.
Raw
(or Force
) could be declared inner to Transformer
, so that users need to specify Transformer.Raw
to wrap the value, clarifying its scope and preventing confusion for Trait
or Maybe
:
class Transformer:
class Raw:
# Raw implementation
# Transformer implementation
What do you think?
#1006 offers an update to handle pre-declarations with Transformer
. Please feel free to review it and share your feedback!
@francoisfreitag I like the approach in #1006, let's merge that!
Sorry for being away for so long 😕
That’s great to hear, thanks! No worries, it’s all volunteer work, it always depends on what’s going on IRL :smile:
It appears that the release is being blocked by this #886 however it's suggested that #1006 should be merged, https://github.com/FactoryBoy/factory_boy/pull/1006#issuecomment-1499322525
I would be happy to help fix it, however, I am currently confused as to which is the Solution to be merged?
The solution is #1006, augmented by #1013. I just updated #1006 to pull the improvements from #1013. If @rbarrois agrees, we should be integrating that PR soon.