factory_boy
Support for async?
The problem
Coming from Django, where we used Factory Boy heavily, we moved to a new async stack (uvicorn + Starlette + Ariadne) to fully support GraphQL with subscriptions, which are really cool. We also switched to GINO, an async ORM (not really an ORM) based on SQLAlchemy Core, which works quite robustly. However, I am struggling to adapt Factory Boy to use GINO models.
Proposed solution
At first glance I thought I needed to implement the _create() method in my factory, but the problem is that the create() method of a GINO model is a coroutine and can't be called from synchronous code. I tried to experiment with asyncio._get_running_loop(), but I am really new to async stuff and my attempt failed.
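To illustrate the obstacle, here is a minimal sketch using only the standard library. FakeModel is a hypothetical stand-in for a GINO model (it is not part of GINO): calling its create() from synchronous code only produces a coroutine object, and an instance appears only once an event loop drives that coroutine.

```python
import asyncio
import inspect


class FakeModel:
    """Hypothetical stand-in for a model whose create() is a coroutine."""

    def __init__(self, **fields):
        self.fields = fields

    @classmethod
    async def create(cls, **fields):
        await asyncio.sleep(0)  # pretend to talk to the database
        return cls(**fields)


def sync_caller():
    # Called from synchronous code: nothing has run yet, we only hold a coroutine.
    return FakeModel.create(nickname="Test User 0")


pending = sync_caller()
is_coroutine = inspect.iscoroutine(pending)

# Only an event loop can turn the coroutine into an actual instance.
user = asyncio.run(pending)
```

This is why a plain synchronous _create() cannot simply return the result of model_class.create().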
Extra notes
I am using pytest with the pytest-asyncio plugin to run tests with async code, which works pretty well, including working with the DB. For that I have this in my conftest.py:
import asyncio

import pytest


@pytest.fixture(scope="session")
def event_loop():
    """
    This is to make the asyncio event loop shared for the whole test session, otherwise
    it will be recreated for each test which will prevent using the test_db fixture.
    """
    loop = asyncio.get_event_loop()
    yield loop
    loop.close()


@pytest.fixture(autouse=True, scope="session")
async def test_db(request):
    """
    Here is some DB preparation code like (re)creating DB itself, making sure we have all
    necessary rights etc.
    """
    await db.gino.create_all()  # this is to bind the GINO engine to DB
    yield  # passing context back to the tests
    await db.pop_bind().close()  # unbinding engine and performing other teardown later
I really miss Factory Boy and hope there is an easy solution to get my factories working again. I also created an issue for GINO (https://github.com/fantix/gino/issues/608) but decided to open one here too, as I think Factory Boy has a much wider community and I have a better chance that someone has had the same problem. Thanks all!
Sometimes you just need to lay down your thoughts to get the proper idea. A fresh mind also helps (I was doing my experiments at 4am yesterday :)
I am not sure this is the correct way to proceed, or whether there are unforeseen consequences that will shoot me in the foot later, but here's what I did:
import uuid

import factory

from database import models


class UserFactory(factory.Factory):
    class Meta:
        model = models.User

    nickname = factory.Sequence(lambda n: f"Test User {n}")
    uuid = factory.LazyAttribute(lambda _: str(uuid.uuid4()))

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        # Return the coroutine itself; the test awaits it.
        async def create_coro(*args, **kwargs):
            return await model_class.create(*args, **kwargs)

        return create_coro(*args, **kwargs)
Then in my test I do
new_user = await UserFactory()
and get my shiny new user object created properly in the DB! So far I am very happy with the result.
If wiser and more experienced developers don't see any issues with this approach, I think it may be worth adding something like this to the recipes section, as async stacks are getting more and more popular. I am leaving this issue open for now, as I hope there will be some comments and/or advice. If not, it is absolutely fine to close it.
Hey, do you know how I could define an awaitable Mock in your code? I need to define an abstract factory without an ORM, like this:
class ObjectFactory(factory.Factory):
    class Meta:
        abstract = True
        model = Mock

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        async def create_coro(*args, **kwargs):
            return await model_class(*args, **kwargs)

        return create_coro(*args, **kwargs)
But Mock is not awaitable. I'm trying to figure that out.
https://mock.readthedocs.io/en/latest/changelog.html#b1 has AsyncMock
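A minimal sketch of that suggestion, assuming Python 3.8+ where AsyncMock is also available from the standard library's unittest.mock. Calling an AsyncMock instance returns an awaitable, so an instance (rather than the class) can play the role of the model that _create awaits. The create_coro and try_create helpers below only mirror the body of _create for illustration; they are not wired into a factory.

```python
import asyncio
from unittest.mock import AsyncMock, Mock


async def create_coro(model_class, *args, **kwargs):
    # Same shape as the coroutine inside _create above.
    return await model_class(*args, **kwargs)


def try_create(model_class, **kwargs):
    """Run create_coro and report whether awaiting the model worked."""
    try:
        return asyncio.run(create_coro(model_class, **kwargs)), True
    except TypeError:
        return None, False


# Mock(...) builds a plain Mock instance, which cannot be awaited.
_, plain_ok = try_create(Mock, title="x")

# Calling an AsyncMock instance returns a coroutine resolving to return_value.
fake_model = AsyncMock(return_value={"title": "x"})
created, async_ok = try_create(fake_model, title="x")
```

With this shape, fake_model.assert_awaited_once_with(title="x") can also be used to check what the factory passed in.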
Thanks, that really helped me. I extended your version to support more features.
import asyncio
import inspect

import factory


class AsyncFactory(factory.Factory):
    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        async def maker_coroutine():
            for key, value in kwargs.items():
                # when using SubFactory, you'll have a Task in the corresponding kwarg
                # await tasks to pass model instances instead
                if inspect.isawaitable(value):
                    kwargs[key] = await value
            # replace as needed by your way of creating model instances
            return await model_class.create_async(*args, **kwargs)

        # A Task can be awaited multiple times, unlike a coroutine.
        # useful when a factory and a subfactory must share a same object
        return asyncio.create_task(maker_coroutine())

    @classmethod
    async def create_batch(cls, size, **kwargs):
        return [await cls.create(**kwargs) for _ in range(size)]


class UserFactory(AsyncFactory):
    ...


class CategoryFactory(AsyncFactory):
    ...
    creator = factory.SubFactory(UserFactory)


class ArticleFactory(AsyncFactory):
    ...
    author = factory.SubFactory(UserFactory)
    category = factory.SubFactory(CategoryFactory, creator=factory.SelfAttribute("..author"))
In the following example:
article = await ArticleFactory.create()
assert article.author == article.category.creator
The _create function of UserFactory is called to create the Article author; this returns a Task.
Then the _create function of CategoryFactory is called, with the User creation Task in its kwargs, which is awaited. The category model creation can use the User instance.
Finally the _create function of ArticleFactory is called, also with the User creation Task. It is awaited again. The user instance is used in the article creation.
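The reason a Task is returned rather than a bare coroutine can be shown with the standard library alone. This is only an illustrative sketch: make_user is a hypothetical placeholder for the user creation, and object() stands in for a saved model instance.

```python
import asyncio


async def make_user():
    await asyncio.sleep(0)
    return object()  # stands in for a saved User instance


async def main():
    # A Task caches its result, so every awaiter sees the same object.
    task = asyncio.create_task(make_user())
    first = await task
    second = await task

    # A bare coroutine can be awaited only once.
    coro = make_user()
    await coro
    try:
        await coro
        reuse_error = None
    except RuntimeError as exc:
        reuse_error = str(exc)
    return first is second, reuse_error


same_object, reuse_error = asyncio.run(main())
```

Here same_object is True, which is what lets the article and its category share one author, while the second await on the bare coroutine raises RuntimeError.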
Here comes my solution to this.
It lets you provide a custom coroutine for saving the object in the database. It also lets you update the object before returning, so that you can still have an id generated on the server side.
Partially inspired by @nadege :smile:
import asyncio
import inspect

import factory


class AsyncFactoryOptions(factory.base.FactoryOptions):
    def _build_default_options(self):
        def is_coroutine(meta, value):
            if not inspect.iscoroutinefunction(value):
                raise TypeError(f"{repr(value)} is not a coroutine, but {type(value)}")

        return super()._build_default_options() + [
            factory.base.OptionDefault("save_coroutine", None, inherit=True, checker=is_coroutine)
        ]


class AsyncFactory(factory.Factory):
    _options_class = AsyncFactoryOptions

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        """
        This method saves the object using asynchronous save function.
        If the coroutine returns a value, it's expected to be the same type as its parameter.
        This logic is made so that eg. id can be generated on the database side.
        """
        return asyncio.get_event_loop().run_until_complete(cls._create_async(model_class, *args, **kwargs))

    @classmethod
    async def _create_async(cls, model_class, *args, **kwargs):
        for key, value in kwargs.items():
            # when using SubFactory, you'll have a Task in the corresponding kwarg
            # await tasks to pass model instances instead
            if inspect.isawaitable(value):
                kwargs[key] = await value
        obj = model_class(*args, **kwargs)
        updated_obj = await cls._meta.save_coroutine(obj)
        if updated_obj is None:
            return obj
        if not isinstance(updated_obj, model_class):
            raise TypeError(
                f"Object returned from the save_coroutine has different type than factory's model. "
                f"Expected: {model_class}, Got: {type(updated_obj)}"
            )
        return updated_obj
Usage:
class MountainFactory(AsyncFactory):
    class Meta:
        model = Mountain
        save_coroutine = save_mountain
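As a hedged sketch of what a save coroutine could look like, and of one caveat with blocking on run_until_complete: the Mountain dataclass, the id counter and this save_mountain body are all hypothetical, made up for illustration. The blocking call works from synchronous code, but the same call made while an event loop is already running (for example inside an async test) raises RuntimeError.

```python
import asyncio
import itertools
from dataclasses import dataclass
from typing import Optional

_ids = itertools.count(1)  # hypothetical stand-in for a database sequence


@dataclass
class Mountain:  # hypothetical model
    name: str
    id: Optional[int] = None


async def save_mountain(mountain: Mountain) -> Mountain:
    """Hypothetical save coroutine: pretend the database assigns the id."""
    await asyncio.sleep(0)
    mountain.id = next(_ids)
    return mountain


def create_blocking(name: str) -> Mountain:
    # Mirrors _create above, but with an explicit loop for the demo.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(save_mountain(Mountain(name)))
    finally:
        loop.close()


async def create_inside_running_loop(name: str) -> str:
    coro = save_mountain(Mountain(name))
    try:
        asyncio.get_running_loop().run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never started; close it to avoid a warning
        return str(exc)
    return "no error"


from_sync = create_blocking("Rysy")
error = asyncio.run(create_inside_running_loop("Giewont"))
```

So this variant seems best suited to synchronous tests; in async tests the factory would need to be awaited instead.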
I also tried to use Factory Boy with an async ORM and wanted to use RelatedFactory.
At that point just returning a task is not enough, so I tried replacing the StepBuilder in the _generate method.
I got this:
import inspect

import factory
from factory.builder import StepBuilder, BuildStep, parse_declarations


class AsyncFactory(factory.Factory):
    @classmethod
    async def _generate(cls, strategy, params):
        if cls._meta.abstract:
            raise factory.errors.FactoryError(
                "Cannot generate instances of abstract factory %(f)s; "
                "Ensure %(f)s.Meta.model is set and %(f)s.Meta.abstract "
                "is either not set or False." % dict(f=cls.__name__))
        step = AsyncStepBuilder(cls._meta, params, strategy)
        return await step.build()

    @classmethod
    async def _create(cls, model_class, *args, **kwargs):
        for key, value in kwargs.items():
            if inspect.isawaitable(value):
                kwargs[key] = await value
        return await model_class.create(*args, **kwargs)

    @classmethod
    async def create_batch(cls, size, **kwargs):
        return [await cls.create(**kwargs) for _ in range(size)]


class AsyncStepBuilder(StepBuilder):
    # Redefine build function that await for instance creation and awaitable postgenerations
    async def build(self, parent_step=None, force_sequence=None):
        """Build a factory instance."""
        # TODO: Handle "batch build" natively
        pre, post = parse_declarations(
            self.extras,
            base_pre=self.factory_meta.pre_declarations,
            base_post=self.factory_meta.post_declarations,
        )
        if force_sequence is not None:
            sequence = force_sequence
        elif self.force_init_sequence is not None:
            sequence = self.force_init_sequence
        else:
            sequence = self.factory_meta.next_sequence()
        step = BuildStep(
            builder=self,
            sequence=sequence,
            parent_step=parent_step,
        )
        step.resolve(pre)
        args, kwargs = self.factory_meta.prepare_arguments(step.attributes)
        instance = await self.factory_meta.instantiate(
            step=step,
            args=args,
            kwargs=kwargs,
        )
        postgen_results = {}
        for declaration_name in post.sorted():
            declaration = post[declaration_name]
            declaration_result = declaration.declaration.evaluate_post(
                instance=instance,
                step=step,
                overrides=declaration.context,
            )
            if inspect.isawaitable(declaration_result):
                declaration_result = await declaration_result
            postgen_results[declaration_name] = declaration_result
        self.factory_meta.use_postgeneration_results(
            instance=instance,
            step=step,
            results=postgen_results,
        )
        return instance
I'm trying to avoid asyncio.create_task because I want to control the order in which models are instantiated.
So I directly await the factory_meta.instantiate method and then await every awaitable post-generation result.
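The ordering concern can be sketched with plain asyncio. The create coroutine and the delays below are hypothetical, chosen only to make the difference visible: awaiting each step directly keeps the declared order, while tasks scheduled up front finish in whatever order their I/O completes.

```python
import asyncio


async def create(name, log, delay):
    await asyncio.sleep(delay)  # pretend the insert takes this long
    log.append(name)
    return name


async def sequential():
    # Each creation finishes before the next one starts.
    log = []
    await create("user", log, 0.02)
    await create("article", log, 0.0)
    return log


async def with_tasks():
    # Both creations are scheduled at once; the faster one finishes first.
    log = []
    user = asyncio.create_task(create("user", log, 0.02))
    article = asyncio.create_task(create("article", log, 0.0))
    await asyncio.gather(user, article)
    return log


sequential_order = asyncio.run(sequential())
task_order = asyncio.run(with_tasks())
```

In this sketch the sequential version records the user before the article, while the task version records the article first.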
Another version that works with async SQLAlchemy:
import inspect

import factory
from factory.alchemy import SESSION_PERSISTENCE_COMMIT, SESSION_PERSISTENCE_FLUSH
from factory.builder import StepBuilder, BuildStep, parse_declarations


class AsyncFactory(factory.alchemy.SQLAlchemyModelFactory):
    @classmethod
    async def _generate(cls, strategy, params):
        if cls._meta.abstract:
            raise factory.errors.FactoryError(
                "Cannot generate instances of abstract factory %(f)s; "
                "Ensure %(f)s.Meta.model is set and %(f)s.Meta.abstract "
                "is either not set or False." % dict(f=cls.__name__))
        step = AsyncStepBuilder(cls._meta, params, strategy)
        return await step.build()

    @classmethod
    async def _create(cls, model_class, *args, **kwargs):
        for key, value in kwargs.items():
            if inspect.isawaitable(value):
                kwargs[key] = await value
        # The parent _create is synchronous, but it returns the coroutine
        # produced by the async _save below, so it can be awaited here.
        return await super()._create(model_class, *args, **kwargs)

    @classmethod
    async def create_batch(cls, size, **kwargs):
        return [await cls.create(**kwargs) for _ in range(size)]

    @classmethod
    async def _save(cls, model_class, session, args, kwargs):
        session_persistence = cls._meta.sqlalchemy_session_persistence
        obj = model_class(*args, **kwargs)
        session.add(obj)
        if session_persistence == SESSION_PERSISTENCE_FLUSH:
            await session.flush()
        elif session_persistence == SESSION_PERSISTENCE_COMMIT:
            await session.commit()
        return obj


class AsyncStepBuilder(StepBuilder):
    # Redefine build function that await for instance creation and awaitable postgenerations
    async def build(self, parent_step=None, force_sequence=None):
        """Build a factory instance."""
        # TODO: Handle "batch build" natively
        pre, post = parse_declarations(
            self.extras,
            base_pre=self.factory_meta.pre_declarations,
            base_post=self.factory_meta.post_declarations,
        )
        if force_sequence is not None:
            sequence = force_sequence
        elif self.force_init_sequence is not None:
            sequence = self.force_init_sequence
        else:
            sequence = self.factory_meta.next_sequence()
        step = BuildStep(
            builder=self,
            sequence=sequence,
            parent_step=parent_step,
        )
        step.resolve(pre)
        args, kwargs = self.factory_meta.prepare_arguments(step.attributes)
        instance = await self.factory_meta.instantiate(
            step=step,
            args=args,
            kwargs=kwargs,
        )
        postgen_results = {}
        for declaration_name in post.sorted():
            declaration = post[declaration_name]
            declaration_result = declaration.declaration.evaluate_post(
                instance=instance,
                step=step,
                overrides=declaration.context,
            )
            if inspect.isawaitable(declaration_result):
                declaration_result = await declaration_result
            postgen_results[declaration_name] = declaration_result
        self.factory_meta.use_postgeneration_results(
            instance=instance,
            step=step,
            results=postgen_results,
        )
        return instance
Coming from a Django background, and with the async Django ORM now added, I'm definitely willing to open a PR for async capability plus Django async capability. The examples above create a new class, which is preferable in most libraries, but in this case I think adding "a"-prefixed methods would work best, in case someone wants to use both sync and async methods while reusing the same declarations.
If anyone needs a Django version, here it is. Note that it uses Django 4.2's new "asave" when available and falls back to a wrapped synchronous save() on lower versions.
https://gist.github.com/Andrew-Chen-Wang/59d784496c63ee65714b926d6945b4c6
Factory implementation:
import inspect

import factory
from asgiref.sync import sync_to_async
from django.db import IntegrityError
from factory import errors
from factory.builder import BuildStep, StepBuilder, parse_declarations


def use_postgeneration_results(self, step, instance, results):
    return self.factory._after_postgeneration(
        instance,
        create=step.builder.strategy == factory.enums.CREATE_STRATEGY,
        results=results,
    )


factory.base.FactoryOptions.use_postgeneration_results = use_postgeneration_results


class AsyncFactory(factory.django.DjangoModelFactory):
    @classmethod
    async def _generate(cls, strategy, params):
        if cls._meta.abstract:
            raise factory.errors.FactoryError(
                "Cannot generate instances of abstract factory %(f)s; "
                "Ensure %(f)s.Meta.model is set and %(f)s.Meta.abstract "
                "is either not set or False." % dict(f=cls.__name__)
            )
        step = AsyncStepBuilder(cls._meta, params, strategy)
        return await step.build()

    class Meta:
        abstract = True  # Optional, but explicit.

    @classmethod
    async def _get_or_create(cls, model_class, *args, **kwargs):
        """Create an instance of the model through objects.get_or_create."""
        manager = cls._get_manager(model_class)
        assert "defaults" not in cls._meta.django_get_or_create, (
            "'defaults' is a reserved keyword for get_or_create "
            "(in %s._meta.django_get_or_create=%r)"
            % (cls, cls._meta.django_get_or_create)
        )
        key_fields = {}
        for field in cls._meta.django_get_or_create:
            if field not in kwargs:
                raise errors.FactoryError(
                    "django_get_or_create - "
                    "Unable to find initialization value for '%s' in factory %s"
                    % (field, cls.__name__)
                )
            key_fields[field] = kwargs.pop(field)
        key_fields["defaults"] = kwargs
        try:
            instance, _created = await manager.aget_or_create(*args, **key_fields)
        except IntegrityError as e:
            get_or_create_params = {
                lookup: value
                for lookup, value in cls._original_params.items()
                if lookup in cls._meta.django_get_or_create
            }
            if get_or_create_params:
                try:
                    instance = await manager.aget(**get_or_create_params)
                except manager.model.DoesNotExist:
                    # Original params are not a valid lookup and triggered a create(),
                    # that resulted in an IntegrityError. Follow Django’s behavior.
                    raise e
            else:
                raise e
        return instance

    @classmethod
    async def _create(cls, model_class, *args, **kwargs):
        """Create an instance of the model, and save it to the database."""
        if cls._meta.django_get_or_create:
            return await cls._get_or_create(model_class, *args, **kwargs)
        manager = cls._get_manager(model_class)
        return await manager.acreate(*args, **kwargs)

    @classmethod
    async def create_batch(cls, size, **kwargs):
        """Create a batch of instances of the model, and save them to the database."""
        return [await cls.create(**kwargs) for _ in range(size)]

    @classmethod
    async def _after_postgeneration(cls, instance, create, results=None):
        """Save again the instance if creating and at least one hook ran."""
        if create and results:
            # Some post-generation hooks ran, and may have modified us.
            if hasattr(instance, "asave"):
                await instance.asave()
            else:
                await sync_to_async(instance.save)()


class AsyncBuildStep(BuildStep):
    async def resolve(self, declarations):
        self.stub = factory.builder.Resolver(
            declarations=declarations,
            step=self,
            sequence=self.sequence,
        )
        for field_name in declarations:
            attr = getattr(self.stub, field_name)
            if inspect.isawaitable(attr):
                attr = await attr
            self.attributes[field_name] = attr


class AsyncStepBuilder(StepBuilder):
    # Redefine build function that await for instance creation and awaitable postgenerations
    async def build(self, parent_step=None, force_sequence=None):
        """Build a factory instance."""
        # TODO: Handle "batch build" natively
        pre, post = parse_declarations(
            self.extras,
            base_pre=self.factory_meta.pre_declarations,
            base_post=self.factory_meta.post_declarations,
        )
        if force_sequence is not None:
            sequence = force_sequence
        elif self.force_init_sequence is not None:
            sequence = self.force_init_sequence
        else:
            sequence = self.factory_meta.next_sequence()
        step = AsyncBuildStep(
            builder=self,
            sequence=sequence,
            parent_step=parent_step,
        )
        await step.resolve(pre)
        args, kwargs = self.factory_meta.prepare_arguments(step.attributes)
        instance = self.factory_meta.instantiate(
            step=step,
            args=args,
            kwargs=kwargs,
        )
        if inspect.isawaitable(instance):
            instance = await instance
        postgen_results = {}
        for declaration_name in post.sorted():
            declaration = post[declaration_name]
            declaration_result = declaration.declaration.evaluate_post(
                instance=instance,
                step=step,
                overrides=declaration.context,
            )
            if inspect.isawaitable(declaration_result):
                declaration_result = await declaration_result
            postgen_results[declaration_name] = declaration_result
        postgen = self.factory_meta.use_postgeneration_results(
            instance=instance,
            step=step,
            results=postgen_results,
        )
        if inspect.isawaitable(postgen):
            await postgen
        return instance
@B3QL How do you recommend using your AsyncFactory implementation? I feel like I'm doing something wrong here.
I've defined a PersonFactory as such:
class PersonFactory(AsyncFactory):
    class Meta:
        model = Person

    id = factory.Faker("uuid4")
    first_name = factory.Faker("first_name")
    last_name = factory.Faker("last_name")
    dob = factory.Faker("date_of_birth", minimum_age=18, maximum_age=90)
    gender = factory.Faker("random_element", elements=("Male", "Female"))
I'm using pytest, my models are defined with SQLAlchemy 2, and my DB connections are async. Here's the fixture I'm using to get my DB sessions during tests:
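from typing import AsyncGenerator

import pytest
from sqlalchemy.ext.asyncio import AsyncEngine, AsyncSession, async_sessionmaker


@pytest.fixture
async def dbsession(
    _engine: AsyncEngine,
) -> AsyncGenerator[AsyncSession, None]:
    connection = await _engine.connect()
    trans = await connection.begin()
    session_maker = async_sessionmaker(
        connection,
        expire_on_commit=False,
    )
    session = session_maker()
    try:
        yield session
    finally:
        # Roll everything back so each test starts from a clean state.
        await session.close()
        await trans.rollback()
        await connection.close()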
Now, if I try to use PersonFactory like the typical Faker use case:
@pytest.mark.anyio
async def test_videomeeting_creation(
    fastapi_app: FastAPI,
    client: AsyncClient,
    dbsession: AsyncSession,
) -> None:
    person_factory = PersonFactory()
    person = await person_factory.create()
    ...
I hit this error, since sqlalchemy_session isn't defined under Meta:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../async_factories.py:20: in _generate
    return await step.build()
../async_factories.py:72: in build
    instance = await self.factory_meta.instantiate(
../async_factories.py:27: in _create
    return await super()._create(model_class, *args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cls = <class 'vienna.tests.factories.PatientFactory'>
model_class = <class 'vienna.db.models.patient_model.Person'>, args = ()
kwargs = {'dob': datetime.date(1994, 4, 26), 'first_name': 'Christine', 'gender': 'Female', 'id': '7ad191ee-9e3e-484b-af25-d267b5ab2870', ...}
session_factory = None, session = None
    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        """Create an instance of the model, and save it to the database."""
        session_factory = cls._meta.sqlalchemy_session_factory
        if session_factory:
            cls._meta.sqlalchemy_session = session_factory()
        session = cls._meta.sqlalchemy_session
        if session is None:
>           raise RuntimeError("No session provided.")
E           RuntimeError: No session provided.
My temporary solution is a class-construction function that lets me pass my session in from inside my tests:
def get_person_factory(dbsession: AsyncSession):
    class PersonFactory(AsyncFactory):
        class Meta:
            model = Person
            sqlalchemy_session = dbsession

        id = factory.Faker("uuid4")
        first_name = factory.Faker("first_name")
        last_name = factory.Faker("last_name")
        dob = factory.Faker("date_of_birth", minimum_age=18, maximum_age=90)
        gender = factory.Faker("random_element", elements=("Male", "Female"))

    return PersonFactory
But I can't help but feel this is not how you intended it to be used...
I found this python package that implements some of the suggestions in this thread: https://github.com/kuzxnia/async_factory_boy/
Is official async support planned for this project?
Async + Django testing (pytest) is a mess. We moved away from it altogether.