
Insert a lot of data - batch processing

Open seb-jean opened this issue 7 months ago • 5 comments

Hi,

I'd like to insert a lot of data into my User table, which contains many columns, but it's taking a long time. So I tried another entity called Message, which has only a few columns. Unfortunately, importing 1 million rows also takes a long time: it has been running for over an hour. I'm wondering if it's possible to insert a lot of data with Foundry. Would a batch processing solution like the one offered by Doctrine be worth implementing in Foundry? Thanks :).

<?php

namespace App\DataFixtures;

use App\Entity\User;
use App\Factory\MessageFactory;
use Doctrine\Bundle\FixturesBundle\Fixture;
use Doctrine\Persistence\ObjectManager;
use function Zenstruck\Foundry\Persistence\flush_after;
use function Zenstruck\Foundry\Persistence\repository;

final class AppFixtures extends Fixture
{
    public function load(ObjectManager $manager): void
    {
        flush_after(function () {
            $user1 = repository(User::class)->find(805);
            $user2 = repository(User::class)->find(804);

            MessageFactory::createMany(1000000, attributes: fn ($i) => [
                'sender' => $user1,
                'receiver' => $user2,
            ]);
        });
    }
}

seb-jean avatar Apr 20 '25 21:04 seb-jean

Hi!

Yes, I think this could actually be possible.

Maybe we should introduce a new method that doesn't return anything, so that the objects aren't all loaded in memory.

Then we should change FactoryCollection::all() so that it returns a generator. But that would be a BC break, so we have to find a solution for this, maybe by introducing a new private method.

Finally, the batch handling should take place in the PersistenceManager.
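
To make the idea concrete, here is a rough plain-PHP sketch of the two pieces: a generator that produces objects lazily, and a void consumer that hands them over in fixed-size batches. The names `lazyItems()` and `processInBatches()` are hypothetical and are not part of Foundry. The `$flush` callback only marks the place where a persist + `flush()` + `clear()` cycle could run.

```php
<?php

/**
 * Lazily yields $count items built by $make, one at a time.
 * Hypothetical stand-in for a generator-returning collection method.
 */
function lazyItems(int $count, callable $make): Generator
{
    for ($i = 1; $i <= $count; ++$i) {
        yield $make($i);
    }
}

/**
 * Consumes $items and calls $flush once per full batch, then once more
 * for any remainder. Returns nothing, so no reference to already
 * flushed items is kept by this function.
 */
function processInBatches(iterable $items, int $batchSize, callable $flush): void
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batch = [];
    foreach ($items as $item) {
        $batch[] = $item;
        if (count($batch) === $batchSize) {
            $flush($batch);
            $batch = []; // drop references so the batch can be garbage collected
        }
    }

    if ($batch !== []) {
        $flush($batch); // last, partially filled batch
    }
}

// Usage: 2500 items in batches of 1000.
$flushedSizes = [];
processInBatches(
    lazyItems(2500, fn (int $i): array => ['id' => $i]),
    1000,
    function (array $batch) use (&$flushedSizes): void {
        // Assumption: with an ORM, persisting, flushing and clearing would happen here.
        $flushedSizes[] = count($batch);
    }
);
// $flushedSizes is now [1000, 1000, 500]
```

Because the consumer never collects what it has processed, at most one batch is held in memory at a time, which is the point of a method that returns nothing.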

@kbond any thoughts?

nikophil avatar Apr 21 '25 16:04 nikophil

Yeah, not a lot of effort has gone into making Foundry performant. That being said, we have been improving it lately. Because of the LazyObjectManager, clearing the EM shouldn't be an issue. I don't know that run time would improve much, but memory usage certainly would.

kbond avatar Apr 22 '25 13:04 kbond

I think that currently there is no way to do this, because memory usage would explode, but yeah, Foundry is not the best tool to create tons of data.

On the other hand, I don't think this would be too hard to do, so I'm not opposed to it.

nikophil avatar Apr 22 '25 14:04 nikophil

What would be the best library for creating tons of data?

seb-jean avatar May 18 '25 15:05 seb-jean

Don't know...

Maybe custom stuff? Depends on your needs.

Basically, I'd say if you want to create tons of data and you care about performance, maybe you should not even use doctrine/orm at all.
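
As one illustration of "custom stuff" without the ORM, here is a minimal sketch using plain PDO with multi-row `INSERT` statements inside a single transaction. The `message` table, its columns, and the `insertMessages()` helper are assumptions made up for this example. The in-memory SQLite database is only there to keep it self-contained. The chunk size of 200 is arbitrary and should stay below the database's bound-parameter limit.

```php
<?php

// In-memory SQLite keeps the sketch self-contained; use your own DSN in practice.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, loosely modelled on the Message entity from the issue.
$pdo->exec(
    'CREATE TABLE message (
        sender_id INTEGER NOT NULL,
        receiver_id INTEGER NOT NULL,
        body TEXT NOT NULL
    )'
);

/**
 * Inserts $rows using one multi-row INSERT per chunk.
 * Each row is a list of [sender_id, receiver_id, body].
 */
function insertMessages(PDO $pdo, array $rows, int $chunkSize = 200): void
{
    foreach (array_chunk($rows, $chunkSize) as $chunk) {
        // Builds "(?, ?, ?), (?, ?, ?), ..." with one group per row in the chunk.
        $placeholders = implode(', ', array_fill(0, count($chunk), '(?, ?, ?)'));
        $stmt = $pdo->prepare(
            "INSERT INTO message (sender_id, receiver_id, body) VALUES $placeholders"
        );
        // Flattens the chunk into a single positional parameter list.
        $stmt->execute(array_merge(...$chunk));
    }
}

// Usage: 1000 rows between the two user ids used in the fixture above.
$rows = [];
for ($i = 1; $i <= 1000; ++$i) {
    $rows[] = [805, 804, "message $i"];
}

$pdo->beginTransaction(); // one transaction avoids a commit per statement
insertMessages($pdo, $rows);
$pdo->commit();

$count = (int) $pdo->query('SELECT COUNT(*) FROM message')->fetchColumn();
```

This skips entity hydration, the unit of work, and lifecycle events entirely, so it only fits cases where the fixtures don't rely on any of those.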

And I think we're open to a contribution that unlocks batch processing in Foundry. Just be aware that it might not be the right tool to create huge amounts of data.

nikophil avatar May 18 '25 20:05 nikophil