
Add CreateMany

Open steebchen opened this issue 2 years ago • 13 comments

steebchen avatar Aug 23 '21 17:08 steebchen

Hi, is this a good first issue? I would love to give it a try. Do you have some directions on where to get started?

MaximilianGaedig avatar Sep 21 '23 13:09 MaximilianGaedig

Kind of, yeah. The generator-related code is not the easiest to grasp, though, and there are no docs for the Prisma AST. There are some tricks to see the relevant info, and I could write up some docs, but it definitely takes some time to understand the concepts and how the Prisma internals work (they are optimized for JS rather than Go). Also, before starting, it would be important to align on what the syntax would look like.

I'm also not sure about the priority of this, as you can create many docs using transactions right now:

createUserA := client.User.CreateOne(
  User.Email.Set("a"),
  User.ID.Set("a"),
).Tx()

createUserB := client.User.CreateOne(
  User.Email.Set("b"),
  User.ID.Set("b"),
).Tx()

if err := client.Prisma.Transaction(createUserA, createUserB).Exec(ctx); err != nil {
  t.Fatal(err)
}

This would also work with a loop over any number of docs:

var txs []transaction.Param

for _, item := range someItems {
  // item.Email and item.ID are assumed fields on the elements of someItems
  tx := client.User.CreateOne(
    User.Email.Set(item.Email),
    User.ID.Set(item.ID),
  ).Tx()
  txs = append(txs, tx)
}

if err := client.Prisma.Transaction(txs...).Exec(ctx); err != nil {
  t.Fatal(err)
}

steebchen avatar Sep 21 '23 15:09 steebchen

Hmm, yeah, that's what I'm doing right now. The problem is memory usage: a transaction is probably quite big, and when it's passed to the Prisma intermediate service, memory usage is horrendous: [image]. I'm hoping CreateMany could reduce that. Or is it just an issue with the Prisma service handling the use case of repeatedly inserting ~70k records without memory usage going crazy?

Even in my own process, batching the transactions in batches of 20k takes about 200MB of memory on that alone. That would be okay, but it's far from what CreateMany could achieve if, for example, we added models to the list instead of transactions.
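
The batching described above can be sketched as a small generic helper. This is only an illustration: `chunk` is a hypothetical name, not part of prisma-client-go, and the example uses plain string IDs in place of real records.

```go
package main

import "fmt"

// chunk splits items into consecutive batches of at most size elements.
// The batches are sub-slices of items, so no elements are copied.
// chunk is a hypothetical helper, not a prisma-client-go function.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	batches := make([][]T, 0, (len(items)+size-1)/size)
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

func main() {
	ids := []string{"a", "b", "c", "d", "e"}
	for i, batch := range chunk(ids, 2) {
		// In a real program, each batch would be turned into
		// CreateOne(...).Tx() values and executed as one transaction.
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

Note that each batch becomes its own transaction, so a failure in a later batch does not roll back batches that have already been committed.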

MaximilianGaedig avatar Sep 21 '23 22:09 MaximilianGaedig

Syntax wise I imagined it to be something like this:

userA := db.User{ID: "a", Email: "a"}
userB := db.User{ID: "b", Email: "b"}

if err := client.User.CreateMany(userA, userB).Exec(ctx); err != nil {
  log.Fatal(err)
}

MaximilianGaedig avatar Sep 21 '23 22:09 MaximilianGaedig

Well, internally a CreateMany would also run in a transaction, but potentially it's more optimized, but I'm not sure. Since you are saying "what CreateMany could achieve" – did you actually test this? You could do so with the JS client and see if CreateMany is faster than CreateOne with transactions.

Regarding the syntax, this would not really work, as the syntax would be different from CreateOne and it would not benefit from the extra type-safety (think about also linking records).

steebchen avatar Sep 22 '23 13:09 steebchen

Hmm okay, will try to check it out in the JS client, this was just an assumption from me for now

MaximilianGaedig avatar Sep 26 '23 10:09 MaximilianGaedig

@MaximilianGaedig Did you get a chance to see if it was faster for transactions?

sx328 avatar Oct 12 '23 08:10 sx328

Not yet, but it's on my TODO list. I needed more control over the database for the project I used this in anyway, so I went with pgx for that.

MaximilianGaedig avatar Oct 16 '23 14:10 MaximilianGaedig

I see, no worries

sx328 avatar Oct 19 '23 17:10 sx328

I was also looking for this. Would it make sense to just spin up a bunch of goroutines, each doing a CreateOne, in the meantime? Something like:

	g, ctx := errgroup.WithContext(ctx)
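	// The limit must also cover the producer and reducer goroutines below;
	// otherwise g.Go can block before the reducer starts and deadlock the pipeline.
	g.SetLimit(maxWorkerGoroutines + 2)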

	// Producer
	nodeIds := make(chan int)
	g.Go(func() error {
		defer close(nodeIds)
		for i := 0; i < v.Len(); i += 1 {
			nodeIds <- i
		}
		return nil
	})

	type ProcessedNode struct {
		idx int
		res *<result>
	}

	// Mapper
	queue := make(chan ProcessedNode)
	workers := int32(maxWorkerGoroutines)
	for i := 0; i < maxWorkerGoroutines; i++ {
		g.Go(func() error {
			defer func() {
				// decrement worker count
				if atomic.AddInt32(&workers, -1) == 0 {
					close(queue)
				}
			}()

			for idx := range nodeIds {
				result, err := <createOne>(<params>)
				if err != nil {
					return err
				}
				queue <- ProcessedNode{idx: idx, res: result}
			}
			return nil
		})
	}

	// Reducer
	results := make([]*<result>, v.Len())
	g.Go(func() error {
		for nodeRes := range queue {
			results[nodeRes.idx] = nodeRes.res
		}
		return nil
	})

	return results, g.Wait()

nettrino avatar Nov 28 '23 08:11 nettrino

@nettrino Yes this would work, with the disadvantage that it doesn't run in a transaction. So it might work depending on the use-case.
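
For reference, here is a runnable, stdlib-only sketch of the same idea. It swaps errgroup for `sync.WaitGroup` and writes results directly by index instead of using a reducer goroutine. `createFunc`, `createAll`, and `runDemo` are hypothetical names; `createFunc` stands in for a single `CreateOne(...).Exec(ctx)` call and is faked here so the example runs without a database.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// createFunc stands in for one CreateOne(...).Exec(ctx) call.
// It is a hypothetical placeholder, not part of prisma-client-go.
type createFunc func(ctx context.Context, idx int) (string, error)

// createAll runs create for indexes 0..n-1 on at most workers goroutines.
// The first error cancels remaining work; records created before the
// error are NOT rolled back, since nothing here runs in a transaction.
func createAll(ctx context.Context, n, workers int, create createFunc) ([]string, error) {
	if workers < 1 {
		workers = 1
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	results := make([]string, n)
	indexes := make(chan int)
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for idx := range indexes {
				res, err := create(ctx, idx)
				if err != nil {
					once.Do(func() { firstErr = err; cancel() })
					return
				}
				results[idx] = res // each index is written by exactly one goroutine
			}
		}()
	}

	// Producer: stop handing out work once the context is cancelled.
producer:
	for i := 0; i < n; i++ {
		select {
		case indexes <- i:
		case <-ctx.Done():
			break producer
		}
	}
	close(indexes)
	wg.Wait()

	if firstErr != nil {
		return nil, firstErr
	}
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return results, nil
}

// runDemo exercises createAll with a fake create function that fails
// at index failAt (pass -1 for no failure).
func runDemo(n, workers, failAt int) ([]string, error) {
	fake := func(ctx context.Context, idx int) (string, error) {
		if idx == failAt {
			return "", fmt.Errorf("create %d failed", idx)
		}
		return fmt.Sprintf("user-%d", idx), nil
	}
	return createAll(context.Background(), n, workers, fake)
}

func main() {
	results, err := runDemo(5, 3, -1)
	fmt.Println(results, err)
}
```

Because each worker writes to its own slot in `results`, no extra channel or reducer is needed to keep the output in input order.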

steebchen avatar Nov 28 '23 08:11 steebchen

@steebchen my understanding is that a transaction is an all-or-nothing operation, where statements are executed in order as passed, whereas my above suggestion is just for records that could be created asynchronously as needed without depending on one another - is that assumption correct?

nettrino avatar Nov 28 '23 11:11 nettrino

Yes, correct. Feel free to ask on Discord as well.

steebchen avatar Nov 28 '23 14:11 steebchen