
🚀 Feature: Bulk Document Creation

Open Shadowfita opened this issue 4 years ago • 28 comments

🔖 Feature description

Create a "createDocuments" post endpoint that takes an array of documents.

🎤 Pitch

In my project, I am trying to insert 12,000 documents in one go. It is inefficient running 12,000 external createDocument API calls to achieve this.

To remedy this, there should be a "createDocuments" endpoint that accepts an array of documents. That way you could easily reduce the number of external calls required and let the stack process the creation of a large number of documents internally, quickly and efficiently.

A potential workaround would be creating a function that does it locally, but I don't believe this is a proper solution.
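For illustration, a bulk endpoint like this would let the client split a large dataset into a handful of calls instead of thousands of single ones. This is a hypothetical sketch of the client-side batching only; `createDocuments` does not exist yet, and the `{ documents: [...] }` payload shape and the batch size of 500 are assumptions:

```javascript
// Hypothetical: split a large dataset into payloads for a proposed
// "createDocuments" bulk endpoint. Batch size of 500 is an arbitrary choice.
function toBatches(documents, batchSize = 500) {
  const payloads = [];
  for (let i = 0; i < documents.length; i += batchSize) {
    // Each payload would be the body of one POST to createDocuments.
    payloads.push({ documents: documents.slice(i, i + batchSize) });
  }
  return payloads;
}

// 12,000 documents become 24 bulk calls instead of 12,000 single ones.
const payloads = toBatches(new Array(12000).fill({}));
console.log(payloads.length); // 24
```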

👀 Have you spent some time to check if this issue has been raised before?

  • [X] I checked and didn't find a similar issue

🏢 Have you read the Code of Conduct?

Shadowfita avatar Apr 02 '22 01:04 Shadowfita

This is not achievable with functions due to the 8192 character limit.

Shadowfita avatar Apr 02 '22 01:04 Shadowfita

I have achieved an okay workaround by creating a ".json" file inside a bucket that contains an array of documents, which is then read by a function that inserts the required documents.

Shadowfita avatar Apr 02 '22 08:04 Shadowfita

I have achieved an okay workaround by creating a ".json" file inside a bucket that contains an array of documents, which is then read by a function that inserts the required documents.

It's currently taking about 46 seconds to process 15,863 simple JSON objects.

Shadowfita avatar Apr 04 '22 09:04 Shadowfita

Need this feature too

zcoderr avatar Apr 16 '22 06:04 zcoderr

I have achieved an okay workaround by creating a ".json" file inside a bucket that contains an array of documents, which is then read by a function that inserts the required documents.

Hi, can you point me to a demo file, please? Node.js if possible.

elunatix avatar May 21 '22 21:05 elunatix

I have achieved an okay workaround by creating a ".json" file inside a bucket that contains an array of documents, which is then read by a function that inserts the required documents.

Hi, can you point me to a demo file, please? Node.js if possible.

No worries.

You can find it at the link below. It was thrown together pretty quickly.

It expects a JSON file to be created in a bucket with the following structure:

{ "collection": "collection_id", "data": [ "object-array" ] }

https://gist.github.com/Shadowfita/b5ccd20f65566cb9f2b40d416c5201a2

Shadowfita avatar May 22 '22 08:05 Shadowfita
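As a rough illustration of that approach (this is not the gist itself): a function can parse the bucket file's JSON payload and insert each entry. The SDK wiring is an assumption here; `databases` is expected to be a node-appwrite `Databases` instance, `'DATABASE_ID'` is a placeholder, and `newId` stands in for an ID generator such as `sdk.ID.unique()` in recent node-appwrite versions.

```javascript
// Pure helper: validate and parse the { collection, data } payload shape
// described above.
function parsePayload(raw) {
  const { collection, data } = JSON.parse(raw);
  if (!collection || !Array.isArray(data)) throw new Error('bad payload');
  return { collection, data };
}

// Sketch only: insert each entry one by one. `databases` and `newId`
// are injected so the logic can be read (and tested) without a live server.
async function importFromJson(raw, databases, newId) {
  const { collection, data } = parsePayload(raw);
  for (const doc of data) {
    await databases.createDocument('DATABASE_ID', collection, newId(), doc);
  }
}
```

This still issues one createDocument call per entry, just server-side, which is why it took ~46 seconds for ~16K documents above.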

Would like a batch delete too.

pilcrowonpaper avatar Jul 31 '22 03:07 pilcrowonpaper

Using functions and a bucket is a brilliant solution!

Another potential workaround is to use a multithreaded client and upload documents in parallel; for instance, a client application developed in Kotlin. Perhaps the Server SDK could allow creating a server-side extension; or, more simply, a server-side application (such as a function, or a Kotlin-based standalone) could read a batch JSON from the local filesystem (from a bucket, FTP, S3, etc.) and create the records, so we avoid excessive HTTP traffic.

Or, another solution:

  • use a "Staging" instance on the local network, load the data, back up MariaDB, and restore MariaDB on the production system
  • back up the production MariaDB, analyze the SQL dump file, programmatically generate the additional 12,000 SQL statements, and restore it

The best approach would be if the Appwrite API had functions such as export (permissions/objects/buckets/collections/functions) as JSON, import, etc.; in that case we "abstract" the underlying implementation details and can do more granular export/import. For example, right now we don't have an implementation-independent backup/restore (other than executing mysqldump locally; and what about a cluster then?)

FuadEfendi avatar Oct 09 '22 01:10 FuadEfendi

It would be great if bulk update is also considered, as it will greatly reduce the number of requests I make in my application.

balachandarlinks avatar Nov 02 '22 10:11 balachandarlinks

Hey @stnguyen90, can I work on this? I have gone through the requirements and tried to implement a basic endpoint to achieve this, and was able to get bulk create working on my local setup. I would be happy to contribute to this.

singhbhaskar avatar Mar 12 '23 14:03 singhbhaskar

@singhbhaskar, thanks for your interest! 🙏 However, it would be best for the core team to figure out how it should work.

stnguyen90 avatar Mar 14 '23 03:03 stnguyen90

Need this feature too

rafagazani avatar Mar 21 '23 20:03 rafagazani

I need bulk operations too, please.

danilo73r avatar Apr 16 '23 03:04 danilo73r

I need this also; I need to create 18K documents, which are pincodes I should say...

Vedsaga avatar May 04 '23 17:05 Vedsaga

I want to delete multiple docs by their IDs too

ashuvssut avatar Jun 08 '23 17:06 ashuvssut

So I am working on a scraping project. There are almost 2K records, and it takes almost 20 minutes to write them!

{"ColumnRef":"Name","index":"0","value":"orange house"}

This is the size of each record.

Is there any way to speed up the writes?

Shiba-Kar avatar Aug 07 '23 04:08 Shiba-Kar

So I am working on a scraping project. There are almost 2K records, and it takes almost 20 minutes to write them!

{"ColumnRef":"Name","index":"0","value":"orange house"}

This is the size of each record.

Is there any way to speed up the writes?

You should run all write requests asynchronously and wrap them in Promise.all. Appwrite is built to scale and will handle that many concurrent requests with no issue, and it will reduce your wait time dramatically.

Shadowfita avatar Aug 07 '23 06:08 Shadowfita
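That advice, sketched generically — here `createFn` stands in for a call like `databases.createDocument`, so the pattern can be shown without a live server:

```javascript
// Fire all writes concurrently instead of awaiting them one by one.
// `createFn` is a stand-in for the actual SDK call (an assumption here).
// Note: Promise.all rejects as soon as any single request fails.
async function createAll(records, createFn) {
  return Promise.all(records.map((record) => createFn(record)));
}
```

With ~2K small records, this turns 20 minutes of serial round-trips into roughly the time of the slowest concurrent request, subject to server and connection limits.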

@Shadowfita Is there a guarantee that all promises will resolve if we do it with Promise.all now?

Last time I checked, about 2 months ago, I tried to do bulk document deletion with Promise.all, and it seems some of the promises failed.

ashuvssut avatar Aug 07 '23 13:08 ashuvssut

Is there a guarantee that all promises will resolve if we do it with Promise.all now?

It depends, actually.

Instead, you should delete in batches, meaning 10-50 per Promise.all instead of all at once. If you do them all at once, it will likely fail because of server limitations.

Vedsaga avatar Aug 07 '23 13:08 Vedsaga
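The batching advice above can be sketched as code; the chunk size of 25 and the `deleteFn` stand-in (for a call like `databases.deleteDocument`) are assumptions:

```javascript
// Split an array into fixed-size chunks.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Process IDs in chunks, awaiting each chunk's Promise.all before starting
// the next, to stay under server and connection limits.
async function deleteInBatches(ids, deleteFn, batchSize = 25) {
  for (const batch of chunk(ids, batchSize)) {
    await Promise.all(batch.map((id) => deleteFn(id)));
  }
}
```

This trades some of the raw concurrency of a single Promise.all for predictable load on the server.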

Right now I am using that library for bulk document-from-JSON creation.

tripolskypetr avatar Dec 06 '23 17:12 tripolskypetr