firestore-backfire

Import stops at a file

Open nonoumasy opened this issue 1 year ago • 9 comments

I tried importing data back into a test Firestore project to try out this library's import. The import ran for more than an hour, so I stopped it. I tried again several times with the same result. How long should an 88 MB file take to import? I used this:

backfire import hm.ndjson --paths stories -K testCredentials.json -P test-hm-bq0gl2 --verbose true

When I look at the logs, it stops at a file. I stopped the import and restarted it. It didn't overwrite the documents I had already imported and it continued to import new documents, but then it stopped again:

Screenshot 2023-05-22 at 10 32 40 AM

I deleted the collection from the project and tried again from scratch with the same results.

When I deleted the collection from Firestore, it deleted a little more than 9000 documents. I think the data I was trying to import has more than 10k documents.
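One way to check how many documents the export actually holds is to count the records in the NDJSON file. The following is a minimal sketch that assumes one JSON record per line (the NDJSON convention); `count_ndjson_records` is a hypothetical helper written for illustration, not part of firestore-backfire.

```python
import json
from pathlib import Path


def count_ndjson_records(path):
    """Count non-blank lines in an NDJSON file.

    Returns a (valid, invalid) pair: lines that parse as JSON and
    lines that do not. Assumes one record per line.
    """
    valid = invalid = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # blank lines are not records
            try:
                json.loads(line)
                valid += 1
            except json.JSONDecodeError:
                invalid += 1
    return valid, invalid
```

Calling `count_ndjson_records("hm.ndjson")` on the file from the command above would give the number of parseable records, which can then be compared with the roughly 9000 documents that reached Firestore.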

Screenshot 2023-05-22 at 10 43 48 AM

Here are the Firestore rules (just in case):

rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /{document=**} {
      allow read, write;
    }
  }
}

nonoumasy avatar May 22 '23 07:05 nonoumasy

Hi @nonoumasy,

88 MB of data definitely shouldn't take over an hour to import 😅 I can see in your screenshot that quite a few documents failed to import. Were there any log messages that described why those imports failed?

I suspect those errors may be blocking the thread.

Could you also please provide the following:

  • The entire output log of your import (please attach as a txt file)
  • A sample of the different types of data structures you're trying to import, e.g. provide me a document from the stories collection, and a document at the stores/xxx/events collection etc. (please remove any sensitive data)

benyap avatar May 23 '23 01:05 benyap

Hi @benyap,

I didn't see any messages that described the failed imports. It just runs and then at some point, it stops. Here is the import output: Terminal Saved Output.txt

The first time I ran this, it was logging the import line by line. Then I started over and re-ran it the second time, it didn't log line by line. It just logged what you see in the above file. Kinda random.

Here is a sample of the data structures used in both the stories and events collections: temp.ndjson.zip

nonoumasy avatar May 23 '23 16:05 nonoumasy

Hey @nonoumasy, is your Firebase project on the (free) Spark plan? You may be hitting some write limits if you're not on the Blaze plan (pay as you go).

I've been doing some testing on a Firebase project on the Blaze plan. I wrote a script to generate random data, and here is a sample of what I'm uploading - seed.ndjson.zip - fairly similar to your sample. This is what I was able to achieve with the library:

Test 1

  • Imported 10k documents in under 2 minutes.
  • Data size: 24 MB
  • Average document size: 2.4 KB

Test 2

  • Imported ~170k documents. Took 32 minutes.
  • Data size: 183 MB
  • Average document size: 1 KB

Test 3

  • Imported ~430k documents. Took 1h 20m.
  • Data size: 832 MB
  • Average document size: 1.9 KB
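To compare the three runs, the reported figures can be turned into rough throughput numbers. This is only a back-of-the-envelope sketch using the values quoted above; the `rates` helper is hypothetical, and Test 1 is entered as 2 minutes even though it took "under 2 minutes", so its rates are lower bounds.

```python
# (documents, minutes, megabytes) as reported in the three tests above.
# Test 1 took "under 2 minutes", so 2 is an upper bound on its duration.
TESTS = {
    "Test 1": (10_000, 2, 24),
    "Test 2": (170_000, 32, 183),
    "Test 3": (430_000, 80, 832),
}


def rates(documents, minutes, megabytes):
    """Return (documents per second, megabytes per minute)."""
    return documents / (minutes * 60), megabytes / minutes


for name, figures in TESTS.items():
    docs_per_s, mb_per_min = rates(*figures)
    print(f"{name}: {docs_per_s:.0f} docs/s, {mb_per_min:.1f} MB/min")
```

On these figures the document rate stays between roughly 83 and 90 documents per second across all three runs, while MB per minute varies with the average document size. A collection of a little over 10k documents would therefore be expected to finish in a couple of minutes on a comparable setup.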

benyap avatar May 26 '23 14:05 benyap

@benyap YES! It is a Spark plan. That makes sense. OK, I'll switch it to the Blaze plan and try again. I'll let you know. Thanks! Is there a way to force verbose output so I can see the files currently being processed? I saw it before, but now I don't see it anymore. I am using the verbose option. I'm not sure if it's stuck again. This is what it shows:

Screenshot 2023-05-26 at 8 27 39 PM

UPDATE: I tried importing again and let it run for 10 minutes, but it seems to be stopping again. I kept refreshing but didn't see the Firestore data changing at all. If you are able to upload that data in under 2 minutes, I think there is still an issue.

nonoumasy avatar May 26 '23 16:05 nonoumasy

How long after switching to the Blaze plan did you run this? You may need to wait a few hours. I got stuck when I was on the Spark plan, and for a little bit after switching too. But an hour or so after the switch, everything was fine and I was able to get the test results mentioned in my last post.

When --verbose is on, it should always print out a message if a batched write is successful - see here. If you're not seeing the log, I suspect it's stuck on something.

Perhaps if you don't mind emailing me the data you're trying to upload, I'll see if I can replicate your issue on my end (please remove any sensitive data).

benyap avatar May 27 '23 05:05 benyap

@benyap I tried again today (at least 12 hours later) and got the same result. It stopped at a file. Terminal Saved Output.txt

I'll email the data to you.

nonoumasy avatar May 27 '23 05:05 nonoumasy

@benyap do you think it has to do with cold start? I don't use that project at all except for tests. What you described also sounds like a 'cold start' issue. While Firestore itself might not experience it, Firebase Cloud Functions does.

nonoumasy avatar Jun 01 '23 07:06 nonoumasy

Hi @nonoumasy,

I don't think it's a cold start problem. The testing I mentioned in my previous comment was also done on a test project I made just to test Firestore.

Thanks for sending your file through. I've tried importing your data into my test project. I was able to get it to work successfully a few times (all uploaded in under 2 minutes). However, I also ran into the freeze you mentioned several times. It certainly seems very inconsistent and I don't have an answer as to why yet.

There's a chance it could have something to do with the content of your documents, as I notice there are some characters from other languages. I'll dig deeper and I'll let you know if I work anything out 😁

benyap avatar Jun 04 '23 00:06 benyap

Thanks @benyap, yes. My data has a few properties like description which are translated (into 7 other languages: ar, es, fr, ja, pt, ru, zh) into storyDescriptionTranslated and eventDescriptionTranslated (depending on whether it's the main story collection or the event collection). Cheers
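One way to probe the hunch about translated text is to list, per record, how large it is in UTF-8 bytes and how many non-ASCII characters it contains, then see whether the import stalls near the outliers. This is a rough sketch, not a diagnosis: it assumes one JSON record per line, and `summarize_record` and `scan` are hypothetical helpers, not part of firestore-backfire.

```python
import json


def summarize_record(line):
    """Return (utf8_bytes, non_ascii_chars) for one NDJSON line."""
    record = json.loads(line)  # raises ValueError on malformed JSON
    # Re-serialize without \u escapes so non-ASCII characters are counted as such.
    text = json.dumps(record, ensure_ascii=False)
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return len(text.encode("utf-8")), non_ascii


def scan(path):
    """Yield (line_number, utf8_bytes, non_ascii_chars) for each non-blank line."""
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip():
                yield (number, *summarize_record(line))
```

Sorting the output of `scan("hm.ndjson")` by byte size or by non-ASCII count would show whether the records around the point where the import stops differ from the rest.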

nonoumasy avatar Jun 05 '23 23:06 nonoumasy