firestore: iterator timeout issue with non-existent document reference

Open lipchyk opened this issue 1 year ago • 1 comments

Client

cloud.google.com/go/firestore v1.14.0

Go Environment

go version go1.21.5 darwin/arm64

Code

e.g.

func getAllDocuments(ctx context.Context, collection *firestore.CollectionRef) error {
	iter := collection.Documents(ctx)

	for {
		doc, err := iter.Next()
		if err == iterator.Done {
			break
		}

		if err != nil {
			return fmt.Errorf("error getting document: %w", err)
		}

		fmt.Println(doc.Ref.ID)
	}

	return nil
}

func getRefsOnly(ctx context.Context, collection *firestore.CollectionRef) error {
	iter := collection.DocumentRefs(ctx)

	for {
		ref, err := iter.Next()
		if err == iterator.Done {
			break
		}

		if err != nil {
			return fmt.Errorf("error getting document ref: %w", err)
		}

		fmt.Println(ref.ID)
	}

	return nil
}

Expected behavior

I'm able to iterate through all documents and print their IDs, or I get a Go error that clearly shows a document is missing.

Actual behavior

I wrote a simple Go program that iterates over all documents in a Firestore collection and prints their IDs. With the Documents iterator (getAllDocuments), the program consistently gets stuck and times out when it reaches the same document ID each time, a document that appears to be non-existent according to Firestore Studio. With the getRefsOnly function the iteration works well, and that is exactly how I obtained the ref ID of the missing document. I'm not sure how to reproduce this issue, that is, how to create a missing document that getRefsOnly still returns.

Screenshots

image

lipchyk avatar Feb 16 '24 12:02 lipchyk

triaged labels, feel free to change

noahdietz avatar Feb 21 '24 00:02 noahdietz

Looking into this.

bhshkh avatar Mar 14 '24 06:03 bhshkh

I created the following:

cities collection -> PUN document -> sub-cities collection -> sub-city-doc-1 document

Screenshot 2024-03-15 at 6 20 10 PM Screenshot 2024-03-15 at 6 20 25 PM

Then I ran the code below:

	docRef := client.Collection("cities").Doc("PUN")
	_, err = docRef.Delete(ctx)
	if err != nil {
		fmt.Printf("Delete: %v\n", err)
	}

Now the PUN document shows the message "There's no document in this path yet" (see the screenshot below):

Screenshot 2024-03-15 at 6 21 26 PM

In this state, when I run getAllDocuments, I don't see any errors, but the PUN document is not listed. getRefsOnly still lists it:

********* getAllDocuments *********
BJ
DC
LA
SF
TOK
********* getRefsOnly *********
BJ
DC
LA
PUN
SF
TOK

Now, if the subcollection sub-cities is deleted, the PUN document looks like the screenshot below. This is the same state that the reporter is seeing:

Screenshot 2024-03-15 at 6 22 05 PM

Even now, when I run getAllDocuments and getRefsOnly, I don't see any panic, timeout, or error, and neither function lists PUN:

********* getAllDocuments *********
BJ
DC
LA
SF
TOK

********* getRefsOnly *********
BJ
DC
LA
SF
TOK

After refreshing Pantheon, the document itself is no longer displayed. It's unclear how the reporter is still able to see the document reference in the screenshot attached to the report.

Screenshot 2024-03-15 at 6 34 29 PM

bhshkh avatar Mar 15 '24 13:03 bhshkh

Please try the following workarounds:

  1. Delete the problematic document reference from Pantheon
  2. If deletion is not an option, read the document references using collection.DocumentRefs(ctx) and then call Get on each reference

bhshkh avatar Mar 15 '24 13:03 bhshkh