firebase-rest-api

[Bug]: list_of_documents does not fetch all collection entries

Open flashsturz opened this issue 2 years ago • 6 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

-   OS: Win11
-   Python: 3.9.16

What happened?

When using list_of_documents() to get all entries from a collection, only a subset of the collection seems to be returned. A look at the method list_documents() in collection.py of the google_cloud_firestore library supports this: list_documents takes an optional "page_size" argument, which sets the maximum number of documents returned per page of results. This is set per default by the API, but it is not certain that this command returns all available documents in the collection (e.g. if you have several hundred documents.)

I currently do not know how to work around this issue. Any ideas?

Code Snippet

# I used the following command to fetch all documents of a large collection:

active_ids = fsdb.collection("active_files").list_of_documents(self.user["idToken"])
# Returns only a subset of all documents in "active_files" that are visible in the Firestore console.

Relevant log output

No response

Anything else?

No response

flashsturz, Mar 23 '23 13:03

@flashsturz Does the issue also occur when you're using service account credentials? With a service account, the google_cloud_firestore library is used, whereas user-based access goes through an API endpoint. The code snippet provided uses the API endpoint, while the description mentions the official library.

AsifArmanRahman, Mar 28 '23 21:03

I use a user-based account and can confirm that it uses the API endpoint to get the documents. Sorry about the confusion. Is there a parameter that we can pass with the request header in list_of_documents() to specify that we want to fetch all available documents from req_ref?

flashsturz, Mar 29 '23 06:03

@flashsturz The method is supposed to return all the documents, but the API endpoint wasn't set properly. This will be fixed in the next version.

This is set per default by the API, but it is not certain that this command returns all available documents in the collection (e.g. if you have several hundred documents.)

Were you able to find out the default value used in the official library? I wonder if it's a constant number, or if it changes based on something the API might determine.

AsifArmanRahman, Apr 06 '23 17:04

Unfortunately, I do not know how the default value is chosen, since the documentation of the firestore library is not exactly stating how it is set:

page_size (Optional[int]]): The maximum number of documents in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API.

(From list_documents in collection.py in the firestore_v1 library)

flashsturz, Apr 07 '23 10:04

Sorry to get back to this so late. From what I checked, the REST API for Firestore doesn't return all documents at once; it returns a next page token, which has to be used to retrieve the following pages. That means the library will have to make multiple requests before it can provide a result containing all documents, and with a large number of documents this might take a long time. I could enable passing a page_size argument, so that in cases where the number of docs is expected to be large, the dev can set a larger page size. I'll set the default page size to 20, as is tradition.

Does it seem fine? @flashsturz

AsifArmanRahman, May 17 '23 04:05
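
For readers following along, the next-page-token behaviour described above can be sketched roughly as below. This is a minimal illustration, not the code of firebase-rest-api: the names iter_all_documents and make_rest_fetcher are hypothetical, and it assumes the Firestore REST list endpoint with its pageSize and pageToken query parameters, a nextPageToken field in the response, and the default database "(default)".

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, Optional

# A page fetcher takes the previous nextPageToken (or None for the first
# page) and returns one decoded JSON response as a dict.
PageFetcher = Callable[[Optional[str]], Dict]


def iter_all_documents(fetch_page: PageFetcher) -> Iterator[Dict]:
    """Yield every document by following nextPageToken until it is absent."""
    page_token = None
    while True:
        page = fetch_page(page_token)
        # An empty collection or empty last page may omit "documents".
        yield from page.get("documents", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            return


def make_rest_fetcher(
    project_id: str, collection: str, id_token: str, page_size: int = 20
) -> PageFetcher:
    """Hypothetical helper: build a fetcher for the Firestore REST list endpoint."""
    base_url = (
        "https://firestore.googleapis.com/v1/projects/"
        f"{project_id}/databases/(default)/documents/{collection}"
    )

    def fetch_page(page_token: Optional[str]) -> Dict:
        params = {"pageSize": page_size}
        if page_token:
            params["pageToken"] = page_token
        request = urllib.request.Request(
            f"{base_url}?{urllib.parse.urlencode(params)}",
            headers={"Authorization": f"Bearer {id_token}"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return fetch_page
```

With such a loop, something like list(iter_all_documents(make_rest_fetcher(project_id, "active_files", id_token, page_size=100))) would collect the whole collection at the cost of one request per page. A larger page_size only reduces the number of round trips; the loop does not depend on all documents fitting on one page.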

Sorry for the delayed response from my side. Yes, I think it would be good to give the user the ability to set the page_size argument. Do you know how large page_size can be, i.e. whether there is an upper limit? Or is it possible to specify a very large number to ensure that all documents fit on one page?

flashsturz, May 23 '23 14:05