nodejs-firestore Add option to only return document data


Is your feature request related to a problem? Please describe.

Often, it is known ahead of time that the document reference will not be needed and that only the document data is required. Fetching the entire document increases the size of the payload considerably, especially in collection queries when select() is used.

Describe the solution you'd like

There should be an option when calling get() on a query that makes it return only the data from each document.

Example:

// Proposed method
const userEmails = await firestore.collection('Users').select('Email').get({
  dataOnly: true
});

should have the same result as:

// Current method
const users = await firestore.collection('Users').select('Email').get();
const userEmails = users.docs.map((doc) => doc.data());

Sep 15 '19 07:09 jakeleventhal

@jakeleventhal Thanks for the feature request! This is an interesting feature, but honestly not something that you should expect in our clients soon. We currently retrieve the document names as part of the payload and use them internally as keys to various data structures. If possible, we would advise you to structure your data so that the names of the documents are only a small part of the total number of bytes transferred.

We can certainly look at other ways to optimize our payloads (such as data compression, which should help if you have long collection names). If you have other suggestions or more data showing why your document names make up a significant portion of your data transfer, please do let us know.

Sep 16 '19 16:09 schmidt-sebastian

Sorry, I don't think I was clear. I was referring to the metadata associated with each document.

For instance, I created a sample document with the following data (there's nothing else there):

{ "SampleField": "asdfasdf", "OtherField": "lkjlkjlkj" }

With my proposed feature, I would be fetching a fairly small amount of data. Document names aren't a problem, especially in comparison to the metadata that comes with a document. When fetching this document, I called:

const user = await firestore.doc('Users/pdUXDTKLa0rC4eeJly2K').get();
console.log(JSON.stringify(user));

However, the console statement shows that MUCH more data is actually fetched than just the two fields:

{
  "_ref": {
    "_firestore": {
      "_settings": {
        "libName": "gccl",
        "libVersion": "2.3.0"
      },
      "_settingsFrozen": true,
      "_serializer": {
        "timestampsInSnapshots": true
      },
      "_projectId": "my-project",
      "_lastSuccessfulRequest": 1568688345220,
      "_preferTransactions": false,
      "_clientPool": {
        "concurrentOperationLimit": 100,
        "activeClients": {}
      }
    },
    "_path": {
      "segments": ["Users","sampleid"],
      "projectId": "my-project",
      "databaseId": "(default)"
    }
  },
  "_fieldsProto": {
    "SampleField": {
      "stringValue": "asdfasdf",
      "valueType": "stringValue"
    },
    "OtherField": {
      "stringValue": "lkjlkjlkj",
      "valueType": "stringValue"
    }
  },
  "_serializer": {
    "timestampsInSnapshots": true
  },
  "_readTime": {
    "_seconds": 1568688345,
    "_nanoseconds": 183637000
  },
  "_createTime": {
    "_seconds": 1568688326,
    "_nanoseconds": 437609000
  },
  "_updateTime": {
    "_seconds": 1568688326,
    "_nanoseconds": 437609000
  }
}

Sep 17 '19 02:09 jakeleventhal

Thank you for clarifying! Most of the data you show in your snippet is actually not fetched from the backend but is instance state that we manage in the client (everything under _ref and _serializer is client state). Furthermore, _readTime is not retrieved per document but rather just once for each query. I hope that makes you feel a bit better!

For reference, this is the actual data that we send for each document: https://github.com/googleapis/googleapis/blob/master/google/firestore/v1/document.proto#L37
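
To illustrate the difference between the snapshot object and its fields, here is a minimal sketch. It does not contact Firestore: snapshotLike is a hypothetical, abbreviated stand-in shaped like the dump earlier in this thread, with a data() method returning the two fields. Neither number it prints is the size sent over the wire; it only shows how much of the JSON.stringify output is something other than the document's fields.

```javascript
// Hypothetical stand-in for a DocumentSnapshot (abbreviated from the dump
// earlier in this thread). This is NOT the real class; it only mimics the shape.
const snapshotLike = {
  _ref: {
    _firestore: { _projectId: 'my-project' },
    _path: { segments: ['Users', 'sampleid'], databaseId: '(default)' },
  },
  _fieldsProto: {
    SampleField: { stringValue: 'asdfasdf', valueType: 'stringValue' },
    OtherField: { stringValue: 'lkjlkjlkj', valueType: 'stringValue' },
  },
  _readTime: { _seconds: 1568688345, _nanoseconds: 183637000 },
  // Mirrors what doc.data() returns: just the decoded fields.
  data() {
    return { SampleField: 'asdfasdf', OtherField: 'lkjlkjlkj' };
  },
};

// Serializing the whole object includes client-side state (functions are skipped).
const snapshotJson = JSON.stringify(snapshotLike);
// Serializing data() includes only the fields.
const dataJson = JSON.stringify(snapshotLike.data());

console.log(dataJson);
// prints {"SampleField":"asdfasdf","OtherField":"lkjlkjlkj"}
console.log(`${dataJson.length} of ${snapshotJson.length} characters are field data`);
```

With a real snapshot, logging JSON.stringify(user.data()) instead of JSON.stringify(user) gives the same kind of fields-only view.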

There is certainly still some room for improvement, but unfortunately, it would be unwise for me to promise that we will tackle this any time soon.

Sep 17 '19 16:09 schmidt-sebastian

Yes, and even if there is actually a 0% reduction in payload size, it is still a nice feature since it's annoying to have to call .data() on every doc.
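
Until such an option exists, the repeated .data() call can be hidden behind a small wrapper. The following is only a sketch: getDataOnly is a hypothetical helper name, not part of the library, and it assumes nothing beyond the get(), docs, and data() calls already used in this thread. It does not change what is fetched from the backend.

```javascript
// Hypothetical helper (not part of @google-cloud/firestore).
// Accepts anything with a get() method that resolves to an object with a
// `docs` array, such as a query, and returns plain data objects.
async function getDataOnly(query) {
  const snapshot = await query.get();
  return snapshot.docs.map((doc) => doc.data());
}

// Usage, assuming `firestore` is an initialized Firestore client:
//   const userEmails = await getDataOnly(
//     firestore.collection('Users').select('Email')
//   );
```

Because the helper only relies on get(), it should work for collection references as well as queries built with select() or where().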

Sep 19 '19 03:09 jakeleventhal

bump

Sep 14 '21 13:09 jakeleventhal

We would need some more user feedback before scheduling this work.

Sep 15 '21 22:09 schmidt-sebastian
