nodejs-firestore
Add option to only return document data
Is your feature request related to a problem? Please describe.
Often, it is known ahead of time that the document reference will not be needed and that only the document data is required. Fetching the entire document increases the payload size considerably, especially in collection queries when select() is used.
Describe the solution you'd like
There should be an option when calling get() on a query that would allow it to only return the data from the document.
Example:
// Proposed method
const userEmails = await firestore.collection('Users').select('Email').get({
  dataOnly: true
});
should have the same result as:
// Current method
const users = await firestore.collection('Users').select('Email').get();
const userEmails = users.docs.map((doc) => doc.data());
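Until such an option exists, the current method can be wrapped once and reused. The following is only a minimal sketch: `getDataOnly` is a hypothetical helper name, not part of the Firestore client, and it assumes only that the query's `get()` resolves to a snapshot with a `docs` array whose entries expose `data()`, as in the current method above.

```javascript
// Hypothetical helper (not part of @google-cloud/firestore).
// Runs the query and returns only the plain data object of each document.
async function getDataOnly(query) {
  const snapshot = await query.get();
  return snapshot.docs.map((doc) => doc.data());
}

// Intended usage, assuming `firestore` is an initialized Firestore client:
// const userEmails = await getDataOnly(firestore.collection('Users').select('Email'));
```

This does not change what is transferred from the backend; it only hides the `.data()` mapping step.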
@jakeleventhal Thanks for the feature request! This is an interesting feature, but honestly not something that you should expect in our clients soon. We currently retrieve the document names as part of the payload and use them internally as a key to various data structures. If possible, we would advise you to structure your data so that the names of the documents are only a small part of the total number of bytes transferred.
We can certainly look at other ways to optimize our payloads (such as data compression, which should help if you have long collection names). If you have other suggestions or more data to let us know why your document names make up a significant portion of your data transfer, please do let us know.
Sorry, I don't think I was clear. I was referring to the metadata associated with each document.
For instance, I created a sample document with the following data (there's nothing else there):
{ "SampleField": "asdfasdf", "OtherField": "lkjlkjlkj" }
With my proposed feature, I would be fetching a fairly small amount of data. Document names aren't a problem, especially in comparison to the metadata that comes with a document. When fetching this document, I called:
const user = await firestore.doc('Users/pdUXDTKLa0rC4eeJly2K').get();
console.log(JSON.stringify(user));
However, the console statement shows that MUCH more data is actually fetched than just the two fields:
{
  "_ref": {
    "_firestore": {
      "_settings": {
        "libName": "gccl",
        "libVersion": "2.3.0"
      },
      "_settingsFrozen": true,
      "_serializer": {
        "timestampsInSnapshots": true
      },
      "_projectId": "my-project",
      "_lastSuccessfulRequest": 1568688345220,
      "_preferTransactions": false,
      "_clientPool": {
        "concurrentOperationLimit": 100,
        "activeClients": {}
      }
    },
    "_path": {
      "segments": ["Users", "sampleid"],
      "projectId": "my-project",
      "databaseId": "(default)"
    }
  },
  "_fieldsProto": {
    "SampleField": {
      "stringValue": "asdfasdf",
      "valueType": "stringValue"
    },
    "OtherField": {
      "stringValue": "lkjlkjlkj",
      "valueType": "stringValue"
    }
  },
  "_serializer": {
    "timestampsInSnapshots": true
  },
  "_readTime": {
    "_seconds": 1568688345,
    "_nanoseconds": 183637000
  },
  "_createTime": {
    "_seconds": 1568688326,
    "_nanoseconds": 437609000
  },
  "_updateTime": {
    "_seconds": 1568688326,
    "_nanoseconds": 437609000
  }
}
Thank you for clarifying! Most of the data you show in your snippet is actually not fetched from the backend but is instance state that we manage in the client (everything under _ref and _serializer is client state). Furthermore, _readTime is not retrieved per document but rather just once for each query. I hope that makes you feel a bit better!
For reference, this is the actual data that we send for each document: https://github.com/googleapis/googleapis/blob/master/google/firestore/v1/document.proto#L37
There is certainly still some room for improvement, but unfortunately, it would be unwise for me to promise that we will tackle this any time soon.
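To illustrate the difference between serializing the snapshot object and serializing only its fields, here is a rough sketch. `FakeSnapshot` is a hypothetical stand-in written for this example, not the real DocumentSnapshot class; it only mimics the idea that the instance carries client-side state (such as `_ref`) next to the stored fields, while `data()` returns the fields alone.

```javascript
// Hypothetical stand-in for a document snapshot (NOT the real Firestore class).
class FakeSnapshot {
  constructor(ref, fields) {
    this._ref = ref;       // client-side state, never part of the stored document
    this._fields = fields; // the stored document fields
  }

  // Returns a plain copy of the stored fields only.
  data() {
    return { ...this._fields };
  }
}

const snapshot = new FakeSnapshot(
  { _path: { segments: ['Users', 'sampleid'], databaseId: '(default)' } },
  { SampleField: 'asdfasdf', OtherField: 'lkjlkjlkj' }
);

// Serializing the instance includes the client state as well.
const wholeObject = JSON.stringify(snapshot);
// Serializing data() includes only the document fields.
const fieldsOnly = JSON.stringify(snapshot.data());

console.log(wholeObject.length, fieldsOnly.length);
console.log(fieldsOnly);
```

With a real snapshot, logging `JSON.stringify(user.data())` instead of `JSON.stringify(user)` should give a closer picture of the document content itself, although it still says nothing exact about the bytes sent over the wire.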
Yes, and even if there is actually a 0% reduction in payload size, it is still a nice feature since it's annoying to have to call .data() on every doc.
bump
We would need some more user feedback before scheduling this work.