cozy-data-system
Bulk update
I see there is an end-point for bulk delete, but not for bulk update. However, the db_remove_helper suggests that it shouldn't be too different to implement (a deletion actually seems to be an update, if I understand correctly). Is there any way we could have a bulk update end-point, please?
Here's an API I imagine:
PUT /request/:type/:req_name/update
Params:
  type: the doctype name
  req_name: the name of the request
Body {
  key: only update documents matching this key
  keys: [only update documents matching this array of keys]
  limit: maximum number of documents to update
  skip: number of documents to skip
  startKey: only update documents after this key
  endKey: only update documents before this key
  update: [[key, value], [key, value]] or [key, value]
}
The body is fully optional.
For instance, with
body = {
update: ['date', '1234']
}
Then the 'date' field of all matching documents should be updated to '1234'.
body = {
update: [['date', '1234'], ['bankAccount', '4321']]
}
Then the 'date' field of all matching documents is updated to '1234', and the 'bankAccount' field to '4321'.
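To make the proposal concrete, here is a minimal sketch of what a client call could look like if this end-point existed. The route and body follow the proposal above; the doctype name, request name, port and use of the request module are assumptions for the sake of the example.

  // Sketch only: the /request/:type/:req_name/update route does not exist yet.
  // Assumes the data-system listens on localhost:9101 (hypothetical here).
  var request = require('request');

  request.put({
    url: 'http://localhost:9101/request/bankoperation/all/update',
    json: {
      startKey: '2015-01-01',
      endKey: '2015-12-31',
      update: [['date', '1234'], ['bankAccount', '4321']]
    }
  }, function (err, res, body) {
    if (err) return console.error(err);
    console.log('updated documents:', body);
  });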
Not sure about the naming of update, maybe values or replaceBy would be more adequate.
Having the update as a map would be easier, I think:
PUT /request/file/all/byPath/update/
{
  startkey: "/some/path/",
  endkey: "/some/path/ZZZZ",
  update: {
    lastModification: 3554232,
    author: "John"
  }
}
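For what it's worth, a rough sketch of what the data-system side could do for this map variant: fetch the matched documents through the request's CouchDB view, merge the update map into each one, then write them all back in a single _bulk_docs call. The function and database names are made up; the _view and _bulk_docs end-points are standard CouchDB.

  // Sketch, not the actual data-system code.
  var request = require('request');

  function bulkUpdate(db, designDoc, viewName, params, update, callback) {
    var viewUrl = db + '/_design/' + designDoc + '/_view/' + viewName;
    request.get({url: viewUrl, qs: {
      startkey: JSON.stringify(params.startkey),
      endkey: JSON.stringify(params.endkey),
      include_docs: true
    }, json: true}, function (err, res, body) {
      if (err) return callback(err);
      var docs = body.rows.map(function (row) {
        // Merge the update map into each matched document.
        Object.keys(update).forEach(function (field) {
          row.doc[field] = update[field];
        });
        return row.doc;
      });
      // One round-trip to CouchDB for all the writes.
      request.post({url: db + '/_bulk_docs', json: {docs: docs}}, callback);
    });
  }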
Or, more powerful (and similar performance: 1 request app->DS, 2 requests DS->Couch), a function:
PUT /request/file/all/byPath/update/
{
  startkey: "/some/path/",
  endkey: "/some/path/ZZZZ",
  update: "function(doc){
    doc.tags = doc.tags.map(function(tag){
      if(tag === 'old-name') return 'new-name';
      else return tag;
    });
    return doc;
  }"
}
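Applying such a serialized function on the data-system side could look roughly like this (a sketch only; the sandboxing is deliberately simplistic, and running arbitrary code coming from requests would obviously need more care):

  // Sketch: turn the serialized string back into a function and run it over
  // the fetched docs before writing them back with _bulk_docs.
  var vm = require('vm');

  function applyUpdateFunction(fnString, docs) {
    // Evaluate the string in a bare context so it cannot touch our globals.
    var updateFn = vm.runInNewContext('(' + fnString + ')', {});
    return docs.map(function (doc) {
      return updateFn(doc);
    });
  }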
I agree this would be convenient and might improve the performance of some cozy apps. Not sure when we will have the time to do it, though.
Yes, a map would indeed be better (I wonder how I could have missed this). I'll try to add this feature if I get some time soon; this could prove useful for Kresus.
:+1: This would be useful to import larger time series (for example, 15 years of currency data for ~10 different currencies is north of 25000 values).
EDIT: Actually, bulk insert would be useful for that, which also shouldn't be very different from bulk delete / update.
FYI, according to this post about couchdb performance:
- Inserting docs one-by-one happens at ~260 docs/sec,
- Inserting docs in bulk (groups of 10000 docs) happens at ~2700 docs/sec,
- With even more optimizations (e.g. bypassing HTTP and JSON conversions) it can go higher than 10500 docs/sec.
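For reference, the bulk path those numbers refer to is CouchDB's standard _bulk_docs end-point. A minimal sketch of using it from Node (the database name, URL and doctype are made up for the example):

  // Sketch: insert many docs in one round-trip instead of one PUT per document.
  var request = require('request');

  var docs = [];
  for (var i = 0; i < 10000; i++) {
    docs.push({docType: 'CurrencyRate', currency: 'EUR', value: 1.1 + i * 1e-6});
  }

  request.post({
    url: 'http://localhost:5984/cozy/_bulk_docs',  // assumed database name
    json: {docs: docs}
  }, function (err, res, body) {
    if (err) return console.error(err);
    console.log('inserted', body.length, 'documents');
  });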
@jankeromnes, one thing to keep in mind is that cozy has a lot of views in couchdb, so adding many documents slows down the whole cozy. We already had some trouble with emails when importing large accounts.
So for your currency use case (i.e. never-changing values), the performance-optimal solution would be to have huge documents with, say, all the data for a month, and then use views to produce the index you need:
function (doc) {
  if (doc.docType === 'MonthOfCurrencyExchange') {
    for (var date in doc.dates) {
      for (var currency in doc.dates[date]) {
        emit([date, currency], doc.dates[date][currency]);
      }
    }
  }
}
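For clarity, such a view assumes a document shaped roughly like this (the field names are only an illustration matching the map function above):

  // Hypothetical 'MonthOfCurrencyExchange' document: one doc per month,
  // with all the rates for that month nested inside it.
  var monthDoc = {
    docType: 'MonthOfCurrencyExchange',
    dates: {
      '2015-06-01': { EUR: 1.12, GBP: 1.57 },
      '2015-06-02': { EUR: 1.11, GBP: 1.56 }
    }
  };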
Interesting idea, thanks @aenario! Actually it might even be faster to store all my values in one single document, since the number of documents seems to be the bottleneck. I'll try different options.