sync_gateway
sync_gateway copied to clipboard
Refactor document code, and avoid [un]marshaling doc bodies
CBG-2951
Moved most of the Document, Revision and RevTree code, and a lot of Attachment stuff, out of the db package into a new document
package. (I'm not wild about that name, but doc
would conflict with the very common parameter name doc
, and docs
is already taken as a directory.) Two reasons for this:
- Creates a privacy boundary and lets these structs, esp.
Document
, protect their private fields from being groped by other code in the db package. There's a fair bit of this going on with the_body
and_rawBody
fields.) - The
db
package is really big and it would be nice to have smaller packages, IMHO, just for readability and code navigation.
To reduce the number of diffs, I created a file db/doc_model.go
that creates aliases for many of the types and constants that were moved. I also added an alias for the function ParseRevID
because there are so damn many calls to it.
To cope with the code in db that touched Document's _body
and _rawBody
fields, I added methods PeekBody
, PokeBody
, PeekRawBody
, PokeRawBody
. These can be removed after the code that calls them is fixed.
There are no functional changes in this PR; everything should behave exactly as before.
Pre-review checklist
- [x] Removed debug logging (
fmt.Print
,log.Print
, ...) - [x] Logging sensitive data? Make sure it's tagged (e.g.
base.UD(docID)
,base.MD(dbName)
) - [x] Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in
docs/api
Dependencies (if applicable)
- [x] Link upstream PRs
- [x] Update Go module dependencies when merged
Integration Tests
- [ ]
GSI=true,xattrs=true
https://jenkins.sgwdev.com/job/SyncGateway-Integration/000/
@snej You indicate that there are no functional changes in this PR - how does that apply to the changes around unmarshalling document bodies? Can you summarize the changes being made to that functionality?
@adamcfraser In this PR the document bodies are still unmarshaled when passed to JS (Otto). That will change with the V8 PR, at least when V8 is enabled; there we pass the JSON as a string and unmarshal it in JavaScript which is more efficient.
Otherwise, incoming doc bodies are still validated to make sure they don't contain illegal/reserved properties, but that's done without unmarshaling. The JSON does get parsed, but without converting it to Go objects which is the expensive part.
Created a perf testing task CBPS-1176 for this.
Perf results show a decent ~5% improvement in throughput (and accompanying reduction in CPU) - we should go ahead and review/merge based on this.