sync_gateway icon indicating copy to clipboard operation
sync_gateway copied to clipboard

Refactor document code, and avoid [un]marshaling doc bodies

Open snej opened this issue 1 year ago • 4 comments

CBG-2951

Moved most of the Document, Revision and RevTree code, and a lot of Attachment stuff, out of the db package into a new document package. (I'm not wild about that name, but doc would conflict with the very common parameter name doc, and docs is already taken as a directory.) Two reasons for this:

  1. Creates a privacy boundary and lets these structs, esp. Document, protect their private fields from being groped by other code in the db package. There's a fair bit of this going on with the _body and _rawBody fields.)
  2. The db package is really big and it would be nice to have smaller packages, IMHO, just for readability and code navigation.

To reduce the number of diffs, I created a file db/doc_model.go that creates aliases for many of the types and constants that were moved. I also added an alias for the function ParseRevID because there are so damn many calls to it.

To cope with the code in db that touched Document's _body and _rawBody fields, I added methods PeekBody, PokeBody, PeekRawBody, PokeRawBody. These can be removed after the code that calls them is fixed.

There are no functional changes in this PR; everything should behave exactly as before.

Pre-review checklist

  • [x] Removed debug logging (fmt.Print, log.Print, ...)
  • [x] Logging sensitive data? Make sure it's tagged (e.g. base.UD(docID), base.MD(dbName))
  • [x] Updated relevant information in the API specifications (such as endpoint descriptions, schemas, ...) in docs/api

Dependencies (if applicable)

  • [x] Link upstream PRs
  • [x] Update Go module dependencies when merged

Integration Tests

  • [ ] GSI=true,xattrs=true https://jenkins.sgwdev.com/job/SyncGateway-Integration/000/

snej avatar Feb 22 '23 23:02 snej

@snej You indicate that there are no functional changes in this PR - how does that apply to the changes around unmarshalling document bodies? Can you summarize the changes being made to that functionality?

adamcfraser avatar May 16 '23 22:05 adamcfraser

@adamcfraser In this PR the document bodies are still unmarshaled when passed to JS (Otto). That will change with the V8 PR, at least when V8 is enabled; there we pass the JSON as a string and unmarshal it in JavaScript which is more efficient.

Otherwise, incoming doc bodies are still validated to make sure they don't contain illegal/reserved properties, but that's done without unmarshaling. The JSON does get parsed, but without converting it to Go objects which is the expensive part.

snej avatar May 16 '23 22:05 snej

Created a perf testing task CBPS-1176 for this.

snej avatar Jun 05 '23 17:06 snej

Perf results show a decent ~5% improvement in throughput (and accompanying reduction in CPU) - we should go ahead and review/merge based on this.

adamcfraser avatar Jul 11 '23 17:07 adamcfraser