
A more user-friendly API

Open download13 opened this issue 7 years ago • 6 comments

Is anyone here opposed to maybe adding a more user-friendly API to this library?

Right now it takes a fair amount of boilerplate to do what is essentially JSON.parse(encoded) or JSON.stringify(obj).

Plus, there doesn't seem to be a way to get the same objects out on the decode end as were put into the encoding step. This means even more boilerplate code to translate certain fields from numbers into enum strings. Considering this is probably the most common use case, it seems worth it to make this a simple one-step call like buffer = encode(schema, obj) and obj = decode(schema, buffer). And preferably, the input to encode and the output of decode should have deep equality.
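
For reference, here is roughly what that boilerplate looks like today (a sketch assuming a type called Example generated by pbf's compile step, with made-up fields):

const Pbf = require('pbf');
const {Example} = require('./example.js'); // generated by pbf's compile step

const obj = {name: 'hello', id: 1}; // hypothetical fields

// encode
const pbf = new Pbf();
Example.write(obj, pbf);
const buffer = pbf.finish();

// decode
const decoded = Example.read(new Pbf(buffer));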

download13 avatar Jan 10 '18 01:01 download13

@download13 Thanks for sharing your thoughts! This library has a focus on performance and makes sacrifices in other areas such as usability. However, I agree that there is definitely room for improvement in ergonomics! I'll try to add my thoughts one at a time.

it seems worth it to make this a simple one-step call like buffer = encode(schema, obj)

One of the things this library does to make things fast is generate code for serialization instead of using a generic provider. This allows the JS engine to perform optimizations such as inlining and limits the amount of branching.
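
For example, the generated write function for a message with a string field and an int field looks roughly like this (a simplified sketch with made-up field names; the real output also handles defaults, nested messages, and so on):

Example.write = function (obj, pbf) {
    if (obj.name) pbf.writeStringField(1, obj.name);
    if (obj.id) pbf.writeVarintField(2, obj.id);
};

Because the fields and tags are baked in, there is no per-call schema interpretation.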

Unfortunately, compiling is quite expensive and shouldn't be performed on every call. This makes your version of encode and decode difficult to achieve.

However, there is probably some improvement that could be made to simplify write calls. Currently, pbf objects are passed in to allow object pooling of the buffers. If we added a default value and returned pbf from write calls (allowing chaining), encoding could be much simpler.

Type.write = function (obj, pbf = new Pbf()) {
	/* normal generated code */

	return pbf;
};

const encoded = Type.write(obj).finish();

This means even more boilerplate code to translate certain fields from numbers into enum strings

This is definitely one of the biggest sources of boilerplate, and it would be great to translate these on the way in and out. We would probably want this to be optional to avoid the branching and string allocations for use cases that do not need it.
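
As a rough sketch of what opt-in translation could look like inside the generated write code (GeomType and the type field are made-up names, mirroring the lookup tables the compiler already emits):

// hypothetical lookup tables emitted by compile
var GeomType = {POINT: 1, LINESTRING: 2};
var GeomTypeNames = {1: 'POINT', 2: 'LINESTRING'};

// on the way in: accept the string form and translate it to the int
var type = typeof obj.type === 'string' ? GeomType[obj.type] : obj.type;
pbf.writeVarintField(3, type);

// on the way out (in read): translate the int back to its name
obj.type = GeomTypeNames[obj.type];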

And preferably, the input to encode and the output of decode should have deep equality.

The primary reason for this is defaults as defined by the protobuf standard. This is also mentioned in https://github.com/mapbox/pbf/issues/80. If compiling had a flag to opt out of default values, this could be achieved.
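
To illustrate (again assuming a compiled Example type, here with an optional int32 count field):

var pbf = new Pbf();
Example.write({name: 'a'}, pbf);              // count omitted on input
var decoded = Example.read(new Pbf(pbf.finish()));
// decoded comes back as {name: 'a', count: 0} because read fills in the
// protobuf default, so deep equality with the input fails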

Lastly, if you are looking to simplify the compilation process, take a look at pbf-loader for webpack.

kjvalencik avatar Jan 10 '18 16:01 kjvalencik

I should've been clearer with the buffer = encode(schema, obj) thing. I meant schema to be the compiled form (Type), but I see how that was misleading. I promise I wasn't suggesting re-compilation on every call :P

If Pbf were to be referenced inside the compiled type, it seems like that would require importing the pbf module by some means. Some other protobuf libraries have dealt with this by having compilation options that let you specify how (or if) you want importing to be done (e.g. require / ES6 imports / global var / AMD).

This would probably add a little bit of complexity to the compile code (though that may not matter as much since compile.js probably isn't being bundled and sent to the client in most cases).

The simplest way I can see of getting one-call encoding and decoding is to just add the following code to index.js:

Pbf.encode = function(Type, obj) {
  var pbf = new Pbf();
  Type.write(obj, pbf);
  return pbf.finish();
};

Pbf.decode = function(Type, buffer) {
  return Type.read(new Pbf(buffer));
};

This wouldn't affect the compile step at all, and wouldn't break anything that people are already using. It also only adds about 40 bytes to index.js when minified and gzipped. Then for anyone who doesn't care about doing custom encoding/decoding it's as simple as:

const {encode, decode} = require('pbf');
const {Example} = require('./example.js');

var obj = { /* ... */ };
var buffer = encode(Example, obj);
var decoded = decode(Example, buffer);

I think I have an idea for how to add enum translation pretty cheaply, but I'll leave it as a separate comment for the sake of organization.

download13 avatar Jan 10 '18 21:01 download13

Looking through the code for writing an object into a buffer, I can't actually find the part where enum translation is done.

I ran some sample code through a debugger and it appears that the enum strings in the input object are never translated into their integer counterparts. It looks like they get passed directly to writeVarint which just changes them to 0 if they're not already a number.
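
For example, with a hypothetical message type Feature that has an enum type field:

var pbf = new Pbf();
Feature.write({type: 'POINT'}, pbf);   // 'POINT' is a string, not a number
// the generated code hands 'POINT' to pbf.writeVarintField(...),
// and writeVarint coerces it to 0
var decoded = Feature.read(new Pbf(pbf.finish()));
// decoded.type comes back as 0, the enum's default value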

Am I missing something? It looks like every input enum value is just turned directly into the default value for that enum (0).

Are input objects supposed to be manually translated as well? I was under the impression it was only output objects that needed to have the strings added back to them.

download13 avatar Jan 10 '18 23:01 download13

Are input objects supposed to be manually translated as well?

Yes. Enums are only ever treated as int in pbf. This is contrary to what protobufjs does.
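
So the translation is manual on both sides, roughly like this (Tile.Feature here is just an example type with a GeomType type field):

var pbf = new Pbf();
Tile.Feature.write({type: Tile.GeomType.POINT}, pbf);   // pass the int in

var msg = Tile.Feature.read(new Pbf(pbf.finish()));
// msg.type is an int; map it back to a name yourself if you need the string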

kjvalencik avatar Jan 10 '18 23:01 kjvalencik

Ah okay. So those enum lookup tables in the generated code are purely for the benefit of the programmer and not used by some other existing code. That makes more sense.

download13 avatar Jan 10 '18 23:01 download13

Yep, so you can write something like:

switch (msg.type) {
    case Tile.GeomType.POINT:
    case Tile.GeomType.LINESTRING:
        // handle point and linestring geometries
        break;
}

kjvalencik avatar Jan 10 '18 23:01 kjvalencik