vtprotobuf icon indicating copy to clipboard operation
vtprotobuf copied to clipboard

ResetVT does not work with oneof fields

Open codesoap opened this issue 1 year ago • 2 comments

Problem

When using pools, the ResetVT method is able to reuse allocated memory of bytes. It looks like this with a minimal example:

syntax = "proto2";
package TEST;
option go_package = "./testpkg";

message Foo {
	required bytes raw = 1;
}
func (m *Foo) ResetVT() {
	if m != nil {
		f0 := m.Raw[:0]
		m.Reset()
		m.Raw = f0
	}
}

When the bytes are inside a oneof field, the memory is no longer reused:

syntax = "proto2";
package TEST;
option go_package = "./testpkg";

message Foo {
	oneof data {
		bytes raw = 1;
		bytes zlib_data = 2;
	}
}
func (m *Foo) ResetVT() {
	if m != nil {
		m.Reset()
	}
}

Use Case

This is a real problem for me when parsing open streetmap's PBF files, because they use this construct for large blobs of data that occur hundreds to thousands of times in a PBF file: https://github.com/openstreetmap/OSM-binary/blob/65e7e976f5c8e47f057a0d921639ea8e6309ef06/osmpbf/fileformat.proto#L38

Maybe it would have been better if they used a single bytes field and a separate type field, but I'm afraid the design is set in stone as the format is already widely used.

Suggested Goal

Unfortunately I don't understand the code well enough to provide a pull request, but this is how I would imagine the generated code to look like in order to fix the problem:

func (m *Foo) ResetVT() {
	if m != nil {
		var f0 isFoo_Data
		switch t := m.Data.(type) {
		case *Foo_Raw:
			t.Raw = t.Raw[:0]
			f0 = t
		case *Foo_ZlibData:
			t.ZlibData = t.ZlibData[:0]
			f0 = t
		}
		m.Reset()
		m.Data = f0
	}
}

Inside the UnmarshalVT method, the code would look something like this; As a first step I'm only reusing the memory, if the Data field previously had the same type:

...
			var v []byte
			if m.Data == nil {
				v = make([]byte, postIndex-iNdEx)
			} else {
				switch t := m.Data.(type) {
				case *Foo_Raw:
					if cap(t.Raw) < postIndex-iNdEx {
						v = make([]byte, postIndex-iNdEx)
					} else {
						v = t.Raw[:postIndex-iNdEx]
					}
				default:
					v = make([]byte, postIndex-iNdEx)
				}
			}
			copy(v, dAtA[iNdEx:postIndex])
			m.Data = &Foo_Raw{Raw: v}
...

codesoap avatar Oct 03 '24 09:10 codesoap

This suggested generated code seems reasonable to me, but I do feel like it's gonna be tough to implement. We can merge this improvement if you get it working, and I think you're in the right direction. The right way to tackle this is starting with the generated code you're expecting to see and then tweaking the codegen until you get the desired results. :)

vmg avatar Oct 07 '24 13:10 vmg

I'm afraid my pain is not high enough to attempt implementing the feature at the moment. Maybe I will find more motivation in the future. I'll leave it to you to close this issue for now or keep it open.

codesoap avatar Oct 20 '24 06:10 codesoap