protocolbuffers.github.io icon indicating copy to clipboard operation
protocolbuffers.github.io copied to clipboard

repeated_ptr_field.h -- document lifetime of pointers and references

Open deustis opened this issue 9 months ago • 4 comments

In STL containers that store value elements (e.g. vector<Bar> as opposed to vector<Bar*>), element pointers are invalidated by operations such as adding or removing elements from the container.

The generated C++ code for repeated message fields returns a container of value elements: https://protobuf.dev/reference/cpp/cpp-generated/#repeatedmessage

const RepeatedPtrField<Bar>& bar()
RepeatedPtrField<Bar>* mutable_bar()

However, based on the "Ptr" in the name, and looking at the current implementation, it seems that the intention is for RepeatedPtrField to behave more like vector<Foo*>, where it is safe to hold on to pointers to the underlying messages.

For example, is this safe?:

message Bar {
  string baz = 1;
}
message Foo {
  repeated Bar bar = 1;
}
...

std::string do_something(Foo *foo) {
  // Get a reference to first element of bar
  const Bar& bar = *foo->bar()->begin();

  // Add a bar to foo
  foo->add_bar();

  // Do something with the first bar reference
  return bar.baz();
}

I'm not finding clear documentation on this behavior though. Assuming it is meant to be safe to hold on to these references/pointers, could this please be documented?

Related stack overflow making the same assumption: https://stackoverflow.com/questions/33219022/do-pointers-to-items-of-a-repeated-gpb-field-stay-valid-if-the-field-is-modified

deustis avatar Mar 17 '25 16:03 deustis

I am checking with the SWEs on the team to get clarity on this issue.

Logofile avatar Mar 20 '25 14:03 Logofile

I asked around on the team and unfortunately there is disagreement on what the answer should be to this question.

As of today, the implementation does preserve pointer stability (with the exception of some methods like Swap, AddAllocated, and ReleaseLast that may need to copy elements on/off/between arenas). But it is not clear whether we want to promise this guarantee. On one hand, there's probably a lot of code that already relies on this behavior and would be an obstacle to changing it. On the other hand, there may be some valuable optimizations we could make by breaking pointer stability.

So I think for now this is a bit of a grey area. You could rely on pointer stability and it will probably work at least in the short term, but there's no explicit guarantee and there's a possibility that it will break at some point in the future.

acozzette avatar Mar 24 '25 15:03 acozzette

I'm keeping this issue open until I have a chance to update the documentation with the information Adam shared.

Logofile avatar May 16 '25 21:05 Logofile

There's probably a lot of code that already relies on this behavior and would be an obstacle to changing it

Yes, I think it's safe to assume this would be a massively breaking change. :)

I'll watch for the updated documentation, thanks!

deustis avatar May 16 '25 21:05 deustis