Optimized Python NumPy Bindings for Repeated Fields
What language does this apply to? Python
Describe the problem you are trying to solve. Fast Protobuf to NumPy array conversions
Describe the solution you'd like
We have an internal design proposal approved for implementing __array__ for fast protobuf -> NumPy conversion. This was also requested in https://groups.google.com/g/protobuf/c/P7-5iqUZa5s.
Filing this for tracking of status/timeline, since we've received several requests related to this from open-source customers.
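For readers unfamiliar with the hook, here is a minimal sketch of how `__array__` lets NumPy convert a container in one step. `RepeatedInts` is a hypothetical stand-in, not protobuf's actual repeated field container, and the copy-on-conversion behavior is an assumption made for illustration rather than the approved design.

```python
import numpy as np


class RepeatedInts:
    """Hypothetical stand-in for a repeated int32 field (not protobuf's real class)."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def append(self, value):
        self._values.append(value)

    def __array__(self, dtype=None, copy=None):
        # Assumption: conversion always copies, so no view into internal
        # storage is ever handed out. NumPy 2 may pass copy=False to demand
        # a zero-copy result, which this sketch cannot provide.
        if copy is False:
            raise ValueError("a copy is required to convert this container")
        arr = np.array(self._values, dtype=np.int32)
        return arr if dtype is None else arr.astype(dtype)


field = RepeatedInts([1, 2, 3])
arr = np.asarray(field)   # NumPy calls field.__array__()
field.append(4)           # later mutation does not affect the copied array
```

Because the array is a copy, the container stays free to reallocate its storage, at the cost of one allocation per conversion.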
Describe alternatives you've considered This was requested in https://github.com/protocolbuffers/protobuf/pull/8047 by implementing the buffer protocol in repeated primitive fields. However, we don't want to implement the buffer protocol because it exposes more of our internal implementation and may cause problems if the underlying repeated field is modified while a view of it exists.
@zhangskz - Any updates on this issue? Did the approved internal design ever move towards implementation?
@zhangskz - I'm also interested in knowing if there have been any updates on this issue.
The original proposer had an implementation in progress, but I believe progress here stalled. We haven't implemented this further on our side yet, but will see what we can do to get this prioritized given the demand.
For unrelated reasons, I happened to find my way to https://docs.python.org/3.12/whatsnew/3.12.html#pep-688-making-the-buffer-protocol-accessible-in-python which is new in the upcoming Python 3.12. I wonder if it might play a role in solving this problem.
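As a rough illustration of what PEP 688 adds: on Python 3.12 and later, a pure-Python class can export a buffer by defining `__buffer__`. The `Int32Field` class below is hypothetical and only delegates to an `array.array`; it says nothing about how protobuf stores repeated fields.

```python
import sys
from array import array


class Int32Field:
    """Hypothetical container used only to demonstrate PEP 688."""

    def __init__(self, values):
        self._storage = array("i", values)

    def __buffer__(self, flags):
        # PEP 688 hook: memoryview() calls this on Python 3.12+.
        # Delegating to array.array exposes its memory without copying.
        return memoryview(self._storage)


field = Int32Field([1, 2, 3])

if sys.version_info >= (3, 12):
    view = memoryview(field)
    values = view.tolist()
else:
    # Older interpreters ignore __buffer__, so there is nothing to show.
    values = None
```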
Thanks for sharing! We had previously considered `buffer()` in https://github.com/protocolbuffers/protobuf/pull/8047#issuecomment-1032920157 but the buffer protocol exposes more of our internal implementation and may cause problems if the underlying repeated field is modified while a view of it exists.
Sure, I definitely understand about not wanting to leak implementation details, and can't really speak to how that might be mitigated. On the second part though, I'd say that's a risk worth taking, or perhaps better phrased as a risk worth permitting the user to take. In the case where you really need zero-copy behavior into numpy, that's unavoidably going to imply view semantics one way or another, and by opting in the user agrees to play by more subtle (though hopefully well documented) rules. Just my thoughts on the issue - I'll be interested to see any future solution. I think it is an important one.
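The view-semantics trade-off discussed in this exchange can be seen with the standard library alone. The sketch below uses `array.array` as an analogy for a container that exports its memory; it is not protobuf code. CPython's `array` handles the hazard by refusing to resize while a view is outstanding, which is one possible set of "rules" a user would opt into.

```python
from array import array

storage = array("i", [1, 2, 3])
view = memoryview(storage)    # zero-copy view through the buffer protocol

storage[0] = 10               # in-place writes are visible through the view
seen_through_view = view[0]

try:
    storage.append(4)         # growing could reallocate memory under the view
    resize_error = None
except BufferError as exc:
    resize_error = exc        # CPython blocks the resize instead

view.release()                # once the view is released...
storage.append(4)             # ...the container can grow again
```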
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago.
@zhangskz - The notification that this was becoming inactive brought it back to my attention. Has there been any further research into potential solutions? This does feel like a worthwhile improvement.
@zhangskz are there any updates?
Unfortunately, we haven't had the capacity to staff this yet, but it is on our radar.
This issue should remain active.
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.
This issue is labeled inactive because the last activity was over 90 days ago. This issue will be closed and archived after 14 additional days without activity.
We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it.
This issue was closed and archived because there has been no new activity in the 14 days since the inactive label was added.
Please keep this open.
This should not be closed, please.
@zhangskz - Can we get this reopened somehow please?
@zhangskz - Can we get this reopened please? This is too important a feature to allow it to just get disappeared every few months. Is there a way to exclude it from the bot?
The Protobuf team has prioritized this feature request and will begin work on it soon.
Any updates on this? @anandolee @zhangskz @jguamie