Protobuf performance

Open CurleySamuel opened this issue 10 years ago • 2 comments

At least for the large scans I've profiled, a lot of the client's time is spent demarshaling the pb response from HBase.

There may be room to switch to a faster pb library or to a pb C-binding to utilize other cores.

Aug 21 '15 22:08 CurleySamuel

Some quick research shows that the protobuf compiler supports creating native code for Python. See http://yz.mit.edu/wp/fast-native-c-protocol-buffers-from-python/ for a description of the usage and https://developers.google.com/protocol-buffers/docs/reference/python-generated?hl=en#cpp_impl for Google's docs on it.

Note: it's an experimental feature as of July 10, 2015, and it would break compatibility with Python implementations other than CPython.
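
For anyone who wants to see which backend their install is actually using, here is a rough sketch. It assumes the `google.protobuf.internal.api_implementation` module is available; that module is internal to the protobuf package, so it may change between releases. `active_pb_backend` is a made-up helper name, not something the library provides.

```python
def active_pb_backend():
    """Return the protobuf backend in use, or None if protobuf is not installed.

    Hypothetical helper; relies on an internal protobuf module.
    """
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    # Type() reports the backend name as a string, e.g. "python" or "cpp".
    return api_implementation.Type()


if __name__ == "__main__":
    print("protobuf backend:", active_pb_backend())
```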

Sep 26 '15 22:09 abrandemuehl

I vote we compile both a pure-Python and a C++ implementation for our PBs, then switch between the two based on either config or environment variables. (We could possibly detect whether the user is running on CPython or on a platform that doesn't support C modules.)
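
Something like the following could do the switching. It is only a sketch: `choose_pb_backend` and `configure_pb_backend` are hypothetical names, and it assumes the C++ backend is usable only on CPython. It leans on `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION`, the environment variable the protobuf runtime reads when it is first imported.

```python
import os
import platform

# Environment variable read by the protobuf Python runtime at import time.
PB_IMPL_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def choose_pb_backend(environ=None, interpreter=None):
    """Return "cpp" or "python" (hypothetical helper, not a library API)."""
    environ = os.environ if environ is None else environ
    interpreter = interpreter or platform.python_implementation()

    # An explicit, recognized user setting always wins.
    explicit = environ.get(PB_IMPL_VAR)
    if explicit in ("cpp", "python"):
        return explicit

    # Assumption: only CPython can load the C++ extension module.
    return "cpp" if interpreter == "CPython" else "python"


def configure_pb_backend(environ=None):
    """Record the choice; call this before google.protobuf is first imported."""
    environ = os.environ if environ is None else environ
    backend = choose_pb_backend(environ)
    environ[PB_IMPL_VAR] = backend
    return backend
```

The client would call `configure_pb_backend()` once at startup, before any generated `_pb2` module is imported. What the runtime does when `cpp` is requested but the extension isn't built depends on the protobuf version, so a real implementation would probably want to verify the result after import.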

Sep 26 '15 23:09 CurleySamuel