pydm
pydm copied to clipboard
FIX: Account for custom scalar types of incoming values
We are using JPype, which translates Java objects to Python. Recent versions of this library convert Java int
, double
and similar to JInt
, JDouble
, which are custom subclasses of corresponding Python types.
Now, when such value arrives on a channel, it results in broken logic, once its type is recorded with self.channeltype = type(value)
, namely:
- Checks such as
self.channeltype == float
will evaluate toFalse
, whenJDouble
is in play - PyQt signals may break, e.g.
self.send_value_signal[self.channeltype].emit()
will enumerate through all available subscripts, not finding an overload forJDouble
, it will pick the last available, i.e.np.ndarray
. The resulting exception is very confusing:
TypeError: PyDMWritableWidget.send_value_signal[numpy.ndarray].emit(): argument 1 has unexpected type 'float'
This PR tries to record self.channeltype
as base types, where appropriate.
Codecov Report
Merging #737 (d37a975) into master (00974e9) will decrease coverage by
1.77%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #737 +/- ##
==========================================
- Coverage 58.65% 56.87% -1.78%
==========================================
Files 88 88
Lines 9984 9550 -434
==========================================
- Hits 5856 5432 -424
+ Misses 4128 4118 -10
Impacted Files | Coverage Δ | |
---|---|---|
pydm/widgets/base.py | 89.02% <100.00%> (-1.52%) |
:arrow_down: |
pydm/widgets/tab_bar.py | 33.54% <0.00%> (-8.77%) |
:arrow_down: |
pydm/widgets/image.py | 26.53% <0.00%> (-8.06%) |
:arrow_down: |
pydm/widgets/scale.py | 19.42% <0.00%> (-5.81%) |
:arrow_down: |
pydm/widgets/waveformtable.py | 20.00% <0.00%> (-5.00%) |
:arrow_down: |
pydm/widgets/embedded_display.py | 21.56% <0.00%> (-4.82%) |
:arrow_down: |
...ydm/connection_inspector/connection_table_model.py | 25.00% <0.00%> (-3.82%) |
:arrow_down: |
pydm/widgets/waveformplot.py | 22.79% <0.00%> (-3.81%) |
:arrow_down: |
pydm/widgets/scatterplot.py | 25.00% <0.00%> (-3.58%) |
:arrow_down: |
pydm/widgets/enum_button.py | 71.14% <0.00%> (-3.42%) |
:arrow_down: |
... and 37 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 00974e9...d37a975. Read the comment docs.
Wouldn't it be best to address this conversion over at the data plugin and let PyDM deal with the basic types? You could handle that into your plugin when JPype give you the value before emitting and when you receive the write you can cast it back to the proper type via a reverse lookup dictionary.
Wouldn't it be best to address this conversion over at the data plugin and let PyDM deal with the basic types? You could handle that into your plugin when JPype give you the value before emitting and when you receive the write you can cast it back to the proper type via a reverse lookup dictionary.
That depends on the perspective. If there were multiple plugins based on JPype, each of them would need to handle that separately.
@hhslepicka , what would be an acceptable performance hit in this method?
Compared to original PyDM code, I'm observing 2-8% slowdown in @Thrameos 's hashmap solution, which is by far the most performant.
This is perf on my machine with various values for 100000 value_changed
calls each (averaged over 5 attempts):
Value | 123 | 123.47 | 1.00E+02 | 0x1FF | 0b100 | TRUE | FALSE | np.array([123, 456]) | "test" | FloatSubclass(0.5) | StrSubclass("test") |
---|---|---|---|---|---|---|---|---|---|---|---|
Orig Avg (ms) | 1,282,931.51 | 1,322,600.19 | 1,288,669.65 | 1,350,867.69 | 1,318,794.36 | 1,309,725.72 | 1,323,938.08 | 4,508,809.83 | 954,742.51 | 1,371,229.86 | 955,796.24 |
HashMap Avg (ms) | 1,351,577.09 | 1,361,390.72 | 1,358,056.51 | 1,357,154.96 | 1,350,078.21 | 1,373,699.45 | 1,375,790.97 | 4,567,240.99 | 1,002,908.61 | 1,420,907.02 | 1,041,290.06 |
Increase (%) | 5.351% | 2.933% | 5.384% | 0.465% | 2.372% | 4.885% | 3.917% | 1.296% | 5.045% | 3.623% | 8.945% |
@ivany4 Perhaps you can make up for the difference by removing the logic
try:
if self.channeltype == unicode:
# For Python 2.7, set the the channel type to str instead of unicode
self.channeltype = str
except NameError:
pass
Just add unicode
to the cache mechanism as a prepopulated.
Hi @ivany4, sorry for the silence. @hhslepicka left SLAC, and I have only been able to spend tiny amounts of time on PyDM lately.
I am still against including this in the base widget, I agree with @hhslepicka - this seems like it should happen at the plugin layer. I would guess you could have a common base class for a JPype-based plugin, and implement the conversion to a base type there.
Your most recent implementation seems reasonable from a performance standpoint, but I'm objecting for architectural reasons. If there's a strong reason why this can't be implemented in the data plugin, I'd be willing to merge.
Hi @ivany4, sorry for the silence. @hhslepicka left SLAC, and I have only been able to spend tiny amounts of time on PyDM lately.
I am still against including this in the base widget, I agree with @hhslepicka - this seems like it should happen at the plugin layer. I would guess you could have a common base class for a JPype-based plugin, and implement the conversion to a base type there.
Your most recent implementation seems reasonable from a performance standpoint, but I'm objecting for architectural reasons. If there's a strong reason why this can't be implemented in the data plugin, I'd be willing to merge.
Thanks for the feedback, @mattgibbs . Sad to see @hhslepicka leave :/ I understand your concern, and I would be happy to find another elegant solution (throw more ideas, if you have them). I was trying to avoid implementing conversion in the plugin for few reasons:
- It means that incoming data will be copied into a native Python type, which sounds like a big performance hit to me (not measured though), because every incoming piece of data will be copied. My initial approach minimized the impact by playing with
channeltype
, which is used less frequently, mainly for writing (but unfortunately is reset on every incoming value). However I think using hash tables before updatingchanneltype
should still be more performant than copying incoming data every single time. - PyDM implementation implies that native Python types cannot be subclassed, which may impact more than just JPype derivatives, if some other system takes the same subclassing approach. If you think this is perfectly valid solution, it's worth documenting that.
I'm currently researching the possibility to monkey-patch __eq__
for JPype types so that e.g. type(self.channeltype) == bool
could succeed for JBoolean
without the need for additional logic on PyDM level. Would be curious to hear @Thrameos 's opinion how bad of an idea that is. It is still insufficient though for cases like self.send_value_signal[self.channeltype].emit(num_value)
.
Monkey patching the equals for a type will undoubtedly create unexpected side effects in users code. I can't say what is would break only that I would certainly not add such a hack to JPype release nor recommend hacking to add a bad contract just to fix another libraries code. JPype is not doing anything special here as other toolkits such as numpy as doing the same. The existing code is special cases here for "np.array" and "unicode" already but I could see the exact same problem happen of "np.float32" or other external types. So this is not really a JPype issue by rather an issue with how pydm is using the type information. Using the "type" as a look up for an action as is being done here is a recipe for disaster as it violates the concept of duck typing. One should be able to define a type that meets the specification (or in this case merely derives from a base type) without breaking code.
I would recommend that the hash solution with a plugin function for constructing the channel type would be the best solution. The user has the option of controlling the channel type by installing a function and the hash solution would be faster than the existing logic as it already has several special cases. Placing a user defined plug in so that they can add a type conversion with known rules is better than requiring a monkey patch.
lets consider the original code
def value_changed(self, new_val):
"""
Callback invoked when the Channel value is changed.
Parameters
----------
new_val : str, int, float, bool or np.ndarray
The new value from the channel. The type depends on the channel.
"""
self.value = new_val
self.channeltype = type(self.value)
if self.channeltype == np.ndarray:
self.subtype = self.value.dtype.type
else:
try:
if self.channeltype == unicode:
# For Python 2.7, set the the channel type to str instead of unicode
self.channeltype = str
except NameError:
pass
In this the type of the channel is doing 2 special cases one of which will generate an exception which is clearly not a very great performance. Also any derived types for np.ndarray will also miss the if clause. So there is already problems with the existing code.
The alternative would be.
CHANNEL_CACHE = {}
CHANNEL_HANDLERS = ()
# Make unicode go to str if applicable
try:
CHANNEL_CACHE[unicode] = str
except NameError:
pass
# Define a hook for the user to handle derived types gracefully.
def add_channel_handler(method):
HANDLERS.append(method)
...
def value_changed(self, new_val):
"""
Callback invoked when the Channel value is changed.
Parameters
----------
new_val : str, int, float, bool or np.ndarray
The new value from the channel. The type depends on the channel.
"""
self.value = new_val
requested = type(self.value)
self.channeltype = CHANNEL_CACHE.get(requested, None)
# If this is a new type we need to search for it in the handlers
if self.channeltype is None:
self.channeltype = requested
# first handler that knows the type will be used
for handler in CHANNEL_HANDLERS:
tp = handler(requested)
if tp is not None:
self.channeltype = tp
break
# cache so we never hit the lookup for this type again
CHANNEL_CACHE[requested] = self.channeltype
# Equals may be faster than isinstance though this will break for derived types.
if isinstance(self.channeltype, np.ndarray):
self.subtype = self.value.dtype.type
Now the user can install whatever hooks they want to ensure that derived types chose the appropriate base. This doesn't force any solution and is not particular to JPype nor is magical for ints or floats. The user has to install a handler to get the alternative behavior.