We are using JPype, which translates Java objects to Python. Recent versions of this library convert Java int, double and similar to JInt, JDouble, which are custom subclasses of corresponding Python types.

Now, when such value arrives on a channel, it results in broken logic, once its type is recorded with self.channeltype = type(value), namely:

Checks such as self.channeltype == float will evaluate to False, when JDouble is in play
PyQt signals may break, e.g. self.send_value_signal[self.channeltype].emit() will enumerate through all available subscripts, not finding an overload for JDouble, it will pick the last available, i.e. np.ndarray. The resulting exception is very confusing:

TypeError: PyDMWritableWidget.send_value_signal[numpy.ndarray].emit(): argument 1 has unexpected type 'float'

This PR tries to record self.channeltype as base types, where appropriate.

Jun 09 '21 13:06 ivany4

Codecov Report

Merging #737 (d37a975) into master (00974e9) will decrease coverage by 1.77%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #737      +/-   ##
==========================================
- Coverage   58.65%   56.87%   -1.78%     
==========================================
  Files          88       88              
  Lines        9984     9550     -434     
==========================================
- Hits         5856     5432     -424     
+ Misses       4128     4118      -10

Impacted Files	Coverage Δ
pydm/widgets/base.py	`89.02% <100.00%> (-1.52%)`	:arrow_down:
pydm/widgets/tab_bar.py	`33.54% <0.00%> (-8.77%)`	:arrow_down:
pydm/widgets/image.py	`26.53% <0.00%> (-8.06%)`	:arrow_down:
pydm/widgets/scale.py	`19.42% <0.00%> (-5.81%)`	:arrow_down:
pydm/widgets/waveformtable.py	`20.00% <0.00%> (-5.00%)`	:arrow_down:
pydm/widgets/embedded_display.py	`21.56% <0.00%> (-4.82%)`	:arrow_down:
...ydm/connection_inspector/connection_table_model.py	`25.00% <0.00%> (-3.82%)`	:arrow_down:
pydm/widgets/waveformplot.py	`22.79% <0.00%> (-3.81%)`	:arrow_down:
pydm/widgets/scatterplot.py	`25.00% <0.00%> (-3.58%)`	:arrow_down:
pydm/widgets/enum_button.py	`71.14% <0.00%> (-3.42%)`	:arrow_down:
... and 37 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 00974e9...d37a975. Read the comment docs.

Jun 09 '21 13:06 codecov-commenter

Wouldn't it be best to address this conversion over at the data plugin and let PyDM deal with the basic types? You could handle that into your plugin when JPype give you the value before emitting and when you receive the write you can cast it back to the proper type via a reverse lookup dictionary.

Jun 10 '21 04:06 hhslepicka

Wouldn't it be best to address this conversion over at the data plugin and let PyDM deal with the basic types? You could handle that into your plugin when JPype give you the value before emitting and when you receive the write you can cast it back to the proper type via a reverse lookup dictionary.

That depends on the perspective. If there were multiple plugins based on JPype, each of them would need to handle that separately.

Jun 10 '21 08:06 ivany4

@hhslepicka , what would be an acceptable performance hit in this method? Compared to original PyDM code, I'm observing 2-8% slowdown in @Thrameos 's hashmap solution, which is by far the most performant. This is perf on my machine with various values for 100000 value_changed calls each (averaged over 5 attempts):

Value	123	123.47	1.00E+02	0x1FF	0b100	TRUE	FALSE	np.array([123, 456])	"test"	FloatSubclass(0.5)	StrSubclass("test")
Orig Avg (ms)	1,282,931.51	1,322,600.19	1,288,669.65	1,350,867.69	1,318,794.36	1,309,725.72	1,323,938.08	4,508,809.83	954,742.51	1,371,229.86	955,796.24
HashMap Avg (ms)	1,351,577.09	1,361,390.72	1,358,056.51	1,357,154.96	1,350,078.21	1,373,699.45	1,375,790.97	4,567,240.99	1,002,908.61	1,420,907.02	1,041,290.06
Increase (%)	5.351%	2.933%	5.384%	0.465%	2.372%	4.885%	3.917%	1.296%	5.045%	3.623%	8.945%

Jul 05 '21 09:07 ivany4

@ivany4 Perhaps you can make up for the difference by removing the logic

            try:
                if self.channeltype == unicode:
                    # For Python 2.7, set the the channel type to str instead of unicode
                    self.channeltype = str
            except NameError:
                pass

Just add unicode to the cache mechanism as a prepopulated.

Jul 06 '21 16:07 Thrameos

Hi @ivany4, sorry for the silence. @hhslepicka left SLAC, and I have only been able to spend tiny amounts of time on PyDM lately.

I am still against including this in the base widget, I agree with @hhslepicka - this seems like it should happen at the plugin layer. I would guess you could have a common base class for a JPype-based plugin, and implement the conversion to a base type there.

Your most recent implementation seems reasonable from a performance standpoint, but I'm objecting for architectural reasons. If there's a strong reason why this can't be implemented in the data plugin, I'd be willing to merge.

Jul 13 '21 00:07 mattgibbs

Hi @ivany4, sorry for the silence. @hhslepicka left SLAC, and I have only been able to spend tiny amounts of time on PyDM lately.

I am still against including this in the base widget, I agree with @hhslepicka - this seems like it should happen at the plugin layer. I would guess you could have a common base class for a JPype-based plugin, and implement the conversion to a base type there.

Your most recent implementation seems reasonable from a performance standpoint, but I'm objecting for architectural reasons. If there's a strong reason why this can't be implemented in the data plugin, I'd be willing to merge.

Thanks for the feedback, @mattgibbs . Sad to see @hhslepicka leave :/ I understand your concern, and I would be happy to find another elegant solution (throw more ideas, if you have them). I was trying to avoid implementing conversion in the plugin for few reasons:

It means that incoming data will be copied into a native Python type, which sounds like a big performance hit to me (not measured though), because every incoming piece of data will be copied. My initial approach minimized the impact by playing with channeltype, which is used less frequently, mainly for writing (but unfortunately is reset on every incoming value). However I think using hash tables before updating channeltype should still be more performant than copying incoming data every single time.
PyDM implementation implies that native Python types cannot be subclassed, which may impact more than just JPype derivatives, if some other system takes the same subclassing approach. If you think this is perfectly valid solution, it's worth documenting that.

I'm currently researching the possibility to monkey-patch __eq__ for JPype types so that e.g. type(self.channeltype) == bool could succeed for JBoolean without the need for additional logic on PyDM level. Would be curious to hear @Thrameos 's opinion how bad of an idea that is. It is still insufficient though for cases like self.send_value_signal[self.channeltype].emit(num_value).

Jul 19 '21 12:07 ivany4

Monkey patching the equals for a type will undoubtedly create unexpected side effects in users code. I can't say what is would break only that I would certainly not add such a hack to JPype release nor recommend hacking to add a bad contract just to fix another libraries code. JPype is not doing anything special here as other toolkits such as numpy as doing the same. The existing code is special cases here for "np.array" and "unicode" already but I could see the exact same problem happen of "np.float32" or other external types. So this is not really a JPype issue by rather an issue with how pydm is using the type information. Using the "type" as a look up for an action as is being done here is a recipe for disaster as it violates the concept of duck typing. One should be able to define a type that meets the specification (or in this case merely derives from a base type) without breaking code.

I would recommend that the hash solution with a plugin function for constructing the channel type would be the best solution. The user has the option of controlling the channel type by installing a function and the hash solution would be faster than the existing logic as it already has several special cases. Placing a user defined plug in so that they can add a type conversion with known rules is better than requiring a monkey patch.

lets consider the original code

   def value_changed(self, new_val):
        """
        Callback invoked when the Channel value is changed.
        Parameters
        ----------
        new_val : str, int, float, bool or np.ndarray
            The new value from the channel. The type depends on the channel.
        """
        self.value = new_val
        self.channeltype = type(self.value)
        if self.channeltype == np.ndarray:
            self.subtype = self.value.dtype.type
        else:
            try:
                if self.channeltype == unicode:
                    # For Python 2.7, set the the channel type to str instead of unicode
                    self.channeltype = str
            except NameError:
                pass

In this the type of the channel is doing 2 special cases one of which will generate an exception which is clearly not a very great performance. Also any derived types for np.ndarray will also miss the if clause. So there is already problems with the existing code.

The alternative would be.

CHANNEL_CACHE = {}
CHANNEL_HANDLERS = ()

# Make unicode go to str if applicable
try:
     CHANNEL_CACHE[unicode] = str
except NameError:
     pass

# Define a hook for the user to handle derived types gracefully.
def add_channel_handler(method):
    HANDLERS.append(method)

   ...

   def value_changed(self, new_val):
        """
        Callback invoked when the Channel value is changed.
        Parameters
        ----------
        new_val : str, int, float, bool or np.ndarray
            The new value from the channel. The type depends on the channel.
        """
        self.value = new_val
        requested = type(self.value)
        self.channeltype = CHANNEL_CACHE.get(requested, None)

        # If this is a new type we need to search for it in the handlers
        if self.channeltype is None:
              self.channeltype = requested

              # first handler that knows the type will be used
              for handler in CHANNEL_HANDLERS:
                   tp = handler(requested)
                   if tp is not None:
                        self.channeltype = tp
                        break

              # cache so we never hit the lookup for this type again
              CHANNEL_CACHE[requested] = self.channeltype

        # Equals may be faster than isinstance though this will break for derived types.
        if isinstance(self.channeltype, np.ndarray):
            self.subtype = self.value.dtype.type

Now the user can install whatever hooks they want to ensure that derived types chose the appropriate base. This doesn't force any solution and is not particular to JPype nor is magical for ints or floats. The user has to install a handler to get the alternative behavior.

Jul 19 '21 18:07 Thrameos

pydm
pydm copied to clipboard

FIX: Account for custom scalar types of incoming values

Codecov Report

pydm pydm copied to clipboard

FIX: Account for custom scalar types of incoming values

Codecov Report

pydm
pydm copied to clipboard