wfdb-python icon indicating copy to clipboard operation
wfdb-python copied to clipboard

Better API for specifying return data of signals

Open cx1111 opened this issue 2 years ago • 11 comments

The current rdrecord API is rather confusing when dealing with more complex records/signals. Namely: multi-frequency signals.

We should create a better API to allow users to:

  • Specify/know the exact return types/dimensions of the signals being read
  • Deal with single and multi-frequency records consistently.

In addition, the current API is also limiting when reading single-frequency records, as there is no way for users to explicitly specify whether they want 1d or 2d numpy arrays.

We current have: https://github.com/MIT-LCP/wfdb-python/blob/0d42dfb4b2946625f00cbf500d830d374a201153/wfdb/io/record.py#L1715 where the only parameter that influences the return type is: smooth_frames

We've chatted about the issue here, where I suggested a new set of params: https://github.com/MIT-LCP/wfdb-python/pull/313#issuecomment-1058363761

What do people think of the suggestion of having two params: return_dims and smooth_frames? return_dims will apply for both single and multi-frequency signals, and smooth_frames will apply only to multi-frequency signals. Given the combination of the params, the return value of the signals will be:

(1, True): List of 1d arrays. Guaranteed to be same length. (1, False): List of 1d arrays. Not guaranteed to be same length for multi-frequency signals. (2, True): Single 2d array. (2, False): Illegal combination.

cx1111 avatar May 23 '22 19:05 cx1111

@tompollard @bemoody @alistairewj thoughts?

cx1111 avatar May 23 '22 19:05 cx1111

As an application programmer, I want the data. I actually don't care what data structure is used to hold it, as long as the structure is consistent for all input records, and as long as it isn't grossly inefficient.

If you give me an NxM array, and what I really want is an M-element list of N-element arrays, I can very easily call list(x.transpose()) myself. As a library programmer, I think it's better to put that sort of trivial burden onto applications, rather than making the library interface more complicated.

In any case, there are several more combinations of arguments you didn't mention:

  • return_dims=1 and smooth_frames is unspecified
  • return_dims=2 and smooth_frames is unspecified
  • smooth_frames=True and return_dims is unspecified
  • smooth_frames=False and return_dims is unspecified
  • neither return_dims nor smooth_frames is specified

bemoody avatar May 24 '22 14:05 bemoody

Thanks. Some more thoughts:

  • Many users will want the 2d array to pass into convolutional neural networks or general multi-channel filter functions.
  • For multi-frequency records, the only way to avoid losing information is to return an array of 1d arrays.

I think it's worth it for our library to support both dimensions, despite the additional complexity. Regarding efficiency/performance, would either form perform a lot better?

I imagine for the ease of writing the functions, but at a large performance cost, we do have the option of always processing the signals as a list of 1d arrays, and then only reshaping into 2d before returning the value.

The function argument values can be required, and have defaults, so that we don't have to deal with the unspecified cases.

cx1111 avatar May 24 '22 15:05 cx1111

One-dimensional CNNs, yeah. A two-dimensional CNN (and a two-dimensional input array) would only make sense if the channels were ordered spatially, right? But I'm not here to argue with people's life choices. :)

There surely are many applications where storing each channel in contiguous memory could give slightly better performance than interleaving the channels. I believe numpy allows you to create arrays in "Fortran order" rather than "C order", so this wouldn't require any API change.

The function argument values can be required, and have defaults, so that we don't have to deal with the unspecified cases.

Well, that's the trouble. The way you've outlined it, there isn't any default value for return_dims that makes sense. To be backward compatible, return_dims would have to be "1 if smooth_frames is false, 2 if smooth_frames is true".

As a database author, I care a lot about the defaults. I think it's important that the library should expose high-resolution data by default. I don't want to have to keep explaining to people that they need to pass a mysterious extra argument to prevent the library from smudging the data. There are a few ways this could be done:

  • The library could provide both multi-frequency (e_p_signal) and low-resolution smooth-frames (p_signal) data by default (and I could tell people "just use e_p_signal and forget p_signal exists.")

  • The library could provide only multi-frequency data by default.

  • The library could provide high-resolution smooth-frames data (upsampling, not downsampling.)

However, I think the first option is the only way to really be backward compatible. There's no way to be backward compatible and provide e_p_signal and p_signal simultaneously and make p_signal high-res instead of low-res.

bemoody avatar May 24 '22 16:05 bemoody

Let's not worry about backwards compatibility here. The current API behavior is not ideal, and does not allow the user to get a predictable output.

I didn't think about the high-res option. I'm not sure how useful it would be, but we could provide it. We would then have combinations of return_dims and smooth_frames (or better name):

  • (1, None): List of 1d arrays. Not guaranteed to be same length for multi-frequency signals.
  • (1, 'average'): List of 1d arrays. Guaranteed to be same length.
  • (1, 'upsample'): List of 1d arrays. Guaranteed to be the same length. (For Record objects, what happens to fs?)
  • (2, None): Illegal combination.
  • (2, 'average'): Single 2d array.
  • (2, 'upsample'): Single 2d array.

We could also add an INFO logging statement that occurs when reading a multi-frequency record, so that it's more immediately obvious.

I realize that all of this only considers the final output array(s) (such as when calling the current rdsamp), not the intermediate steps and the object attributes used to represent the signals. Let me think about that.

cx1111 avatar May 25 '22 15:05 cx1111

I didn't think about the high-res option. I'm not sure how useful it would be, but we could provide it.

I'm honestly not sure how useful it would be either... but as you say, people like having the 2D array, and if someone insists on having a 2D array, it makes much more sense to upsample rather than downsampling.

I realize that all of this only considers the final output array(s) (such as when calling the current rdsamp), not the intermediate steps and the object attributes used to represent the signals.

I don't use wfdb.rdsamp, and I'd encourage anyone still using it to use wfdb.rdrecord instead. wfdb.rdsamp always returns a two-dimensional array, and I don't think it would be useful to change that behavior now.

rdrecord does not return an array or a list. It returns a Record object. A Record object has numerous attributes and methods. That means there's no reason a Record object cannot simultaneously provide access to many different views of the same data.

At the same time, I'm still not clear why it would be beneficial to provide data as a list-of-equal-length-arrays. I see that as being redundant (it's not any more efficient, and it's less useful for some applications, than a 2D array), and possibly confusing (at present if a Record has an e_p_signal then you know it hasn't been resampled.)

bemoody avatar May 25 '22 17:05 bemoody

An idea I mentioned in the meeting today was to allow the p_signal (and e_p_signal, etc.) attributes to be lazily loaded, perhaps if you call rdrecord with stream=True or something like that.

A very rough sketch might look like this:

def rdrecord(record_name, sampfrom=0, sampto=None, stream=False):
    record = rdheader(record_name)
    if sampto is None:
        sampto = record.sig_len
    if stream:
        record.p_signal = Lazy2DSignalArray(record, sampfrom, sampto)
    else:
        record.p_signal = _read_2d_signals(record, sampfrom, sampto)
    return record

class Lazy2DSignalArray:
    def __init__(self, record, sampfrom, sampto):
        self._record = record
        self._base = sampfrom
        self._len = sampto - sampfrom

    def __len__(self):
        return self._len

    def __getitem__(self, key):
        if isinstance(key, slice):
            (start, stop, step) = key.indices(self._len)
            data = _read_2d_signals(self._record,
                                    start + self._base,
                                    stop + self._base)
            return data[::step]
        elif isinstance(key, tuple):
            return self[key[0]][key[1:]]
        elif hasattr(key, '__index__'):
            return self[key:key+1][0]
        else:
            raise TypeError

bemoody avatar May 27 '22 17:05 bemoody

Just catching up on this!

Let's not worry about backwards compatibility here. The current API behavior is not ideal, and does not allow the user to get a predictable output.

What about deprecating rdrecord and introducing wfdb.read_record() or maybe just wfdb.read()?

What do people think of the suggestion of having two params: return_dims and smooth_frames? return_dims will apply for both single and multi-frequency signals, and smooth_frames will apply only to multi-frequency signals.

Is return_dims necessary at all? The fact that it leads to situations like (2, False): Illegal combination. isn't nice. Could we treat everything as 2d and maybe have some kind of binary flatten flag if necessary?

tompollard avatar Jun 01 '22 16:06 tompollard

What about deprecating rdrecord and introducing wfdb.read_record() or maybe just wfdb.read()?

That's the long term plan. We want a major version where old and new APIs are available, and then another one that removes the old ones.

Is return_dims necessary at all? The fact that it leads to situations like (2, False): Illegal combination. isn't nice. Could we treat everything as 2d and maybe have some kind of binary flatten flag if necessary?

flatten essentially does the same thing as return_dims and the illegal combination issue would still hold. I'd argue that it's a worse keyword to use, since flattening an array may mean different things, thus making the keyword less explicit.

cx1111 avatar Jun 07 '22 21:06 cx1111

Hot off the press: https://github.com/MIT-LCP/wfdb-python/pull/388

See the top level functions at the bottom of the module demonstrating a non-streaming and streaming (needs a lot of work) API. My biggest takeaway while playing around with this: Why does any class/object need to have the array as one of its attribute? Does it benefit the library, or the user? If we just separate the full data array from the class/object, we can remove the p_signal/d_signal/e_p_signal naming garbage.

cx1111 avatar Jun 07 '22 21:06 cx1111

^^ Thinking about it more, separating the signal data from the object may bring more confusion than benefits. Imagine reading a JPEG file into an Image object, and not having the bytes as an object field. That would be very strange.

cx1111 avatar Jun 23 '22 18:06 cx1111