PyPMS icon indicating copy to clipboard operation
PyPMS copied to clipboard

Refactor Reader classes to use new Stream API

Open benthorner opened this issue 2 years ago • 3 comments

This introduces a couple of new abstractions - Readings and Streams - to expose more reader.py code for use by libraries. The new abstractions separate how to get readings from the CLI concerns of how many to get and how often.

Example of Stream / Reading API:

stream = SensorStream(
    sensor="PMSx003", 
    port="/dev/ttyUSB0"
)

stream.open()
reading = stream.read()
reading.obs_data  # => ObsData
reading.raw_data  # => RawData
stream.close()

The last three commits are more optional: introducing Streams isn't necessary for library usage but I think it's clearer / simpler - the final commit showcases this by DRYing-up the sampling behaviour across both of the Reader classes.

benthorner avatar Nov 20 '22 22:11 benthorner

Hey @avaldebe. Sure, I can give you the background here. The usecase I have is to be able to plug this library into a system I'm using, as a "data source". Data sources look a bit like the following:

class DataSource:
    def start(self) -> None:
        # prepare to poll for data

    def poll(self) -> Data:
        # return a single data sample

    def stop(self) -> None:
        # close down the source

At the moment there's no good way to plug the library in. "start" and "stop" are called separately so don't match up with the context manager API. And "poll" expects a single sample, not an array.

Here are two approaches I've tried to integrate the library in the system:

# Option 1: New reader for every poll
#
# Resets the sensor every time, which could damage it and
# isn't compatible with features like pre-heat.

class DataSource:
    reader: SensorReader  # "samples" set to 1

    def poll(self) -> Data:
        with reader:
            data = reader()[0]
        return data
# Option 2: Manual context manager
#
# More efficient but still clunky to manually work with 
# array return value and context manager methods.

class DataSource:
    reader: SensorReader  # "samples" set to 1

    def start(self) -> None:
        reader.__enter__()

    def poll(self) -> Data:
        return reader()[0]

    def stop(self) -> None:
        reader.__exit__(
            exception_type=None, 
            exception_value=None, 
            traceback=None,
        )

My first thought was: I will need to write my own version of SensorReader. Not a big deal, but also not ideal. However, as I looked at the reader code I saw some an opportunity, in the form of Stream*.

* "Stream" is an arbitrary name, by the way. I also thought of e.g. "Source".

My aim in this PR is to try and pull out / separate the code to get data (my usecase, the "Stream") from the code that governs how much data to get and how often (CLI usecase, the "Reader"). At the same time, I also wanted to preserve the existing Reader API, hence adding all the tests to avoid any breaking changes.

Does that help?

benthorner avatar Nov 21 '22 22:11 benthorner

I understand the usage now. Thanks for the clarification. I would like to keep the additional classes to a minimum, as the maintenance burden will be mine after your PRs are merged.

What do you think about expanding the Reader class to something like the following?

class Reader(AbstractContextManager):
    @abstractmethod
    def __call__(self, *, raw: Optional[bool] = None) -> Iterator[Union[RawData, ObsData]]:
        ...

    def start(self) -> None:
        self.__enter__()

    def poll(self,  *, raw: Optional[bool] = None) -> Union[RawData, ObsData]:
        return next(self(raw))

    def stop(self) -> None:
        self.__exit__(
            exception_type=None, 
            exception_value=None, 
            traceback=None,
        )        

avaldebe avatar Nov 22 '22 09:11 avaldebe

That works for me. A few comments / questions:

  1. read_one would be a clearer method name than poll.

  2. How about e.g. calling stop from __exit__ instead?

  3. Would you still be up for the new exception in https://github.com/avaldebe/PyPMS/pull/32/commits/7cca8f99a58b5ac544b9896284ee3857e3694783?

I would like to keep the additional classes to a minimum

Sure, if that's a problem then something like the above approach is the way to go. I'm curious though: what are the issues you anticipate with more classes? I get that loads of classes can be hard to follow, but we're only talking about 2 / 3 new classes here. I get that adding all this stuff is overkill for the features I'm interested in. My thinking with the PR was: rather than just make the Reader classes more complicated for my own benefit, first try to make them simpler.

Anyway, hopefully that's helpful context. Let me know what you think. If you prefer to keep everything inside the Reader classes, then I can make another PR to do that instead - depending on those comments / questions ☝️.

benthorner avatar Nov 22 '22 21:11 benthorner

  1. read_one would be a clearer method name than poll.
  2. How about e.g. calling stop from __exit__ instead?
  3. Would you still be up for the new exception in 7cca8f9?

Yes to all. For consistency with 2, please call __enter__ from start.

I would like to keep the additional classes to a minimum

Sure, if that's a problem then something like the above approach is the way to go. I'm curious though: what are the issues you anticipate with more classes? I get that loads of classes can be hard to follow, but we're only talking about 2 / 3 new classes here. I get that adding all this stuff is overkill for the features I'm interested in. My thinking with the PR was: rather than just make the Reader classes more complicated for my own benefit, first try to make them simpler.

The sensors implementation uses inheritance to avoid repetition between vendor variants, and when something breaks down it takes me some time to find out where the behaviours are defined. Therefore, I would like to keep things simple and inheritance shallow.

Keep in mind that, for my uses, this project is quite close to feature completion. Therefore, I only engage with it when something breaks down, and I do not want to spend time looking for the right class and...

Anyway, hopefully that's helpful context. Let me know what you think. If you prefer to keep everything inside the Reader classes, then I can make another PR to do that instead - depending on those comments / questions point_up.

You can keep this PR (and edit the PR's name and description), or start a new one. It is up to you.

avaldebe avatar Nov 23 '22 08:11 avaldebe

The sensors implementation uses inheritance to avoid repetition between vendor variants, and when something breaks down it takes me some time to find out where the behaviours are defined.

I had a similar experience when I tried to understand how the decoding works.


@avaldebe I've made a new PR that follows the approach we discussed: https://github.com/avaldebe/PyPMS/pull/33.

However, implementing read_one turned out to be harder than just doing a return next(self(raw)) because __call__ is still a bit too non-customisable / CLI-oriented e.g. with this line.

I went back to some of the ideas in this PR to make progress, but also tried to stay true to the principles and signatures we agreed on here. Please take a look and let me know what you think.

benthorner avatar Nov 26 '22 14:11 benthorner

Thanks for this and the new PR. I'll close this PR, and continue the discussion #33.

Thanks again, Á.

avaldebe avatar Nov 28 '22 08:11 avaldebe