
Investigation: The creation of a RabbitMQ data model

Open dillu24 opened this issue 3 years ago • 0 comments

Rationale

A developer coding alerter components might wonder what the data sent by a dependent component looks like. Imagine I am coding the Cosmos Node Data Transformer; currently, the developer needs to look at the Cosmos Node Monitor's code and infer the data model from the logic. If we had a CosmosNodeRawDataDict type, the developer could easily understand the structure of the data sent by the Cosmos Node Monitor. Furthermore, if we had a CosmosNodeRawData object, we could abstract the Dict parsing behind the CosmosNodeRawData.parse function. This function would also be helpful because it would make sure that the data being used in the transformer is valid. Apart from this, the CosmosNodeMonitor could then use the CosmosNodeRawData.validate function to validate the CosmosNodeRawData prior to sending, hence confirming that the widely accepted data model requirements are satisfied.

To solve this, we should create a data model inside the data_models/rabbitmq_data folder for each type of data being sent to and from RabbitMQ. These data models can then be used by the publishers to validate the data, and by the consumers to parse the data correctly. Similarly to #226, we should also create the TypedDict types that these models are going to interact with inside the src/types folder.

For ticket closure

The aim of this task is to come up with a data model design for each type of data being sent to and from RabbitMQ. For example, this could include some of the following:

  • RawData
  • NodeRawData
  • CosmosNodeRawData
  • SubstrateNodeRawData
  • TransformedData
  • NodeTransformedData
  • CosmosNodeTransformedData
  • AlertData

Note that we need to create a data model for every component's input and output data.

This design should ideally be submitted on Confluence. Afterwards, we can then create a number of implementation tickets, such as:

  • Granular tickets which implement the TypedDict types
  • Granular tickets which implement the data models
  • Granular tickets which enable the components to make use of the models

You could use this minimum working example to understand the reasoning behind the types and the data models. Note that the code below should be used for example purposes only; hence, the developer should also consider other techniques for implementing the design described above:

from __future__ import annotations

from abc import ABC, abstractmethod
from typing import TypedDict

class MetaDataDict(TypedDict):
    parent_id: str

class NodeMetaDataDict(MetaDataDict):
    node_id: str

class CosmosNodeMetaDataDict(NodeMetaDataDict):
    is_validator: bool

# Dicts describing the payloads as they travel over RabbitMQ.
class RawDataDict(TypedDict):
    meta_data: MetaDataDict

class NodeRawDataDict(TypedDict):
    meta_data: NodeMetaDataDict

class CosmosNodeDataDict(TypedDict):
    data_1: int
    data_2: float

class CosmosNodeRawDataDict(TypedDict):
    meta_data: CosmosNodeMetaDataDict
    data: CosmosNodeDataDict

class RawData(ABC):
    def __init__(self) -> None:
        self._parent_id = ''

    @property
    def parent_id(self) -> str:
        return self._parent_id

    @abstractmethod
    def is_valid(self, raw_data: RawDataDict) -> bool:
        pass

    def parse(self, raw_data: RawDataDict) -> None:
        self._parent_id = raw_data['meta_data']['parent_id']

    @abstractmethod
    def to_json(self) -> RawDataDict:
        pass

class NodeRawData(RawData, ABC):
    def __init__(self) -> None:
        super().__init__()
        self._node_id = ''

    @property
    def node_id(self) -> str:
        return self._node_id

    @abstractmethod
    def is_valid(self, raw_data: NodeRawDataDict) -> bool:
        pass

    def parse(self, raw_data: NodeRawDataDict) -> None:
        super().parse(raw_data)
        self._node_id = raw_data['meta_data']['node_id']

    @abstractmethod
    def to_json(self) -> NodeRawDataDict:
        pass

class CosmosNodeRawData(NodeRawData):
    def __init__(self) -> None:
        super().__init__()
        self._is_validator = False
        self._data_1 = 0
        self._data_2 = 0.0

    @property
    def is_validator(self) -> bool:
        return self._is_validator

    @property
    def data_1(self) -> int:
        return self._data_1

    @property
    def data_2(self) -> float:
        return self._data_2

    def is_valid(self, raw_data: CosmosNodeRawDataDict) -> bool:
        # `schema` stands for a pre-built validation schema for
        # CosmosNodeRawDataDict (e.g. built with the schema library
        # linked below); it is not defined in this sketch.
        return schema.is_valid(raw_data)

    def parse(self, raw_data: CosmosNodeRawDataDict) -> None:
        super().parse(raw_data)
        self._is_validator = raw_data['meta_data']['is_validator']
        self._data_1 = raw_data['data']['data_1']
        self._data_2 = raw_data['data']['data_2']

    def to_json(self) -> CosmosNodeRawDataDict:
        return {
            'meta_data': {
                'parent_id': self._parent_id,
                'node_id': self._node_id,
                'is_validator': self._is_validator
            },
            'data': {
                'data_1': self._data_1,
                'data_2': self._data_2
            }
        }
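To make the intended usage concrete, here is a minimal, self-contained sketch of the round trip a consumer would perform (receive dict → validate → parse → serialize back). It deliberately flattens the abstract hierarchy into one concrete class and inlines a trivial isinstance-based check in place of a real schema; all field names are hypothetical, mirroring the example above:

```python
from typing import TypedDict

class CosmosNodeMetaDataDict(TypedDict):
    parent_id: str
    node_id: str

class CosmosNodeDataDict(TypedDict):
    data_1: int
    data_2: float

class CosmosNodeRawDataDict(TypedDict):
    meta_data: CosmosNodeMetaDataDict
    data: CosmosNodeDataDict

class CosmosNodeRawData:
    def __init__(self) -> None:
        self._parent_id = ''
        self._node_id = ''
        self._data_1 = 0
        self._data_2 = 0.0

    def is_valid(self, raw_data: dict) -> bool:
        # Trivial structural check; a real model would delegate to a schema.
        try:
            return (isinstance(raw_data['meta_data']['parent_id'], str)
                    and isinstance(raw_data['meta_data']['node_id'], str)
                    and isinstance(raw_data['data']['data_1'], int)
                    and isinstance(raw_data['data']['data_2'], float))
        except (KeyError, TypeError):
            return False

    def parse(self, raw_data: CosmosNodeRawDataDict) -> None:
        self._parent_id = raw_data['meta_data']['parent_id']
        self._node_id = raw_data['meta_data']['node_id']
        self._data_1 = raw_data['data']['data_1']
        self._data_2 = raw_data['data']['data_2']

    def to_json(self) -> CosmosNodeRawDataDict:
        return {
            'meta_data': {'parent_id': self._parent_id,
                          'node_id': self._node_id},
            'data': {'data_1': self._data_1, 'data_2': self._data_2},
        }

# A message as it might arrive from RabbitMQ:
incoming: CosmosNodeRawDataDict = {
    'meta_data': {'parent_id': 'chain_1', 'node_id': 'node_1'},
    'data': {'data_1': 42, 'data_2': 3.14},
}

model = CosmosNodeRawData()
assert model.is_valid(incoming)
model.parse(incoming)
assert model.to_json() == incoming  # the round trip is lossless
```

The point of the sketch is that the transformer never touches raw dict keys directly: validation, parsing, and serialization all live in one place.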

Schema library we could use: https://pypi.org/project/schema/

dillu24 avatar May 31 '22 15:05 dillu24