Farm-Data-Relay-System

Two way communication

Open timmbogner opened this issue 2 years ago • 9 comments

Well folks, here it is! Sorry it doesn't work yet. This is the rough draft of the two-way communication protocol. I've been looking at it all day and fixing issues, but I think it's time to ask for some help getting it the rest of the way. I'm sure there are still some silly coding mistakes left in there, so don't be shy to nitpick. They do compile though, don't worry 👍

Intended functionality:

In the "nodes":

  • Nodes first send a ping, then an 'add' packet to the first gateway that responds.
  • The gateway responds to the 'add' packet by adding the node's MAC to the ESP-NOW internal peer list, then sending an add packet response with the peer_timeout as its 'param'.
  • The node must send another add packet before this timeout, or the gateway will remove the peer from the internal peer list.
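The node-side steps above could be sketched roughly as the small state machine below. This is only an illustration, not the code from the branch: the names `NodeLink`, `onPingReply`, `onAddReply`, and `renewalDue` are hypothetical, and renewing at half of the timeout is an assumption chosen for the example (the post only says the next add packet must arrive before the timeout).

```cpp
#include <cstdint>

// Hypothetical node-side registration state (names are not from FDRS).
enum class NodeState { Searching, Registering, Registered };

struct NodeLink {
  NodeState state = NodeState::Searching;
  uint8_t gateway_mac[6] = {0};
  uint32_t registered_at_ms = 0;
  uint32_t peer_timeout_ms = 0;  // the 'param' of the gateway's add response
};

// A gateway answered our ping. Only the first answer is used.
void onPingReply(NodeLink &n, const uint8_t mac[6]) {
  if (n.state != NodeState::Searching) return;
  for (int i = 0; i < 6; ++i) n.gateway_mac[i] = mac[i];
  n.state = NodeState::Registering;  // the 'add' packet would be sent here
}

// The gateway confirmed our 'add' packet and told us its peer timeout.
void onAddReply(NodeLink &n, uint32_t timeout_ms, uint32_t now_ms) {
  if (n.state == NodeState::Searching) return;  // unsolicited reply
  n.peer_timeout_ms = timeout_ms;
  n.registered_at_ms = now_ms;
  n.state = NodeState::Registered;
}

// True when another 'add' packet should be sent (assumed: at half the timeout).
bool renewalDue(const NodeLink &n, uint32_t now_ms) {
  return n.state == NodeState::Registered &&
         now_ms - n.registered_at_ms >= n.peer_timeout_ms / 2;
}
```

Renewing well before the deadline leaves room for a lost packet; the exact fraction is a tuning choice.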

In the gateway:

  • When it receives an add packet, it saves the MAC in an array "peer_list" along with a timestamp, and also adds it to the ESP-NOW internal peer list.
  • sendESPNOWpeers() is the exciting new action response. It sends a packet to every MAC in the internal peer list.
  • New peer_list entries are made after searching for the first blank entry. I think this is where some of the current failures are occurring.
  • The peer list must be maintained and non-responding nodes removed. Only 16 peers are allowed at a time. I think the limit for the internal peer list is a little higher, so I'm being safe.
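For illustration, the gateway-side bookkeeping described above (first blank entry, timestamp, 16-peer cap, removal of silent peers) might look like the sketch below. It is a minimal sketch under assumptions, not the branch's implementation: `PeerTable`, `addOrRefresh`, and `expire` are hypothetical names, and the 5-minute `PEER_TIMEOUT_MS` is taken from a later comment in this thread. On real hardware, the calls to `esp_now_add_peer` and `esp_now_del_peer` would go where the comments indicate.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t MAX_PEERS = 16;         // limit stated in the post
constexpr uint32_t PEER_TIMEOUT_MS = 300000;  // assumed: 5 minutes

struct PeerEntry {
  uint8_t mac[6];
  uint32_t last_seen_ms;
  bool in_use;
};

struct PeerTable {
  std::array<PeerEntry, MAX_PEERS> entries{};  // all slots start blank

  // Refreshes a known MAC, otherwise stores it in the first blank slot.
  // Returns the slot index, or -1 when the table is full.
  int addOrRefresh(const uint8_t mac[6], uint32_t now_ms) {
    int first_free = -1;
    for (std::size_t i = 0; i < MAX_PEERS; ++i) {
      PeerEntry &e = entries[i];
      if (e.in_use && std::memcmp(e.mac, mac, 6) == 0) {
        e.last_seen_ms = now_ms;
        return static_cast<int>(i);
      }
      if (!e.in_use && first_free < 0) first_free = static_cast<int>(i);
    }
    if (first_free < 0) return -1;
    PeerEntry &slot = entries[first_free];
    std::memcpy(slot.mac, mac, 6);
    slot.last_seen_ms = now_ms;
    slot.in_use = true;  // esp_now_add_peer would be called here
    return first_free;
  }

  // Frees every slot whose peer has been silent for the whole timeout.
  // Returns the number of peers removed.
  int expire(uint32_t now_ms) {
    int removed = 0;
    for (PeerEntry &e : entries) {
      // Unsigned subtraction stays correct when a millis counter wraps.
      if (e.in_use && now_ms - e.last_seen_ms >= PEER_TIMEOUT_MS) {
        e.in_use = false;  // esp_now_del_peer would be called here
        ++removed;
      }
    }
    return removed;
  }
};
```

Scanning the whole table before claiming a blank slot avoids storing the same MAC twice when an earlier slot has been freed in the meantime.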

When all of these things are working together correctly, the node will register with the nearest gateway, and then begin receiving outgoing data from it.

timmbogner avatar Jul 31 '22 02:07 timmbogner

Sounds great! 👍 Just a small but important note at this point in the development: I would not name sensors "nodes". Node is the superordinate concept (the umbrella term) for gateway and sensor (and whatever node types may come in the future). Gateways and sensors are both nodes. So you should stick with the naming convention used in network topology.

Gulpman avatar Jul 31 '22 15:07 Gulpman

I'm not attached to "node". I was going to make a separate "FDRS_Controller" sketch, but I realized I could combine the two sketches. Node was just the first term that came to mind. "Combo" may be better, and is also used by internal ESP-NOW functionality to express the same concept. I'm open to suggestions.

timmbogner avatar Jul 31 '22 18:07 timmbogner

I am not too familiar with ESP-NOW, but after a short read of the ESP-NOW User Guide (https://www.espressif.com/sites/default/files/documentation/esp-now_user_guide_en.pdf) I see that Combo is one of four roles: IDLE, CONTROLLER, SLAVE, or COMBO. With that info, I still think "node" should be the general term for the different nodes in a system. Here that means sensors, gateways, and controllers so far. For the controller node, I think "actuator node" would be a better fit as well (https://www.geeksforgeeks.org/difference-between-sensor-and-actuator/), as it just receives an instruction and executes it.

So these are the node types and they would have well defined names.

Within the node types we have different protocols like LoRa, ESP-NOW, MQTT, etc... Each node can support one to many of them.

In my understanding, Combo refers more to a feature that sensor and gateway nodes can activate (temporarily or permanently?) to negotiate with each other (join request / join response).

Having said that: This seems to only work on ESP-NOW?

Gulpman avatar Aug 01 '22 00:08 Gulpman

An actuator implies a device that transfers a signal into physical motion. In our case, the controller could be physically moving something, but it could also be an LED that we are controlling the color of, or even some data on a display. Thus, "actuator" is too specific. If we want to replace "node", we need a term for a device that can both send and receive data. We could alternatively find a catchy acronym (bad example: BIODs or "Basic Input/Output Devices").

Being honest, the internal ESP-NOW roles are pretty cryptic, so I just set every device to "Combo" and they work fine. We don't need to worry about conflicting terminology with that. From my understanding though, "Combo" does mean that the ESP-NOW device is configured to both send and receive... which is what we're going for.

Actually, it doesn't even work on ESP-NOW yet. 😂 I want to extend it to LoRa once the kinks are worked out.

timmbogner avatar Aug 01 '22 03:08 timmbogner

[START TLDR] Naming conventions should represent what is done within the object - therefore do not use node, combo or biod on the base level. Bi-directional communication (combo mode) is not possible on LoRa - so a negotiation process should be designed primarily for uni-directional communication first to avoid running into a dead end. [END TLDR] 😉

I'll try to explain in my own words what I understand the aim to be; maybe I'm getting it wrong.

You want to extend FDRS with the functionality of negotiating between new nodes and well-known (gateway) nodes to allow new nodes to join the network.

More "precise":

  • A node (with a fixed ID?) broadcasts a "hey I'm here, I want to join the party!". Having done that, he sits back and listens for a response (for a specific time I guess, and then repeats this until hell freezes over?).
  • Hopefully one or more gateways receive that request.
  • All gateways that received this request add the node's MAC to a party attendants wait list.
  • They respond to this specific node with a "hey, you are on the invite list, but hurry up: You must agree to this invite within x time!" (Question: Do they just respond generally, or do they offer the node an ID for the network? That would be convenient when deploying / extending a network.)
  • The response that arrives first at our node willing to party is immediately answered with a "Yes, I definitely want to join the party, put me on the party attendees list!" (I guess our node can be a sensor or gateway or whatever node is being added to the network?)
  • Our node assumes he is accepted with this ID and does not wait for another reply but starts sending data to the gateway which gave him the invite (waiting a few seconds, so the gateway has time to process the "Yes, I definitely want to..." answer).
  • (In case the node is provided with an ID, he would store that internally as its unique ID.)
  • In the meantime, the gateways which lost the race to the node (being 2nd to nth) remove the node from their invite list, as he didn't respond.
    • (Idea: maybe they wait a little longer, as it may help with learning signal times between requesting node -> winning gateway node -> me as the losing gateway. Further down, the losing gateway will be informed by the winning gateway.)
  • The response from the node arrives at our gateway: The gateway directly moves our node ID to the party list - node is now in the club!
  • Gateway node sends an update message to:
    • ALL nodes in the party list, i.e. also to sensor nodes? I think that is not necessary/possible. E.g. LoRa cannot listen and send simultaneously.
    • Only gateway nodes of the party list? Seems to be a better choice, as they are the masters of network topology.
  • Does our winning gateway only send new + removed nodes, or all nodes?

Question is how to prevent duplicate IDs - one idea is to use part of the gateway's ID/MAC as part of the creating process for the node's ID (with some fancy math).

Did I understand this correctly?

I'm sure there are still some silly coding mistakes left in there, so don't be shy to nitpick.

If we want to replace "node", we need a term for a device that can both send and receive data. We could alternatively find a catchy acronym (bad example: BIODs or "Basic Input/Output Devices").

You are using node in the wrong context. What you want to implement is a feature (negotiating of different nodes). A node is the most general representation in a network. If we get more specific there are sensor nodes and gateway nodes and... This means gateways ARE nodes. Sensors ARE nodes. They have everything a node has + something specific.

So - whatever name it will be for "your node" / BIOD / Combo / ... - it is a feature which can be implemented and (de-)activated within a specific node type for the purpose of negotiation (or other purposes).

Nodes are the items of a network which communicate with each other. We should not misuse this well-known term for a feature implementation.

An actuator implies a device that transfers a signal into physical motion. In our case, the controller could be physically moving something, but it could also be an LED that we are controlling the color of, or even some data on a display. Thus, "actuator" is too specific.

For the current "controller": actuator would still be a more reasonable term, as actuators ACT on a signal - it is not physical motion but events! Thus turning an LED on or off is what an actuator does. A node receiving control commands to open or close "coils" just acts (on a control command being sent to it). Turning on a motor that moves something can be the job of an actuator as well. A controller implies more logic involved. A controller could be a mixture of a sensor and an actuator node in the future: it gets a control command from remote, uses sensor data as decision information, and therefore decides how to act. I'm not dying if the node name remains controller - but names should represent what the user / reader / developer can expect. Maybe controllers will be needed within FDRS in the future - then you won't have to find an artificial name for real controllers because the controller name is already taken for actuators.

Sticking to general nodes and specific node types for the low-level naming helps with structuring as well.

  • define everything which is common to all nodes (e.g. IDs, basic reading functions, etc.) in a node class / fdrs_base_node.h
  • define on top the functionalities of sensor, gateway and actuator nodes in derived classes / fdrs_sensor_node.h, fdrs_gateway_node.h, fdrs_actuator_node.h, etc. which include the functionality of the base node.
  • functionalities can be split up as well - I'm not quite sure if it would be better on a protocol basis or a functionality basis.

This way it is much easier to maintain the code as well as for an end user to understand what is going on.
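The layering proposed above could be sketched roughly as follows. This is only an illustration of the idea: the class names mirror the suggested header names, while the members (`id`, `kind`, `canSend`, `canReceive`) are hypothetical and not part of FDRS.

```cpp
#include <cstdint>
#include <string>

// Everything common to all node types lives in the base class.
class BaseNode {
 public:
  explicit BaseNode(uint16_t id) : id_(id) {}
  virtual ~BaseNode() = default;

  uint16_t id() const { return id_; }
  virtual std::string kind() const = 0;
  virtual bool canSend() const = 0;
  virtual bool canReceive() const = 0;

 private:
  uint16_t id_;
};

// A sensor only sends readings.
class SensorNode : public BaseNode {
 public:
  using BaseNode::BaseNode;
  std::string kind() const override { return "sensor"; }
  bool canSend() const override { return true; }
  bool canReceive() const override { return false; }
};

// An actuator (the thread's "controller") only receives.
class ActuatorNode : public BaseNode {
 public:
  using BaseNode::BaseNode;
  std::string kind() const override { return "actuator"; }
  bool canSend() const override { return false; }
  bool canReceive() const override { return true; }
};

// A gateway does both.
class GatewayNode : public BaseNode {
 public:
  using BaseNode::BaseNode;
  std::string kind() const override { return "gateway"; }
  bool canSend() const override { return true; }
  bool canReceive() const override { return true; }
};
```

Code that only needs the shared parts can then take a `BaseNode &` and work with any node type.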

Being honest, the internal ESP-NOW roles are pretty cryptic so I just set every device to "Combo" and they work fine.

I definitively have no issues with using combo.

We don't need to worry about conflicting terminology with that, From my understanding though, "Combo" does mean that the ESP-NOW device is configured to both send and receive... which is what we're going for.

I agree that combo is the internal ESP-NOW naming convention for send + receive at the same time. Again: It's a feature of the protocol we can use within specific node types or not. It is not the base for naming nodes. 😘

The USP of FDRS is that it combines LoRa and ESP-NOW for sensors at different distances from their corresponding gateway. Adding a feature (simultaneously sending and receiving) which is only supported by one specific protocol is killing this USP.

Actually, it doesn't even work on ESP-NOW yet. 😂 I want to extend it to LoRa once the kinks are worked out.

I'm a little bit concerned that you are heading down a one-way street by starting to implement something (bi-directional communication) which is only supported by one protocol (ESP-NOW). I think a better way would be to first discuss how the negotiation process can be done for half-duplex negotiation (suitable for all protocols). If we have figured out the general algorithm one can go the extra mile to adapt it for ESP-NOW (if needed at all).

And again - before adding more features I think a clean-up (structuring) of the existing code would be the first beneficial step, as also stated by @aviateur17 in #84

Gulpman avatar Aug 01 '22 09:08 Gulpman

A node (with a fixed ID?)

The ID doesn't matter to the gateway. When it is paired to the gateway, the controller will receive many DataReading, but only care about its own.

(for a specific time I guess and then repeats this until hell freezes over?).

Currently it does it once. The idea is that it will revert to a preset GTWY_MAC if it can't find another.

All gateways who received this request add the node's MAC to a party attendants wait list.

All gateways respond to the broadcast ping, which I actually do need to deal with. However, the add (pair/subscribe) request is sent from the controller specifically to one gateway, so no other gateway should be adding the peer to its list.

They respond to this specific node with a "hey, you are on the invite list, but hurry up: You must agree to this invite within x time!

Not sure what your misunderstanding is here, but the timeout is to say "okay you can subscribe for now, but your subscription ends in 5 minutes". The reasoning is that ESP-NOW only has so many entries available in its internal peer list. So if the controller isn't around anymore, then it can give its spot to a new device.

(I guess our node can be a sensor or gateway or whatever node is being added to the network?)

No, it will never be a gateway. Gateways, for the foreseeable future, will be statically configured. Technically it could be, but it would only add confusion. Gateways are the pipeline, sensors and controllers tap into the pipeline.

Question is how to prevent duplicate IDs - one idea is to use part of the gateway's ID/MAC as part of the creating process for the node's ID (with some fancy math).

I skipped the rest, because I think there is a bigger misunderstanding. There is actually no problem with duplicate READING_IDs. The only ID the gateway is concerned with is the MAC address of the device that wants to pair. Once it pairs, the gateway blindly sends all of its data to the controller. If some of this data has the controller's same READING_ID, then the device does something.

A node is the most general representation in a network. If we get more specific there are sensor nodes and gateway nodes and... This means gateways ARE nodes. Sensors ARE nodes. They have everything a node has + something specific.

I mean yes, but also "node" is a pretty general term outside of network terminology and since we are already making up various terminology within FDRS, this wouldn't be a big jump for the user.

So - whatever name it will be for "your node" / BIOD / Combo / ... - it is a feature which can be implemented and (de-)activated within a specific node type for the purpose of negotiation (or other purposes).

Okay now I really think you're misunderstanding.

  • "Sensor" is a device that sends readings and never listens for packets from the gateways.
  • "Controller" would be a device that listens for packets from the gateway, but never sends any.
  • A "node" was a generic term for a device that could accomplish both things.

actuators ACT on a signal - it is not physical motion but events!

This is simply not true. In English usage, actuators ACTUATE on a signal, or physically move something. Specifically, they usually open or close a valve.

The USP of FDRS is that it combines LoRa and ESP-NOW for sensors at different distances from their corresponding gateway. Adding a feature (simultaneously sending and receiving) which is only supported by one specific protocol is killing this USP.

The feature will be extended to LoRa. Anything possible on ESP-NOW will be available on LoRa, and vice versa. This is just a proof of concept that we can expand and perfect moving forward. I don't think anything too special will need to be done with LoRa. I think it may be appropriate to amend packet structure again to add multiple recipients (multicast), however.

If we have figured out the general algorithm one can go the extra mile to adapt it for ESP-NOW (if needed at all).

I have had the general algorithm in mind for months now, and it is explained elsewhere: basically the controller subscribes to the gateway, which then delivers all of the "outgoing" traffic to this device. The controller then sorts through it. All that the gateway needs to do is keep a list of subscriber addresses and have a function that sends to all of those addresses at once. There is an ESP-NOW method that makes this very easy, and since we actually have control of our own LoRa packet protocol, we can implement it there as well. I was hoping @aviateur17 might be interested in that. Technically ESP-NOW is our less flexible protocol. Even with that in mind, we should generally emphasize that ESP-NOW should be used wherever possible.
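As a rough illustration of the controller side of this scheme, the sketch below filters a batch of readings down to the ones carrying the controller's own READING_ID. The `DataReading` layout (a float value, a 16-bit id, an 8-bit type) and the helper name `selectMine` are assumptions made for the example. On the gateway side, the ESP-NOW method alluded to is presumably `esp_now_send` with a NULL peer address, which sends to every peer in the internal peer list.

```cpp
#include <cstdint>
#include <vector>

// Assumed reading layout: value, reading ID, data type.
struct DataReading {
  float d;
  uint16_t id;
  uint8_t t;
};

// Hypothetical helper: the controller receives everything the gateway
// forwards and keeps only the readings addressed to its own READING_ID.
std::vector<DataReading> selectMine(const std::vector<DataReading> &incoming,
                                    uint16_t my_id) {
  std::vector<DataReading> mine;
  for (const DataReading &r : incoming) {
    if (r.id == my_id) mine.push_back(r);
  }
  return mine;
}
```

Because the filtering happens on the controller, the gateway never needs to know which READING_ID a subscriber cares about; it only tracks MAC addresses.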

I know you guys want to do the reorganization, and I do too. I was hoping to wrap this up to at least work as intended and then get the organization done and the functions split out into their own files. I'll be getting back to troubleshooting today and tomorrow.

timmbogner avatar Aug 01 '22 14:08 timmbogner

Okay, so I found the major hang-up I was having (misplaced brackets), and I am happy with this branch for now. However, I ran into issues with versions and whatnot and decided to switch things around. Let us now, at this moment, make all of the file structure changes that we want to make. I will work my new branch into the restructured one later on.

This means doing the src folder if you want, putting the FDRS_Sensor/Gateway in examples, removing redundancies, and splitting the functions by interface. I plan to do that last one after everything else is in its right place.

timmbogner avatar Aug 01 '22 23:08 timmbogner

Question:

  • The node must send another add packet before this timeout, or the gateway will remove the peer from the internal peer list.

Instead of having the node send another add packet: if data such as FDRS data is sent from the node to the gateway, shouldn't that qualify as a refresh? And if no FDRS data is sent, then send an add packet?

In the gateway:

  • What happens if we receive data from a node that isn't "added"? Do we just drop the data? Do we tell the node that it needs to be added? Do we automatically add the node and accept the data if there is availability?

  • If the node timeout expires do we tell the node that it has dropped from the list or just do nothing? I guess if the node is powered off and saving battery that there is nothing we can do until it tries to send additional data.
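One possible answer to these questions, sketched purely for illustration (the thread does not settle on a policy): any packet from a known peer refreshes its timestamp, an add packet from an unknown peer registers it if there is room, and data from an unknown peer leaves the list untouched. The names `SubscriberList`, `PacketKind`, and `Action` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum class PacketKind { Add, Data };
enum class Action { Added, Refreshed, Ignored };

struct Subscriber {
  uint8_t mac[6];
  uint32_t last_seen_ms;
};

class SubscriberList {
 public:
  explicit SubscriberList(std::size_t capacity) : capacity_(capacity) {}

  // Decides what a received packet does to the subscriber list.
  Action onPacket(const uint8_t mac[6], PacketKind kind, uint32_t now_ms) {
    for (Subscriber &s : subs_) {
      if (std::memcmp(s.mac, mac, 6) == 0) {
        s.last_seen_ms = now_ms;  // any traffic counts as a refresh
        return Action::Refreshed;
      }
    }
    // Unknown sender: only an explicit add packet can register it.
    if (kind != PacketKind::Add || subs_.size() >= capacity_) {
      return Action::Ignored;
    }
    Subscriber s{};
    std::memcpy(s.mac, mac, 6);
    s.last_seen_ms = now_ms;
    subs_.push_back(s);
    return Action::Added;
  }

  std::size_t size() const { return subs_.size(); }

 private:
  std::size_t capacity_;
  std::vector<Subscriber> subs_;
};
```

With this policy a node that sends data regularly never needs a second add packet, while a receive-only device still has to renew explicitly.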

aviateur17 avatar Aug 02 '22 11:08 aviateur17

(I guess our node can be a sensor or gateway or whatever node is being added to the network?)

No, it will never be a gateway. Gateways, for the foreseeable future, will be statically configured. Technically it could be, but it would only add confusion. Gateways are the pipeline, sensors and controllers tap into the pipeline.

OK, makes sense.

Question is how to prevent duplicate IDs - one idea is to use part of the gateway's ID/MAC as part of the creating process for the node's ID (with some fancy math).

I skipped the rest, because I think there is a bigger misunderstanding. There is actually no problem with duplicate READING_IDs. The only ID the gateway is concerned with is the MAC address of the device that wants to pair. Once it pairs, the gateway blindly sends all of its data to the controller. If some of this data has the controller's same READING_ID, then the device does something.

OK, I have to look into your new code. I just wanted to ensure a sensor only accepts an invite from one of "his" gateways. If two or more FDRS networks are running in range, it should be ensured that only sensors and gateways from the same environment get paired. I'll hold off for now, until I have seen the new code running. I was referring to what you wrote at the beginning of this post. :)

A node is the most general representation in a network. If we get more specific there are sensor nodes and gateway nodes and... This means gateways ARE nodes. Sensors ARE nodes. They have everything a node has + something specific.

I mean yes, but also "node" is a pretty general term outside of network terminology and since we are already making up various terminology within FDRS, this wouldn't be a big jump for the user.

Let's agree we disagree. Nodes are the general participants of a network and for FDRS that would be a sensor, gateway or controller for now. Fairly simple.

I wanted to write that I will not stress it any longer, but I have to do so one more time further below. After that I will not, promised. :)

  • "Sensor" is a device that sends readings and never listens for packets from the gateways.
  • "Controller" would be a device that listens for packets from the gateway, but never sends any.
  • A "node" was a generic term for a device that could accomplish both things.

Let's assume we agree on sensor and controller. Then a node is still what it is by (technical) definition - a point in a network. For FDRS that would be a sensor, but also a controller, and also a gateway, and whatever node type there may be in the future. It just makes no sense to redefine this well-known and established term. It only leads to confusion.

With the above definition of yours - what about naming your Sensor, Controller and node definitions as Sensor, Actuator and Controller? That would make sense for me.

Again, I will not stress this any longer, promised. :)

actuators ACT on a signal - it is not physical motion but events!

This is simply not true. In English usage, actuators ACTUATE on a signal, or physically move something. Specifically, they usually open or close a valve.

Yes, that was not 100% correct, sorry. It is physical motion and events.

I have had the general algorithm in mind for months now, and it is explained elsewhere: [...]

I think I'll act chill and lay low (hopefully I translated this correctly) at first and check out the changes you have done.

I know you guys want to do the reorganization, and I do too. I was hoping to wrap this up to at least work as intended and then get the organization done and the functions split out into their own files. I'll be getting back to troubleshooting today and tomorrow.

It's like a building: You can only build high on a good foundation. :)

Gulpman avatar Aug 02 '22 11:08 Gulpman