gz-sim icon indicating copy to clipboard operation
gz-sim copied to clipboard

Possible mistakes in RFComms plugin

Open mkrizmancic opened this issue 1 year ago • 2 comments

Hi! I believe I have found a couple of minor bugs (as in, easily fixed) in the RFComms plugin which could have non-negligible effects in cases where a precise communication model is needed.

Environment

  • OS Version: Ubuntu 20.04
  • Source or binary build? Source, ign-gazebo6, 721bc35
No log to report.

Description

1. Incorrect usage of standard deviation and variance in the normal distribution.

Input configuration for RFComms plugin takes sigma (standard deviation) for parametrizing normal distribution (see here). The structure for holding distribution parameters (RFPower) is defined with mean and variance (see here). In the structure initialization, sigma is given instead of variance (see here), and later when initializing random number generator with normal distribution parameters, a square root is taken from this variance which is actually sigma leading to unexpected values (see here). Proposed solution: In the structure initialization, set the actual value of variance, i.e., sigma squared, like

- return {_txPower - kPL, this->rangeConfig.sigma};
+ return {_txPower - kPL, pow(this->rangeConfig.sigma, 2.)};

2. Incorrect calculation of bandwidth limitation

Bandwidth limitation for sending messages is defined here. It's probably the easiest to see the problem with an example. Let's assume we are trying to send a message with the size of 10 bytes (80 bits) every second. That makes our desired bandwidth 80 bits/second. If the maximum bit-rate is set to the same 80 bits/second or a little more, say 90 bits/second, messages should be transmitted as intended. The plugin keeps track of the time and size of each message sent in the array _txState.bytesSent. Every time we attempt to send a message, the plugin goes through the list and removes all messages in that array that are older than the epoch length (1 second). Consequently, the total number of bytes sent in the current epoch is updated for each removed message. So, let's say we are trying to send a message at time t=3.000 s. This time is stored as double now = _txState.timeStamp;. Let's say that in _txState.bytesSent we have the following list of past messages:

// {timeStamp, sizeInBytes}
{1.000, 10},
{2.000, 10}

In the while loop, the plugin is removing each message for which _txState.bytesSent.front().first < now - this->epochDuration, and is also updating _txState.bytesSentThisEpoch. now - this->epochDuration == 2.000 and time of the first message in _txState.bytesSent is 1.000 so it gets removed. However, for the second message, the plugin is checking the condition 2.000 < 2.000 which of course is false, and this message does not get removed. The size of the message sent at t=2.000 s and our new message at t=3.000 s give that _txState.bytesSentThisEpoch is 20 bytes (160 bits) which is more than the defined maximum. In conclusion, we are sending messages precisely at the bandwidth limit or even a little bit below it, but the plugin forcefully drops some of them because it thinks we are above the limit. Proposed solution: Use "<=" instead of "<" in time comparison, like

- _txState.bytesSent.front().first < now - this->epochDuration)
+ _txState.bytesSent.front().first <= now - this->epochDuration)

The same thing should be done on the receiving end.

Note: I know that float comparison is a little tricky and that two seemingly equal floats do not necessarily compare to equal, but I have confirmed the described problem happens in the simulation, and changing the comparison fixes the problem.

3. (Possibly) incorrect message size calculation.

My use case for this plugin comes from the MBZIRC competition and the idea was to simulate realistic wireless communication. The teams had to make sure to serialize and compress their messages and manage possible requirements for multi-hopping. From the description on the wiki I understand that the actual payload (our serialized message) is stored in the data field of the Dataframe message. Other fields of that message are used so that Ignition can correctly apply the communication model and transmit the message to the correct robot.

From this, I would say that the size of the data field is what should actually influence the communication model. However, in the communication model, the size of the whole message is used (see here)

#if GOOGLE_PROTOBUF_VERSION < 3004001
          itSrc->second, itDst->second, msg->ByteSize());
#else
          itSrc->second, itDst->second, msg->ByteSizeLong());
#endif

Furthermore, the data field contains the already serialized message in form of a byte array. This means that getting the number of bytes in a message we are attempting to send should be as simple as just getting the length of an array. Proposed solution:

- itSrc->second, itDst->second, msg->ByteSizeLong());
+ itSrc->second, itDst->second, msg->data().size());

Again, I have implemented that change and I can confirm that the plugin behaves as expected then. The only thing I'm not sure about is the difference in protobuf version, i.e., can the same change be used regardless of the protobuf version or should I use some different function? I would say yes because it seems that bytes field of protobuf message always gets converted to std::string in C++, regardless of version.

Final remarks

I know these are small changes but for precise experiments with limited communication, they could lead to strange and unexpected behavior. Since I have already implemented points 1. and 2., I would be happy to open a pull request, and with a confirmation of the logic behind point 3., I could finalize that one as well.

mkrizmancic avatar Sep 27 '22 12:09 mkrizmancic

Thanks for the detailed report! I'll come back soon with my analysis.

caguero avatar Sep 27 '22 21:09 caguero

The three points look reasonable to me. According to (3), I don't think we need to check the protobuf version anymore. I'm happy to review the pull request when you submit the patch. Thanks!

caguero avatar Sep 28 '22 20:09 caguero