fiware-cygnus
Cygnus merges notifications when receiving a large volume of them in a short period of time.
Today I've been hit with what seems to be a weird bug in Cygnus. A little background first:
We have a device that, instead of sending measurements as they are collected, accumulates them (one every ~10 minutes, each including the time of the measurement) and then delivers all of them in a batch at specific points in time (every several hours). These batches are sent to the iotagent-json, which updates the matching entity in Orion, and a notification is sent to Cygnus for every measurement in the batch. So far, so good. Cygnus is configured with PostgreSQL and Hadoop as backends. Orion notifications are configured with no throttling, as we want to store every measurement.
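The subscription that delivers notifications to Cygnus simply omits the throttling field; roughly something like this (a minimal NGSIv2 sketch, where the entity id/type, service headers and Cygnus endpoint are only illustrative):
curl -s -X POST 'http://orion:1026/v2/subscriptions' \
  -H 'Content-Type: application/json' \
  -H 'Fiware-Service: test' \
  -H 'Fiware-ServicePath: /sensors' \
  -d '{
    "description": "Notify Cygnus of every change (no throttling on purpose)",
    "subject": { "entities": [ { "id": "sensor01", "type": "mysensor" } ] },
    "notification": { "http": { "url": "http://cygnus:5050/notify" } }
  }'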
For the PostgreSQL backend we have configured:
cygnus-ngsi.sinks.postgresql-sink.attr_persistence = row
cygnus-ngsi.sinks.postgresql-sink.data_model = dm-by-service-path
As one of the attributes is the date of the measurement, we want to obtain all the attributes in the database that were measured on a given date. I thought that fetching all the attribute rows that share the recvtimets of that date attribute would give me that (the intended query is sketched below). Then I saw something like this:
postgres=# select * from test.sensors where recvtimets='1529406607326' and attrname='measurement_date';
recvtimets | recvtime | fiwareservicepath | entityid | entitytype | attrname | attrtype | attrvalue | attrmd
---------------+--------------------------+-------------------+----------+------------+------------------+----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1529406607326 | 2018-06-19T11:10:07.326Z | /sensors | sensor01 | mysensor | measurement_date | string | 1529385000000 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-19T11:10:05.451Z"},{"name":"formato","type":"string","value":"epoch"},{"name":"nombre","type":"string","value":"fecha/hora"},{"name":"unidad","type":"string","value":"ms"}]
1529406607326 | 2018-06-19T11:10:07.326Z | /sensors | sensor01 | mysensor | measurement_date | string | 1529377200000 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-19T11:10:05.473Z"},{"name":"formato","type":"string","value":"epoch"},{"name":"nombre","type":"string","value":"fecha/hora"},{"name":"unidad","type":"string","value":"ms"}]
1529406607326 | 2018-06-19T11:10:07.326Z | /sensors | sensor01 | mysensor | measurement_date | string | 1529386800000 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-19T11:10:05.463Z"},{"name":"formato","type":"string","value":"epoch"},{"name":"nombre","type":"string","value":"fecha/hora"},{"name":"unidad","type":"string","value":"ms"}]
(3 rows)
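For reference, the kind of query I had in mind was something like this (just a sketch, using the table and attribute names from the example above and one of the epoch dates shown there):
-- Fetch every attribute row stored with the same recvtimets as a
-- given measurement_date value.
select a.*
from test.sensors a
join test.sensors d on d.recvtimets = a.recvtimets
where d.attrname = 'measurement_date'
  and d.attrvalue = '1529385000000';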
Three different measurements ended up with the same recvtimets. At first I thought it had something to do with the iotagent, but every measurement was sent independently and its log shows no errors. Checking Orion, everything was OK too. Then I checked the Cygnus log and there was one notification received for each of the measurements, so that is fine as well. The data stored in the Hadoop backend shows the same problem.
So somewhere between Cygnus receiving the notifications and Cygnus storing them in the backend, it merges some of them as if they were received as a single notification.
I've been able to reproduce this behaviour with the following setup:
- Mosquitto MQTT Broker
- Iotagent-json
- Orion Context Broker
- Cygnus
- PostgreSQL
Using the sensor from the step-by-step guide, I then sent temperature measurements with:
for i in $(seq 1 200) ; do mosquitto_pub -t /1234/sensor01/attrs -m "{\"t\": $i}" -u admin -P admin ; done
And checking the database, some notifications have been merged:
postgres=# select * from myhome.environment where recvtimets='1529934582282';
recvtimets | recvtime | fiwareservicepath | entityid | entitytype | attrname | attrtype | attrvalue | attrmd
---------------+--------------------------+-------------------+------------------+-------------+-------------+----------+-----------+------------------------------------------------------------------------------
1529934582282 | 2018-06-25T13:49:42.282Z | /environment | LivingRoomSensor | multiSensor | Temperature | celsius | 177 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-25T13:49:41.913Z"}]
1529934582282 | 2018-06-25T13:49:42.282Z | /environment | LivingRoomSensor | multiSensor | Temperature | celsius | 169 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-25T13:49:41.856Z"}]
(2 rows)
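The merged timestamps can be located with a grouping query like this one (again just a sketch against the same table):
-- List the reception timestamps holding more than one Temperature row,
-- i.e. the notifications that were merged under a single recvtimets.
select recvtimets, count(*) as merged_rows
from myhome.environment
where attrname = 'Temperature'
group by recvtimets
having count(*) > 1
order by recvtimets;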
I've been searching, but have been unable to find anything related to this bug.
Thanks.
Regards.
Cygnus works in batch mode by default. You can try reducing the batch size (CYGNUS_POSTGRESQL_BATCH_SIZE) or decreasing the timeout (CYGNUS_POSTGRESQL_BATCH_TIMEOUT) so that different measures are stored with different recvtimets values.
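For reference, if you are editing agent.conf directly rather than using the docker environment variables, those should correspond to the sink's batch_size and batch_timeout parameters, for example (the values here are only illustrative):
cygnus-ngsi.sinks.postgresql-sink.batch_size = 1
cygnus-ngsi.sinks.postgresql-sink.batch_timeout = 5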
I think the recvtimets is generated when the notification is converted into an NGSIEvent object, before it is put into the channel, not at the sink batch-processing stage. Anyway, I tried your suggestion and there was no difference in the results; also, setting the timeout to 0 seems to disable batch processing altogether.
Thanks.
Regards.
Could you try using MySQL as the storage backend instead of PostgreSQL? Just to check whether the behaviour is the same (in which case the problem may be in the common event handler) or whether in that case it is correct (in which case the problem should be in the PostgreSQL sink).
I'm also using the HDFS backend and the same problem is present there, with the exact same values. Here's an example:
- PostgreSQL
cygnus=# select * from testservice.env where recvtimets = '1530167672972';
recvtimets | recvtime | fiwareservicepath | entityid | entitytype | attrname | attrtype | attrvalue | attrmd
---------------+--------------------------+----------------------+------------------+-------------+-------------+----------+-----------+------------------------------------------------------------------------------
1530167672972 | 2018-06-28T06:34:32.972Z | /environment/sensors | LivingRoomSensor | multiSensor | Temperature | celsius | 175 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-28T06:34:32.776Z"}]
1530167672972 | 2018-06-28T06:34:32.972Z | /environment/sensors | LivingRoomSensor | multiSensor | Temperature | celsius | 181 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-28T06:34:32.778Z"}]
(2 rows)
- HDFS
root@hadoop-namenode:/# hdfs dfs -ls -R /
drwxr-xr-x - root supergroup 0 2018-06-28 06:34 /user
drwxr-xr-x - root supergroup 0 2018-06-28 06:34 /user/root
drwxr-xr-x - root supergroup 0 2018-06-28 06:34 /user/root/myhome
drwxr-xr-x - root supergroup 0 2018-06-28 06:34 /user/root/myhome/environment
drwxr-xr-x - root supergroup 0 2018-06-28 06:34 /user/root/myhome/environment/sensors
drwxr-xr-x - root supergroup 0 2018-06-28 06:34 /user/root/myhome/environment/sensors/LivingRoomSensor_multiSensor
-rwxr-xr-x 2 root supergroup 51425 2018-06-28 06:35 /user/root/myhome/environment/sensors/LivingRoomSensor_multiSensor/LivingRoomSensor_multiSensor.txt
root@hadoop-namenode:/# hdfs dfs -cat /user/root/myhome/environment/sensors/LivingRoomSensor_multiSensor/LivingRoomSensor_multiSensor.txt | grep '2018-06-28T06:34:32.972Z'
{"recvTime":"2018-06-28T06:34:32.972Z","fiwareServicePath":"/environment/sensors","entityId":"LivingRoomSensor","entityType":"multiSensor", "Temperature":"175", "Temperature_md":[{"name":"TimeInstant","type":"ISO8601","value":"2018-06-28T06:34:32.776Z"}]}
{"recvTime":"2018-06-28T06:34:32.972Z","fiwareServicePath":"/environment/sensors","entityId":"LivingRoomSensor","entityType":"multiSensor", "Temperature":"181", "Temperature_md":[{"name":"TimeInstant","type":"ISO8601","value":"2018-06-28T06:34:32.778Z"}]}
I could try the MySQL backend later if that's still needed.
Having confirmed the same behaviour in the HDFS storage backend probably makes the additional test on MySQL unnecessary. Although it is also true that the more situations in which the issue is verified, the better. In summary, as you prefer :)
I've tested it, just in case. Same results:
mysql> select * from environment_sensors_LivingRoomSensor_multiSensor where recvtimets = '1530197123903';
+---------------+-------------------------+----------------------+------------------+-------------+-------------+----------+-----------+------------------------------------------------------------------------------+
| recvTimeTs | recvTime | fiwareServicePath | entityId | entityType | attrName | attrType | attrValue | attrMd |
+---------------+-------------------------+----------------------+------------------+-------------+-------------+----------+-----------+------------------------------------------------------------------------------+
| 1530197123903 | 2018-06-28T14:45:23.903 | /environment/sensors | LivingRoomSensor | multiSensor | Temperature | celsius | 14 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-28T14:45:23.722Z"}] |
| 1530197123903 | 2018-06-28T14:45:23.903 | /environment/sensors | LivingRoomSensor | multiSensor | Temperature | celsius | 10 | [{"name":"TimeInstant","type":"ISO8601","value":"2018-06-28T14:45:23.721Z"}] |
+---------------+-------------------------+----------------------+------------------+-------------+-------------+----------+-----------+------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
PostgreSQL and HDFS have the same values.
Which Cygnus version are you using? The one coming from the master branch?
I'm using the latest docker image available in docker hub (this build):
$ docker images --no-trunc | grep cygnus | grep latest
fiware/cygnus-ngsi latest sha256:7b834ce9560d712a200d08a1c9e05962e5fb6f9b1d1926a1765816550cc38994 39 hours ago 516MB
From the container log:
$ docker-compose logs cygnus | grep version
cygnus_1 | time=2018-06-29T07:14:27.816Z | lvl=INFO | corr=N/A | trans=N/A | srv=N/A | subsrv=N/A | comp=N/A | op=main | msg=com.telefonica.iot.cygnus.nodes.CygnusApplication[171] : Starting Cygnus, version 1.9.0_SNAPSHOT.5e63327cecadb2dfc650b83ca3c7a60ae5968bb9
I've also been hit by this bug. I'm using the HDFS sink and sending two entities with different id and type to Orion, to which Cygnus is subscribed, but Cygnus is merging both notifications and storing the data in the same text file in HDFS. I'm using Cygnus v1.10.0.