[BUG] Storage: Database is locked
Describe the bug I'm using a pretty default config, with a SQLite storage configuration.
"storage": {
"type": "sqlite",
"read_records_count": 100,
"max_records_count": 100000,
"data_folder_path": "./data/",
"max_file_count": 10,
"max_read_records_count": 10,
"max_records_per_file": 10000,
"data_file_path": "/etc/thingsboard-gateway/data/data.db",
"messages_ttl_check_in_hours": 1,
"messages_ttl_in_days": 7,
"ts": 1717605380034
},
It works OK for a while, but sometimes I start getting this error:
python3[1034]: 2024-06-29 07:26:47 - |ERROR| - [database_connector.py] - database_connector - execute - 66 - database is locked
python3[1034]: Traceback (most recent call last):
python3[1034]: File "/usr/lib/python3/dist-packages/thingsboard_gateway/storage/sqlite/database_connector.py", line 62, in execute
python3[1034]: return self.connection.execute(*args)
python3[1034]: sqlite3.OperationalError: database is locked
And what's worse is that the devices that display this behavior start to loop on it in a high frequency. I suspect that it then tries to report the error to the TB server for incrementing the error counter our something like this. But as the frequency is super high, I start to get 10x data volume, eating out my API quota and triggering Rate Limits.
Connector name: Database connector
Error traceback:
python3[1034]: Traceback (most recent call last):
python3[1034]: File "/usr/lib/python3/dist-packages/thingsboard_gateway/storage/sqlite/database_connector.py", line 62, in execute
python3[1034]: return self.connection.execute(*args)
python3[1034]: sqlite3.OperationalError: database is locked
Versions:
- OS: Ubuntu 22.04.4 LTS
- Thingsboard IoT Gateway version 3.4.5
- Python version: 3.7
Hi @elgutierrez, thanks for your interest in ThingsBoard IoT Gateway! Can you please provide your config files? Because for now, I can't reproduce your issue. Also, could you try using the latest version (3.5.1) and tell us about the result?
Hey @samson0v , thanks for your response. There's nothing fancy with the config
{
"thingsboard": {
"host": "thingsboard.cloud",
"port": 1883,
"remoteShell": true,
"remoteConfiguration": true,
"statistics": {
"enable": true,
"statsSendPeriodInSeconds": 3600
},
"deviceFiltering": {
"enable": false,
"filterFile": "list.json"
},
"maxPayloadSizeBytes": 1024,
"minPackSendDelayMS": 200,
"minPackSizeToSend": 500,
"checkConnectorsConfigurationInSeconds": 60,
"handleDeviceRenaming": true,
"security": {
"type": "accessToken",
"accessToken": "xxxxxxxxxxxxxxxx"
},
"qos": 1,
"checkingDeviceActivity": {
"checkDeviceInactivity": false,
"inactivityTimeoutSeconds": 200,
"inactivityCheckPeriodSeconds": 500
},
"ts": 1717695736321
},
"storage": {
"type": "sqlite",
"read_records_count": 1000,
"max_records_count": 10000,
"data_folder_path": "./data/",
"max_file_count": 10,
"max_read_records_count": 10,
"max_records_per_file": 10000,
"data_file_path": "./data/data.db",
"messages_ttl_check_in_hours": 1,
"messages_ttl_in_days": 3,
"ts": 1717695736321
},
"grpc": {
"enabled": true,
"serverPort": 9595,
"keepaliveTimeMs": 10000,
"keepaliveTimeoutMs": 5000,
"keepalivePermitWithoutCalls": true,
"maxPingsWithoutData": 0,
"minTimeBetweenPingsMs": 10000,
"minPingIntervalWithoutDataMs": 5000,
"keepAliveTimeMs": 10000,
"keepAliveTimeoutMs": 5000,
"ts": 1703599972698
},
"connectors": [
{
"type": "bacnet",
"name": "Bacnet",
"configuration": "bacnet.json"
}
]
}
It's hard to tell when the issue appears... it starts running OK for a while and then it errors out.
And even though that's a big problem, the worst part is that the gateway enters in a loop and start sending tons of messages, triggering the rate limits.
I can confirm that we had this issue also using 3.5.1
@elgutierrez as I can see from the commits history, you need to use the version from the master branch.
So it's not released yet? Do you have an ETA for the release?
@elgutierrez on this week
On another note, I realized that having an error in the gateway (could be this issue with the DB, or maybe something else), triggers a behavior where the gateway starts looping very fast, retrying the failed command and getting the same issue. In those cases, the API is bombarded with tens/hundreds of requests per second, leading to a rate limit issue. I suspect this is caused by the way the gateway reports errors. Is there a related open issue that you know of?
Hi @elgutierrez,
Yes, we know about situations like this and it relates to the implementation of base library, at the moment we are trying to find the solution for this issue. We can just suggest to disable remote logging, to avoid a lot of messages. At the same time please describe when did you encounter the issue?
Remote logging is already disabled, it was the 1st thing I did. I guess it's more likely to be related with the error counters like
Bacnet_ERRORS_COUNT, storage_ERRORS_COUNT, etc
I have adjusted the logger, to send errors count every 5 seconds instead of on each error message. You can try to use a version from the master branch.
Hi @elgutierrez, any updates?
Hey, unfortunately I can't update my field devices with code from master branch, so I'm waiting for the new release so I can update it. Do we have an updated ETA?