settings: nvs: Fix first write issue with cache
Issue:
When the settings NVS cache is disabled and settings_nvs_save is called, the function reads all stored setting name entries from NVS until it either finds the desired setting name entry or reaches the last stored setting name entry.
With the settings NVS cache enabled, settings_nvs_save runs through the cached setting name entries first. If a cached entry matches the desired one, it immediately writes the new setting value to the NVS entry that corresponds to the cached setting name entry.
However, if the setting name entry is not found in the cache (which is the case for a new entry), settings_nvs_save reads all stored setting name entries from NVS again. This means that even if the number of stored entries in the settings is less than the cache size, for each new setting entry to be stored settings_nvs_save will first run through the cache, then read all stored setting name entries from NVS, and only then pick the next free name id for this new setting name entry and finally store the new setting entry.
This makes the cache inefficient for every new entry to be stored, even when the cache is always large enough to hold all setting entries that will be stored in NVS.
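The lookup order described above can be sketched with a toy model. This is not the Zephyr implementation: struct toy_store, toy_save_unfixed, toy_reads_for_new_entries and the sizes are hypothetical names chosen for illustration, names are plain integers instead of hashed strings, and one simulated NVS read is counted per stored name entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_SIZE  8 /* hypothetical; stands in for the configured cache size */
#define TOY_MAX_ENTRIES 8 /* hypothetical capacity of the simulated NVS */

struct toy_store {
	uint16_t stored[TOY_MAX_ENTRIES]; /* names "stored in NVS" */
	size_t stored_cnt;
	uint16_t cache[TOY_CACHE_SIZE];   /* cached names */
	size_t cache_cnt;
	unsigned int nvs_reads;           /* simulated NVS name reads */
};

/* Returns true if the name already existed, false if a new entry was created. */
static bool toy_save_unfixed(struct toy_store *s, uint16_t name)
{
	/* Step 1: run through the cache. A hit means the value can be written directly. */
	for (size_t i = 0; i < s->cache_cnt; i++) {
		if (s->cache[i] == name) {
			return true;
		}
	}

	/* Step 2: cache miss, so every stored name entry is read from NVS. */
	for (size_t i = 0; i < s->stored_cnt; i++) {
		s->nvs_reads++;
		if (s->stored[i] == name) {
			return true;
		}
	}

	/* Step 3: only now is the entry known to be new; store and cache it. */
	assert(s->stored_cnt < TOY_MAX_ENTRIES);
	s->stored[s->stored_cnt++] = name;
	if (s->cache_cnt < TOY_CACHE_SIZE) {
		s->cache[s->cache_cnt++] = name;
	}
	return false;
}

/* Stores n distinct new names and returns the number of simulated NVS reads. */
static unsigned int toy_reads_for_new_entries(unsigned int n)
{
	struct toy_store s = {0};

	for (unsigned int i = 0; i < n; i++) {
		toy_save_unfixed(&s, (uint16_t)(i + 1));
	}
	return s.nvs_reads;
}
```

In this model, storing n new entries costs 0 + 1 + ... + (n - 1) = n(n - 1)/2 reads even though the cache never overflows. For n = 255 that is 32385 simulated reads; the real cost per read depends on the NVS backend and is not modeled here.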
Use-case:
Bluetooth Mesh has a Replay Protection List (RPL), which keeps the sequence numbers of all nodes the device has received messages from. The RPL is stored persistently in NVS. The setting name entry is the source address of the node and the setting value entry is the sequence number. A common use case is a fairly large RPL (for example, 255 entries).
With the current settings NVS cache implementation, every time the node stores a new RPL entry in settings (that is, on the first message received from a particular source address), settings_nvs_save will always check the cache first, then also read all stored entries in NVS, and only then figure out that this is a new entry. With every new RPL entry to be stored, this search time increases. This results in much worse performance than when the corresponding entry is already stored. E.g. on nRF52840, with a bare minimal mesh stack configuration, when the cache is bigger than the number of stored entries or close to it, storing 255 RPL entries takes ~25 seconds. A subsequent store of the same 255 RPL entries takes ~2 seconds with the cache.
Solution:
This commit improves the behavior of the first write by bypassing the reading from NVS if the following conditions are met:
- settings_nvs_load was called,
- the cache was not overflowed (the cache is bigger than the number of stored entries).
As long as these 2 conditions are met, it is safe to skip reading from NVS, pick the next free name id and write the value immediately.
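The two conditions can be illustrated with a small self-contained model. It is a sketch under stated assumptions, not the code of this commit: struct toy_store, toy_save, toy_reads_for_new_entries, the loaded field and the sizes are hypothetical, names are plain integers, and "not overflowed" is modeled as the number of stored entries not exceeding the cache size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_SIZE  4  /* hypothetical cache size */
#define TOY_MAX_ENTRIES 16 /* hypothetical capacity of the simulated NVS */

struct toy_store {
	uint16_t stored[TOY_MAX_ENTRIES]; /* names "stored in NVS" */
	size_t stored_cnt;
	uint16_t cache[TOY_CACHE_SIZE];   /* ring of cached names */
	size_t cache_next;                /* next ring slot to overwrite */
	bool loaded;                      /* models "load was called" */
	unsigned int nvs_reads;           /* simulated NVS name reads */
};

/* Returns true if the name already existed, false if a new entry was created. */
static bool toy_save(struct toy_store *s, uint16_t name)
{
	/* Every stored entry was added to the ring, so this many slots are valid. */
	size_t valid = s->stored_cnt < TOY_CACHE_SIZE ? s->stored_cnt : TOY_CACHE_SIZE;

	for (size_t i = 0; i < valid; i++) {
		if (s->cache[i] == name) {
			return true; /* cache hit */
		}
	}

	/* Assumption of this sketch: the cache mirrors NVS completely when a
	 * load happened and no more entries exist than the cache can hold.
	 */
	bool cache_complete = s->loaded && s->stored_cnt <= TOY_CACHE_SIZE;

	if (!cache_complete) {
		/* Fall back to reading every stored name entry. */
		for (size_t i = 0; i < s->stored_cnt; i++) {
			s->nvs_reads++;
			if (s->stored[i] == name) {
				return true;
			}
		}
	}

	/* New entry: take the next free slot and cache it. */
	assert(s->stored_cnt < TOY_MAX_ENTRIES);
	s->stored[s->stored_cnt++] = name;
	s->cache[s->cache_next] = name;
	s->cache_next = (s->cache_next + 1) % TOY_CACHE_SIZE;
	return false;
}

/* Stores n distinct new names and returns the number of simulated NVS reads. */
static unsigned int toy_reads_for_new_entries(unsigned int n, bool loaded)
{
	struct toy_store s = {0};

	s.loaded = loaded;
	for (unsigned int i = 0; i < n; i++) {
		toy_save(&s, (uint16_t)(i + 1));
	}
	return s.nvs_reads;
}
```

With a cache of 4 slots, storing 4 new names after a load costs no simulated NVS reads, while the same sequence without a prior load costs 6. Once more names are stored than the cache can hold, the model falls back to the full scan.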
Without the fix (first write ~25 seconds):
*** Booting Zephyr OS build v3.6.0-rc2-113-gaa28f00370a7 ***
Initializing...
Bluetooth initialized
Mesh initialized
Self-provisioning with address 0x0001
Provisioned and configured!
*** storing over 255 RPL entries completed ***
total calculated: 24747ms, total measured: 24756ms
entry max: 348ms, entry min: 13ms, entry middle: 277ms
**********************************************
*** storing over 255 RPL entries completed ***
total calculated: 1678ms, total measured: 1686ms
entry max: 9ms, entry min: 4ms, entry middle: 4ms
**********************************************
*** storing over 255 RPL entries completed ***
total calculated: 1793ms, total measured: 1804ms
entry max: 99ms, entry min: 3ms, entry middle: 5ms
**********************************************
With the fix (first write ~4 seconds):
*** Booting Zephyr OS build v3.6.0-rc2-113-gaa28f00370a7 ***
Initializing...
Bluetooth initialized
Mesh initialized
Self-provisioning with address 0x0001
Provisioned and configured!
*** storing over 255 RPL entries completed ***
total calculated: 4121ms, total measured: 4127ms
entry max: 109ms, entry min: 13ms, entry middle: 18ms
**********************************************
*** storing over 255 RPL entries completed ***
total calculated: 1701ms, total measured: 1703ms
entry max: 19ms, entry min: 4ms, entry middle: 4ms
**********************************************
*** storing over 255 RPL entries completed ***
total calculated: 1780ms, total measured: 1787ms
entry max: 98ms, entry min: 4ms, entry middle: 5ms
**********************************************
@Laczen, when a new entry is stored in settings, it is cached regardless of the hash collision (settings_nvs_cache_add always adds a new entry). Until cache_next wraps around, provided that all settings were loaded before, the cache holds all stored entries. Therefore, the second can only happen if the cache was overflowed.
when a new entry is stored in settings, it is cached regardless of the hash collision (settings_nvs_cache_add always adds a new entry).
The same applies to loading, because loading also uses settings_nvs_cache_add.
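As a rough illustration of the argument about cache_next, here is a minimal ring-buffer sketch. It is not the Zephyr code: struct toy_cache, toy_cache_add, toy_cache_has_id and the cache size are hypothetical, and only the field name cache_next is borrowed from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_SIZE 4 /* hypothetical cache size */

struct toy_cache_entry {
	uint16_t name_hash;
	uint16_t name_id;
};

struct toy_cache {
	struct toy_cache_entry slots[TOY_CACHE_SIZE];
	size_t cache_next; /* next slot to overwrite */
	size_t used;       /* number of valid slots */
};

/* Always appends, even if another slot already holds the same hash. */
static void toy_cache_add(struct toy_cache *c, uint16_t name_hash, uint16_t name_id)
{
	assert(c->cache_next < TOY_CACHE_SIZE);
	c->slots[c->cache_next].name_hash = name_hash;
	c->slots[c->cache_next].name_id = name_id;
	c->cache_next = (c->cache_next + 1) % TOY_CACHE_SIZE;
	if (c->used < TOY_CACHE_SIZE) {
		c->used++;
	}
}

/* Returns true if some valid slot refers to the given name id. */
static bool toy_cache_has_id(const struct toy_cache *c, uint16_t name_id)
{
	for (size_t i = 0; i < c->used; i++) {
		if (c->slots[i].name_id == name_id) {
			return true;
		}
	}
	return false;
}
```

In this sketch, entries with colliding hashes each occupy their own slot, so no entry is lost until the fifth add wraps cache_next around and overwrites the oldest slot.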
What I think I should change is to set cf->loaded here before break:
https://github.com/zephyrproject-rtos/zephyr/blob/5dad8d72ee739df5ef1b0369a5cf45ff800dec12/subsys/settings/src/settings_nvs.c#L137-L139
This is because when loading with a callback that returns an error, settings_nvs_load will exit early:
https://github.com/zephyrproject-rtos/zephyr/blob/5dad8d72ee739df5ef1b0369a5cf45ff800dec12/subsys/settings/src/settings.c#L206-L209
https://github.com/zephyrproject-rtos/zephyr/blob/5dad8d72ee739df5ef1b0369a5cf45ff800dec12/subsys/settings/src/settings_nvs.c#L191-L197
@Laczen, when a new entry is stored in settings, it is cached regardless of the hash collision (settings_nvs_cache_add always adds a new entry). Until cache_next wraps around, provided that all settings were loaded before, the cache holds all stored entries. Therefore, the second can only happen if the cache was overflowed.
You are right, that is why I removed my comment.