Feature request: on demand heap allocation

Open nomis opened this issue 4 years ago • 5 comments

I'm storing configuration in SPIFFS and so I have to size the DynamicJsonDocument large enough that it won't unexpectedly fail to read/write the whole configuration file.

The problem I have is that this allocation is then always quite large and made in one request from the heap, which significantly fragments the heap. I'd like it to make additional allocations as required from the heap up to a specified limit instead. I don't need it to free the memory until the document is destroyed.

I know you've discounted this on the basis that it causes more heap fragmentation but I disagree. The fragmentation of the heap is affected by the size of allocations being made. By always making one big allocation (5% of total memory) there's a significant risk that once freed the space occupied by that allocation will be reused by smaller allocations and a contiguous block of that size will no longer be available from the heap.
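The behavior requested above (grow on demand, stop at a limit, free only on destruction) can be sketched in plain C++. This is a minimal illustration, not ArduinoJson code: `ChunkedPool`, its chunk size, and its limit are hypothetical names and parameters, and error handling for the bookkeeping vector is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical pool: takes fixed-size chunks from the heap only when needed,
// never exceeds `limit` bytes in total, and frees everything on destruction.
class ChunkedPool {
public:
    ChunkedPool(std::size_t chunkSize, std::size_t limit)
        : chunkSize_(chunkSize), limit_(limit) {
        // Keeps every returned pointer suitably aligned.
        assert(chunkSize % alignof(std::max_align_t) == 0);
    }
    ~ChunkedPool() {
        for (void* chunk : chunks_) std::free(chunk);
    }
    ChunkedPool(const ChunkedPool&) = delete;
    ChunkedPool& operator=(const ChunkedPool&) = delete;

    // Returns nullptr if the request is empty, larger than one chunk,
    // would exceed the limit, or the underlying malloc fails.
    void* allocate(std::size_t size) {
        size = alignUp(size);
        if (size == 0 || size > chunkSize_) return nullptr;
        if (chunks_.empty() || used_ + size > chunkSize_) {
            if (reserved() + chunkSize_ > limit_) return nullptr;
            void* chunk = std::malloc(chunkSize_);
            if (chunk == nullptr) return nullptr;
            chunks_.push_back(chunk);
            used_ = 0;  // start filling the new chunk
        }
        void* result = static_cast<char*>(chunks_.back()) + used_;
        used_ += size;
        return result;
    }

    // Total bytes currently taken from the heap.
    std::size_t reserved() const { return chunks_.size() * chunkSize_; }

private:
    static std::size_t alignUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) / a * a;
    }

    std::size_t chunkSize_;
    std::size_t limit_;
    std::size_t used_ = 0;  // bytes used in the most recent chunk
    std::vector<void*> chunks_;
};
```

With a chunk size of 64 and a limit of 128, the pool takes a second chunk only when the first cannot hold the next request, and refuses a third. The trade-off is that the tail of each chunk may be wasted when a request does not fit.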

nomis avatar Aug 18 '19 08:08 nomis

Hi @nomis,

Thank you for this question. Even though I have already dismissed this kind of request, it's always good to question the fundamentals and see if we can make the library evolve.

always quite large and made in one request from the heap, which significantly fragments the heap

I strongly disagree. As I explained, constant allocations don't increase the fragmentation, regardless of the size.

there's a significant risk that once freed the space occupied by that allocation will be reused by smaller allocations

This sounds like memory fragmentation introduced by other parts of the code; ArduinoJson cannot be accountable for these. You have to fight fragmentation in all areas of your program.

contiguous block of that size will no longer be available from the heap.

That's the definition of heap fragmentation. If you cannot allocate a block of 5% of the total, then the fragmentation is above 95%, which is extremely high!

How could ArduinoJson allocate this memory anyway?

Best Regards, Benoit

bblanchon avatar Aug 19 '19 07:08 bblanchon

always quite large and made in one request from the heap, which significantly fragments the heap

I strongly disagree. As I explained, constant allocations don't increase the fragmentation, regardless of the size.

The measurement of heap fragmentation depends on the largest block that could be allocated. By making larger allocations than you need you increase the fragmentation rather than reduce it.
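Much of this disagreement turns on how fragmentation is measured. As a rough sketch, assuming the common definition "1 minus largest free block over total free memory" (neither poster states a formula, and a given platform may compute it differently), the arithmetic looks like this. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Fragmentation as a percentage: 0 when all free memory is one contiguous
// block, approaching 100 as the largest free block shrinks relative to the
// total free memory. Assumed definition, for illustration only.
double fragmentationPercent(std::size_t totalFree, std::size_t largestFreeBlock) {
    assert(largestFreeBlock <= totalFree);
    if (totalFree == 0) return 0.0;  // nothing free, so nothing to fragment
    return 100.0 * (1.0 - static_cast<double>(largestFreeBlock) /
                              static_cast<double>(totalFree));
}

// A single contiguous request succeeds only if it fits the largest free block,
// no matter how much memory is free in total.
bool canAllocateContiguous(std::size_t largestFreeBlock, std::size_t requestSize) {
    return requestSize <= largestFreeBlock;
}
```

Under this definition, 40000 free bytes with a largest block of 2000 bytes gives roughly 95% fragmentation, and a 4096-byte contiguous request fails even though ten times that amount is free.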

there's a significant risk that once freed the space occupied by that allocation will be reused by smaller allocations

This sounds like memory fragmentation introduced by other parts of the code; ArduinoJson cannot be accountable for these. You have to fight fragmentation in all areas of your program.

ArduinoJson would be accountable because it needs a contiguous 5% block of the heap to work. It is inevitable that the data from the ArduinoJson document needs further allocation to copy it elsewhere. That then leaves a 5% sized gap in the heap that other processes will then use. Repeat this 20 times and there is no longer enough space for a contiguous 5% block.

contiguous block of that size will no longer be available from the heap.

That's the definition of heap fragmentation. If you cannot allocate a block of 5% of the total, then the fragmentation is above 95%, which is extremely high!

How could ArduinoJson allocate this memory anyway?

In smaller chunks as and when it needed it. Then it would fit in the available memory regardless of how fragmented the heap is.

(If the source buffer is seekable you don't need to store the entire document in memory to be able to read it, but that would use a lot more CPU time.)

nomis avatar Aug 19 '19 12:08 nomis

By making larger allocations than you need you increase the fragmentation rather than reduce it.

Again, I strongly disagree. However, the critical point is to make fixed allocations, which are necessarily large enough for your biggest use case.

ArduinoJson would be accountable because it needs a contiguous 5% block of the heap to work

This is not an unreasonable requirement. Moreover, if ArduinoJson were to make hundreds of small allocations like other libraries, it would:

  1. require way more than 5% for large inputs
  2. dramatically fragment the memory
  3. be ridiculously slow

That then leaves a 5% sized gap in the heap that other processes will then use.

If you follow best practices in all places, this gap will still be available after a million iterations. However, if you cannot change the other parts, you can keep the DynamicJsonDocument alive to reserve the memory.

(BTW, is there really more than one "process" running on your microcontroller ?)

In smaller chunks as and when it needed it.

I get your point: instead of one large buffer, it could allocate several medium-sized buffers, like ArduinoJson 5 did. Removing this feature allowed performance and memory usage optimizations (mainly by removing all virtual functions and storing offsets between nodes). Maybe I'll go this way for ArduinoJson 7, but for now, small microcontrollers like AVRs need these optimizations, so I cannot remove them.

bblanchon avatar Aug 20 '19 09:08 bblanchon

By making larger allocations than you need you increase the fragmentation rather than reduce it.

Again, I strongly disagree. However, the critical point is to make fixed allocations, which are necessarily large enough for your biggest use case.

The ESP8266 has 80KB of RAM. No one process running on the device can afford to make large fixed size allocations.

ArduinoJson would be accountable because it needs a contiguous 5% block of the heap to work

This is not an unreasonable requirement. Moreover, if ArduinoJson were to make hundreds of small allocations like other libraries, it would:

1. require **way more** than 5% for large inputs
2. dramatically fragment the memory

I didn't suggest it had to be the default option.

3. be ridiculously slow

I disagree. Is your memory allocator significantly faster than the one provided by the platform?

That then leaves a 5% sized gap in the heap that other processes will then use.

If you follow best practices in all places, this gap will still be available after a million iterations.

I have no control over what the platform is doing.

However, if you cannot change the other parts, you can keep the DynamicJsonDocument alive to reserve the memory.

(BTW, is there really more than one "process" running on your microcontroller ?)

Yes, the ESP8266 has multiple tasks running that I'm not in control of. I'm also not in control of memory allocation... every time I want memory for long term allocation it could come from one of the 5% gaps.

In smaller chunks as and when it needed it.

I get your point: instead of one large buffer, it could allocate several medium-sized buffers, like ArduinoJson 5 did. Removing this feature allowed performance and memory usage optimizations (mainly by removing all virtual functions and storing offsets between nodes). Maybe I'll go this way for ArduinoJson 7, but for now, small microcontrollers like AVRs need these optimizations, so I cannot remove them.

What I really need is probably an event driven parser, so that minimum memory is used to populate my config object.

nomis avatar Aug 20 '19 11:08 nomis

I hope I am not hijacking this issue, by sharing a similar experience. My code was running fine on ArduinoJson 5.11.x (with some minor code changes, e.g. added bufferedprint to reduce fragmentation, clear(new size) function). Recently I decided to upgrade to v6, but discovered it now runs out of memory (ESP8266), probably related to heap fragmentation.

Comparable to @nomis, the code serves client requests and reads JSON files from SPIFFS. The objects stay in memory and are changed until the client disconnects.
All is implemented through a templated class.

In V6, we need to guess the size of the DynamicJsonDocument to allocate. Most (larger) JSON docs require less than the file size, but some (mostly smaller) ones require ~3x the size. Initially I used the dumb approach and allocated 4x the file size, but this rapidly creates a fragmentation problem because at some point no contiguous space can be found for larger files. Then I implemented a loop that slowly increases the size of the DynamicJsonDocument until deserialization succeeds, but ultimately this also leads to a fragmentation problem. Deep-copying to a new DynamicJsonDocument of the exact size makes it run a whole day longer.

Finally, I settled on a combination: DynamicJsonDocument(filesize) for files >3K and, for files that fail to deserialize, a big switch (or cascaded if) with locally defined StaticJsonDocuments of growing size until the parsing succeeds. After that, I create a DynamicJsonDocument of exactly fitting size, which is then stored inside a class instance. Since StaticJsonDocument is allocated on the stack, it does not create fragmentation, and this code has not crashed. Since StaticJsonDocument is only used on smaller files, it does not hit the 4K stack limit on the ESP8266.
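The ladder described above can be sketched without the library. This is an illustration under assumptions, not the poster's code: `ParseFn` is a hypothetical stand-in for "deserialize into a buffer of this capacity and report the bytes used", the scratch array plays the role of a StaticJsonDocument, and the `std::vector` plays the role of the exact-size DynamicJsonDocument.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical parser hook: fills `buffer` (at most `capacity` bytes), stores
// the number of bytes actually used in `used`, and returns false if the
// capacity was too small.
using ParseFn = std::function<bool(char* buffer, std::size_t capacity, std::size_t& used)>;

// One rung of the ladder: parse into an N-byte stack buffer, then copy the
// result into a heap buffer sized to what was actually used.
template <std::size_t N>
bool parseAndShrink(const ParseFn& parse, std::vector<char>& out) {
    char scratch[N];  // on the stack, released when this function returns
    std::size_t used = 0;
    if (!parse(scratch, N, used)) return false;
    assert(used <= N);
    out.assign(scratch, scratch + used);  // single heap allocation for the result
    return true;
}

// Tries growing stack buffers until one is large enough. The largest rung
// stays well below a 4K stack limit.
bool loadSmallDocument(const ParseFn& parse, std::vector<char>& out) {
    return parseAndShrink<256>(parse, out) ||
           parseAndShrink<512>(parse, out) ||
           parseAndShrink<1024>(parse, out) ||
           parseAndShrink<2048>(parse, out);
}
```

The heap is touched once per document, with a request matching the final size, so the failed attempts leave no holes behind. The cost is that the input is parsed again on every rung that turns out to be too small.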

Compared to V5, the code now looks a bit messier due to the sizing process, but the more delicate part is probably that these fragmentation failures occur only after running for several days, so they are not easy to detect in testing. On the other hand, I don't have to keep the JSON file in memory, even though I abstracted that away in the class.

Maybe there is a better way to do this, as I am still very new to v6 and learning its possibilities. But if not, I can understand the request for a dynamically growing DynamicJsonDocument, e.g. two linked lists of memory blocks: one for the strings, with an initial size (almost) equal to the JSON string, and one for the slots (e.g. 256 bytes that can turn into an array of pointers to blocks).

Of course, I still have the luxury of going back to V5, which is still a superb library that has never failed once I understood the challenges of embedding it inside a class and the need to keep the source JSON string in memory together with the DynamicJsonBuffer. Neither speed nor fragmentation has ever been an issue on ESP, even though I don't fully understand why fragmentation never caused problems. Maybe it just survived one week, as the code reboots the ESP every Sunday.

ewaldc avatar Aug 29 '19 18:08 ewaldc

I've just upgraded from 5 to 6 – the biggest disappointment is being forced to anticipate the size when using a DYNAMIC document.

Arduino's String is actually dynamic – its size grows automatically with data (characters/strings) I add to it. If one is worried about heap fragmentation, they can use reserve(). That's what I'd expect from a dynamic feature.
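The grow-or-reserve behavior described above can be shown with a small sketch. It uses standard C++ `std::string` as a stand-in for Arduino's String so that it compiles off-device; that substitution is an assumption, and `appendKeepsBuffer` and `example` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `count` copies of `ch` and reports whether the string kept the same
// underlying buffer throughout, i.e. no reallocation happened along the way.
bool appendKeepsBuffer(std::string& s, std::size_t count, char ch) {
    const char* before = s.data();
    for (std::size_t i = 0; i < count; ++i) s.push_back(ch);
    return s.data() == before;
}

void example() {
    std::string config;
    config.reserve(256);  // one up-front allocation, like reserve() on String
    bool stable = appendKeepsBuffer(config, 200, 'x');
    assert(stable);       // stayed within the reserved capacity
}
```

Without the `reserve` call the string still grows automatically, but it moves to a new, larger buffer as it fills up, and that repeated reallocation is the pattern that worries people about fragmentation.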

I know the argument: microcontroller applications are not really "generic" -> you can anticipate all the use cases most of the time.

BUT I DON'T WANNA XD

The situation is different between an Arduino Mega and ESP chips: with so much RAM on the latter, I don't really care. And when I reach the point where I need to start caring, I can always apply optimization techniques, including hardcoding the size of the JSON document.

MacDada avatar Mar 28 '23 14:03 MacDada

This feature is available in branch 7.x. StaticJsonDocument and DynamicJsonDocument were replaced by a single JsonDocument class with elastic capacity.

bblanchon avatar Jul 18 '23 10:07 bblanchon