atmel-software-package
atmel-software-package copied to clipboard
l2cache_clean/invalidate_region corrupts memory
There is an error in drivers/mm/l2cache_l2cc.c where the invalidation and cleaning of region overshoots by one cache line.
The region to be cleared is from start up to - but excluding - end.
Furthermore, looking at e.g. l2_cache_invalidate_region there is an extra call to l2cc_invalidate_pal(end) after invalidating the region. I suspect that this call should be replaced by l2cc_cache_sync().
Fix in attached file fix-l2-cache.patch.
Here is a way to detect the bug when it happens: provoke-bug-in-qspi.patch
The bug doesn't always happen. First of all the qspi_memcpy must use DMA, which only happens if both the data and the length is an even number of cache lines. And secondly it depends on whether the L2 cache is flushed to physical memory before we perform the DMA transfer.
Here's an example visualizing how, why and when this bug occurs.
I use spi_flash_parse_bfpt() as an example because this is where I experienced the bug, but it may be present whenever we use DMA.
Preconditions
Imagine having filled the stack with 0xcb and all caches are updated.
Physical memory
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| 0xcb * 32 | 0xcb * 32 |
Cache
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| 0xcb * 32 | 0xcb * 32 |
Read from spi-nor flash using DMA
When calling spi_flash_parse_bfpt() in drivers/nvm/spi-nor/sfdp.c, upon entry a number of registers will be pushed onto the stack. Amongst other registers, the LR address will be pushed onto the stack. They will end up in the L2 cache at first - not in physical memory - unless the cache needs to be pushed down.
Let us assume the stackpointer is located so that the saved variables ends up somewhere in Address after rx_buffer, and the struct sfdp_bfpt bfpt ends up on the 32 bytes aligned address that is Address of rx_buffer.
In the beginning of the function our bfpt is memset(&bfpt, 0, sizeof(bfpt)).
Since nothing needs to be flushed to physical memory, this is how the memory looks right now:
Physical memory
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| 0xcb * 32 | 0xcb * 32 |
Cache
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| 0x00 * 32 | Saved registers (..., LR) |
Performing qspi_memcpy
A bit later we will read into bfpt using DMA (provided its size is a multiple of 32 bytes and it's address is 32 bytes aligned).
The memory and cache will look like this:
Physical memory
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| data from spi-flash | 0xcb * 32 |
Cache
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| 0x00 * 32 | Saved registers (..., LR) |
If we now call l2_cache_invalidate_region (the bad version which overshoots by 32 bytes) it will clear the contents of the cache and force a re-read from physical memory. This overwrites the saved registers - and specifically will fill the saved LR register with 0xcbcbcbcb.
Physical memory
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| data from spi-flash | 0xcb * 32 |
Cache
| Address of rx_buffer | Address after rx_buffer |
|---|---|
| data from spi-flash | 0xcb * 32 |