DAPLink
DAPLink copied to clipboard
swd_read_block() / swd_write_block() misalignment
Hello,
the two functions swd_read_block() and swd_write_block() will stumble into misalignment, if their callers swd_read_memory() and swd_write_memory() are called with differently aligned src/dst, e.g. "address"=0x20001ef1 (target), "data"=0x20013ee0.
Important part are the last digits. If the addresses have a different alignment, swd_read_block()... crashes, because the function actually expects "data" to be 32bit aligned.
Fix is easy: first read the 32bit value from the target into an aligned value and then do a memcpy() from this value to the actual destination.
If performance is an issue, the special case "misaligned" has to be handled extra.
If I should provide a PR, let me know.
It seems swd_read_memory()
and swd_write_memory()
would only work if address
and data
have the same alignment.
I think there are two options. One would be to modify swd_read_block()
and swd_write_block()
to do two different for-loops (+ last word in read case) depending on whether data
is aligned or not. Considering this issue has been there for a while, it does not seem like unaligned reads/writes are common. A second approach would be to modify swd_read_memory()
/ swd_write_memory()
by removing the first block that reads/writes until word aligned, only do the word-aligned block reads/writes if both address are aligned. The consequence would essentially be that if address
or data
are not word-aligned, reads or writes will be done one byte at a time.
I am curious to know in which situation you have encountered this issue.
My use case is reading the RTT terminal channel from the target: on target side the data is written in characters into a FIFO, the destination on the host is a 64 (or so) byte buffer, meaning data on the host is aligned to 32bit. The same will happen on RTT writes.
I guess that most of the "standard" use cases are using similar data structures on both sides so different alignment between host and target does not happen.
You can check my proposed solution which implements your first option here: https://github.com/rgrr/picoprobe/commit/81f73df0c3b3dda1a864832b2cdddc40d18d8c2f#diff-32917a6532b0d4105496fe49a58d9c8d247b6f89f4b7330c86a1c87a3c81b74d. In the unaligned case, it reads the data first into an aligned buffer (the 32bit word) and then does a memcpy(). Actually one can even drop the "aligned case", because I don't think that the additional memcpy() really harms throughput. Concerning your suggested second option: please not, because this would really kill throughput, because the bottleneck is obviously the SW interface.