skiboot
skiboot copied to clipboard
OPAL_FLASH_READ can take 1003ms
Spotted on a simple boot-to-OS test on Habanero, a OPAL_FLASH_READ call took 1003ms to complete.
My guess is this is opal-prd
starting up and doing a bunch of flash things.
Moving this to -> https://github.ibm.com/open-power/opal-project-mgmt/issues/61
I think this one can be closed and we can track under the opal-project-mgmt issue.
Deb McLemore [email protected] writes:
Moving this to -> https://github.ibm.com/open-power/opal-project-mgmt/issues/61
I think this one can be closed and we can track under #61.
Tracking here is useful as it's public while our internal project management issues aren't.
-- Stewart Smith OPAL Architect, IBM.
Tracking here
Ok, please assign to me (it would not allow me).
Subject: libflash API: Add 'Prepare' call to allow asynchronous behaviour
We need to be able to perform asynchronous flash operations across the various backends.
Suggestions to look at: 1 - https://github.com/stewart-ibm/skiboot/commits/async-flash 2 - posix_fadvise(POSIX_FADV_WILLNEED)
My plan to approach this task:
1 - Learn the existing libflash and related code 2 - Understand the limitations of each backend 3 - Understand the async-flash patch for applicability 4 - Design/Code a prototype 5 - Deliver patch for review and comment
@debmc any updates on this?
updates
Working async tree -> https://github.com/debmc/skiboot/tree/async
Prototype testing is looking good, need to circle back and review with current prototype implementation if the goals are achieved across the backends.
hiomap: Enable Async IPMI messaging
To provide higher layer async operations to access
the target flash, enable hiomap to perform async
ipmi messaging (post boot).
Special considerations and constraints are to prevent
recursive locking and/or polling inside OPAL calls.
Some summary op-test Unit of Work Comparisons
Good Operations Faster
Bad Operations (test only) Penalty on waiting for timeout
START runTestReadEraseWriteNVRAM
STOP runTestReadEraseWriteNVRAM
8.3 secs SYNC
6.6 secs ASYNC
START runTestReadWritePAYLOAD
STOP runTestReadWritePAYLOAD
1.87 secs SYNC
- differences are timeouts on bad ops
2.75 secs ASYNC
START runTestWriteTOC
STOP runTestWriteTOC
1.55 secs SYNC
- differences are timeouts on bad ops
2.52 secs ASYNC
Refreshed the async tree -> https://github.com/debmc/skiboot/tree/async
Tests look good, may be worth some review if anyone has some time.
Outstanding work is to circle back on error path testing.
I had a quick look. There's a bunch of formatting and logging changes in the diff which bloats out the patch, I suggest separating this out as much as possible. Additionally a sizeable chunk of callback handling has been implemented alongside what already exists, and then there are various conditional blocks to ensure one path or another is taken. Maybe I just need to take the time to read the patch properly, but I guess I was missing the "why" on first skim. I think putting a summary of the approach in the commit message would be useful.
I was also a little concerned by the addition of "moving" to the window states - the window state is meant to reflect those described by the protocol, whereas this moving state is an implementation detail, so my gut instinct is we need to think about this a bit more. A similar concern was the introduction of state to track sequence ids of specific in-flight commands; the protocol only supports one active window and all operations against that window should to be serialised, so there should not currently be a case where multiple messages are in-flight. I suspect that needs more thought as well, or a convincing story to be written in comments or the commit message.
Refreshed the async tree -> https://github.com/debmc/skiboot/tree/async
Tests look good, may be worth some review if anyone has some time.
Can you please post patches to mailing list? It becomes easy to review in mailing list.
-Vasant
the protocol only supports one active window and all operations against that window should to be serialised, so there should not currently be a case where multiple messages are in-flight
I'll add some comments to the code as suggested, some overview here:
The tracking of sequence id's is for debug of ipmi messages in OPAL, so when tracing is turned on with log-level-memory and the BT Queue we can identify any queue handling or timing issues to map hiomap ipmi messages to other OPAL ipmi non-sync messages, etc. All operations on the window are serialized.
The "moving" is an internal state, however its needed to help control the state transition since we are now waiting on the move ('ll add some comments to the code as well).
The callback handling is needing to be kicked sometimes by sync messages (during boot) and post boot by async messages, therefore we need to conditionally identify when each method type would be called at the specific point in time.
Refreshed the async tree -> https://github.com/debmc/skiboot/tree/async
Still need to do some more testing and regression testing.
Sent patch series to skiboot mailing list:
[Skiboot] [PATCH 00/12] ipmi-hiomap: Enablement for Async opal_flash_op's https://lists.ozlabs.org/pipermail/skiboot/2019-October/015624.html