edk2
edk2 copied to clipboard
MdePkg/SecurityPkg: Add DEBUG_TPM Bit
Description
This PR consists of four commits:
MdePkg: Add DEBUG_TPM Bit in PcdDebugPrintErrorLevel
Tcg2Dxe and its libraries are currently the noisiest modules in edk2. For a sample platform printing at INFO level, Tcg2Dxe printed 4,000 lines out of 5,700 total lines printed.
This commit defines a DEBUG_TPM bit to control the debug output of Tcg2Dxe and related TCG/TPM components. Most of the output is not useful except for deep debugging of TPM transactions, so it is appropriate to only print when the DEBUG_TPM bit is present.
SecurityPkg: Remove/Downgrade Noisy TCG Prints
The TCG code is very noisy when a TPM is connected. This commit downgrades some prints to verbose and removes some others that do not have value (such as function enter and exit prints).
SecurityPkg: Move Noisy Logs to DEBUG_TPM
The TPM code is currently very noisy (e.g. in a sample platform, 4,000 of the 5,700 lines printed to the serial port at DEBUG_INFO level were from the TPM code). For TPM debugging, this is very critical information, but for most builds it simply spams the logs and slows down the build.
This commit moves the event log and PCR dumping to log at DEBUG_TPM level.
SecurityPkg: Add Additional TPM Logging at DEBUG_TPM
This commit adds additional dumping logic to Tpm2DeviceLibDTpm, print at DEBUG_TPM to aid in TPM debugging.
- [ ] Breaking change?
- Breaking change - Does this PR cause a break in build or boot behavior?
- Examples: Does it add a new library class or move a module to a different repo.
- [ ] Impacts security?
- Security - Does this PR have a direct security impact?
- Examples: Crypto algorithm change or buffer overflow fix.
- [ ] Includes tests?
- Tests - Does this PR include any explicit test code?
- Examples: Unit tests or integration tests.
How This Was Tested
This has been shipping with platforms, tested with various physical and virtual platforms.
Integration Instructions
To get TPM related logs, set the DEBUG_TPM bit in PcdDebugPrintErrorLevel.
@jyao1 @rahul1-kumar it is the hard freeze now and this is not targeted for that, but can you still review so I can make any requested changes for when the freeze is lifted?
@mdkinney It is weird to add a dedicated debug lib for a dedicated component. Do we have any design guideline in EDKII?
Simply moving this to DEBUG_VERBOSE or defining a new error level bit does not give a platform enough granularity;
Why we cannot add a new debug level?
@jyao1 I think there may be some confusion with the name (which I can change if desired). This is not an implementation of the DebugLib library class, but rather a TPM specific library class that stores the many lines of code needed to dump all of the structures (>1000 lines of code).
The reasons this approach was taken as opposed to a DEBUG_TPM bit are:
- There are only so many debug levels to take, most of them are for the core or very large subsystems, but arguably the TPM code could fit there.
- As mentioned, there is more than 1,000 lines of code (and all the strings) needed to support dumping this, which is very different than other places that use debug level bits which have a simple 1 line per print, the TPM code has large complex structures that need dumping. By having one centralized place for doing this it enables the dumping code to be shared easily, have all that code/strings not be included in the majority of builds that don't care about debugging TPM operations, and have some place for that code to live.
There is a binary size difference with the verbose version of the lib included in Tcg2Dxe vs the null lib: ~1KB in size (even in a RELEASE build). Which is noise, but also this is dead weight the driver carries otherwise and there is other logic that may be included here in the future.
Verbose:
Null:
@jyao1 @rahul1-kumar can you please review now that the stable tag has been released?
If the concern is size, we should have better way to handle that.
Can you try to put the debug code to DEBUG_CODE() macro. (See https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Library/DebugLib.h#L590C9-L590C19) ?
If you clear DEBUG_PROPERTY_DEBUG_CODE_ENABLED in PcdDebugProperyMask, all those code should be excluded in final release build.
If the concern is size, we should have better way to handle that.
Can you try to put the debug code to DEBUG_CODE() macro. (See https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Library/DebugLib.h#L590C9-L590C19) ?
If you clear
DEBUG_PROPERTY_DEBUG_CODE_ENABLEDinPcdDebugProperyMask, all those code should be excluded in final release build.
Size is one consideration, but the bigger thing is that all of this code has to live somewhere and there is a lot of it. Tcg2Dxe has 2900 lines and the debug lib removes somewhere around 25% - 30% of those lines. It makes much more logical sense to move all of that cruft to a separate place where it can be maintained, extended, etc., without having to live in Tcg2Dxe. Then, this logic can also be shared by other drivers/libraries that want to use it without duplicating all of that (future investments to come). Moving this to a library keeps Tcg2Dxe and the libraries more focused and moves this functionality to a only when needed scenario, with debug code it still is present (and printing) for debug builds, meaning it is spamming all development builds (again, in a sample platform log it was 4,000 lines of logging out of 5,700 total log lines). That is not needed in most platforms, only when the TPM is actually being debugged.
@os-d, we need to understand the problem statement clearly, before create a solution.
but the bigger thing is that all of this code has to live somewhere and there is a lot of it. Tcg2Dxe has 2900 lines and the debug lib removes somewhere around 25% - 30% of those lines
If the concern is too many lines of code in this file, the option could be: move the code to a standalone file (e.g. Tpm2Debug.c) under same directory. You can still put DEBUG_CODE() around DumpTpmInputBlock(), DumpTpmOutputBlock(), DumpEventLog() etc, to reduce the size.
@jyao1 the problem statement is that Tcg2Dxe (and related libs) is the noisiest driver in the tree by orders of magnitude. There are thousands of lines of code supporting this noisiness.
The solution I am upstreaming was not written by me, but I have updated it to make the driver less noisy. It has been shipped by platforms for years and been useful in debugging. I have limited time to devote to changing this, so happy to make changes if there are legitimate concerns, but I would like to understand what your concerns with this approach are. It is an already used and written approach that solves the problem. There are many different ways of solving the same problem, all of them require some amount of work to accomplish. If that is the right investment to do that work, okay, let's do it, but if there is no specific concern with this approach, I propose we move forward with this. The current suggestions all accomplish the same thing (some have different drawbacks) but all feel like they are circling having a library for this.
@os-d, please allow me to share my concern with you about this approach.
My first concern is to add a new specific debug lib for a specific driver - that is weird because it does not follow the way we did in other driver. That is why I ask @mdkinney for the design guide line. I prefer we have a design discussion to determine what is the best way to handle this case at first. Basically, EDKII does have debug architecture, in debuglib. (And we do have similar case - network component, and we have DEBUG_NET). If that is the case, why we cannot use DEBUG_TPM? Why it cannot meet your requirement? Why you do not like my proposal? Why you have to create a new component specific debug lib? Is this guideline for all rest component, then when somebody design a new component, they must also add a component debug lib ? If that is the case, how can we use a centralized tool to manage the debug output? To me, this special new approach will drive lots of uncertainty on the EDKII debug architecture. IMHO, if there is fundamental problem in the debug architecture, then we should fix the problem altogether. If there is no fundamental problem, then we should use this debug architecture instead of creating the second way.
My second concern is the compatibility - with this change, all platforms need to change and add this new TPM2 debug lib, and need to choose debuglibnull, debuglibsimple, debuglibverbose. That is not usual, and requires the platform engineer having new knowledge. Back to first concern, if we do have fundamental problem on how we support debug, we should fix the debug architecture and propagate the knowledge to rest of EDKII developers. Handling one component in a special way is not the best design approach, IMHO.
@jyao1 I think there is still a confusion based on the name (which can be changed). Instead of calling this Tpm2DebugLib, think of this as Tpm2ExtraInformationLib. It is not an implementation of the DebugLib library class. If it were, I would agree that is the wrong approach.
Think of this more like a platform hook lib, where the TPM code provides hooks into its operations so that structures can be processed. This processing can result in debug prints, no prints, or if someone overrode the library, whatever they want (e.g. sending data to the BMC, etc.).
As I said, I didn't write this approach and honestly when I went to go fix the Tcg2Dxe noisiness, my first inclination was also to go with a DEBUG_TPM bit, but I was pointed out that our side has this Tpm2DebugLib that our partners have been using and finding use in, so I updated that to make Tcg2Dxe less noisy.
I don't have a strong preference in direction, other than we are discussing just making the noisiest driver less noisy, so I don't want to sink a bunch of time here. However, I fully recognize that our proposed solution may not be palatable and I am happy to change to something that is more palatable; in the end, my goal is to get Tcg2Dxe to not take up 4,000 lines of logging at info level.
I believe the decision choice went to this extra information lib for several reasons:
- Less impactful on the rest of the code base (there are only so many debug error level bits, those haven't changed in a long time, can we justify that TPM should get one of those bits but other large subsystems (PCI, filesystems, ACPI, etc) dont?). There are quite a few places in the codebase that may need to change to recognize a new bit (a search in the edk2 tree for DebugPrintErrorLevel returns 96 files).
- Flexibility, platforms can choose to have simple logging, no logging, very verbose, or whatever custom implementation they want
- Avoiding processing when nothing would be printed anyway, again there are 1000 lines of code to support printing this, in the case where we don't print, we still go through and parse all the structures, call into the debug infrastructure and then don't actually print anything, which wastes time
- Colocating all of the logic that exists only for parsing and dumping the structures, which does allow for code reuse (I have future upstreams for this to share the logic, but wanted to get the basis up first)
This is a different handling than most drivers take, but also Tcg2Dxe (and libs) print much, much more than any other driver. It is not so outside the norm on platform hook libs, things like https://github.com/tianocore/edk2/blob/master/MdePkg/Library/StackCheckFailureHookLibNull/StackCheckFailureHook.c are used for printing, the PeCoff extra action lib is just used for printing in some cases: https://github.com/tianocore/edk2/blob/master/ArmPkg/Library/DebugPeCoffExtraActionLib/DebugPeCoffExtraActionLib.c#L31.
DynamicTablesPkg has almost exactly the same paradigm as we are adding here: https://github.com/tianocore/edk2/blob/master/DynamicTablesPkg/Library/Common/TableHelperLib/ConfigurationManagerObjectParser.c, where a library is used to parse the large data structures and print them out.
So this is definitely not something brand new to edk2, it is just less commonly used, but I think this situation is exactly the target for it.
Please let me know your thoughts. If you still want to pursue a DEBUG_TPM bit, I can amend this PR to do so, but I hope I have addressed your concerns. Happy to chat further.
this as Tpm2ExtraInformationLib. It is not an implementation of the DebugLib library class
I think this is exactly the concern I have. This patch introduced a new component specific library class, just for dumping more information - no matter what name you give (Tpm2DebugLib or Tpm2ExtraInformationLib). If this patch just introduce a debug lib instance, then I feel it will be more acceptable, because it is still using existing debug architecture.
Less impactful on the rest of the code base (there are only so many debug error level bits, those haven't changed in a long time, can we justify that TPM should get one of those bits but other large subsystems (PCI, filesystems, ACPI, etc) dont?)
To be specific, we do have bit for filesystem - DEBUG_FS (see https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Library/DebugLib.h#L42) There is not bit for PCI or ACPI, at that time, because there is no special debug need to be controlled. When the debug arch is designed, we do not assign each component a bit automatically, but introduce it based on real need. You can find DEBUG_VARIABLE, DEBUG_BM, DEBUG_BLKIO, DEBUG_NET, DEBUG_UNDI, DEBUG_MANAGEABILITY, etc, these are components having dedicated bit. (Especially, DEBUG_MANAGEABILITY is added in EDKII as addition, and it does not exist in EDKI)
Flexibility, platforms can choose to have simple logging, no logging, very verbose, or whatever custom implementation they want
First, I think current debug arch can support "simple logging, no logging, very verbose".
EDKII provides PcdDebugPrintErrorLevel (see https://github.com/tianocore/edk2/blob/master/MdePkg/MdePkg.dec#L2410).
EDKII also have gEfiDebugMaskProtocolGuid (see https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Include/Guid/DebugMask.h), which allows you to control the logging output at runtime, besides at build time.
If you want to support "custom implementation they want", then it may be supported by ReportStatus architecture.
EDKII provides REPORT_STATUS_CODE_WITH_EXTENDED_DATA (https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Library/ReportStatusCodeLib.h#L428)
The simpler version REPORT_STATUS_CODE is already used in https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.c#L1278. It allows you register a platform hook to handle how to handle the error.
Avoiding processing when nothing would be printed anyway, again there are 1000 lines of code to support printing this, in the case where we don't print, we still go through and parse all the structures, call into the debug infrastructure and then don't actually print anything, which wastes time
I will let @mdkinney to comment that.
I am not sure how long it will take for a compiler to parse 1000 lines of code.
But if you really concerned about it and want to save time for parsing for release build, you may use MDEPKG_NDEBUG (https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Library/DebugLib.h#L7-L9)
Colocating all of the logic that exists only for parsing and dumping the structures, which does allow for code reuse
I agree the principle in general. But this code is very specific to TPM2. What is real use case to reuse code?
things like https://github.com/tianocore/edk2/blob/master/MdePkg/Library/StackCheckFailureHookLibNull/StackCheckFailureHook.c are used for printing, the PeCoff extra action lib is just used for printing in some cases: https://github.com/tianocore/edk2/blob/master/ArmPkg/Library/DebugPeCoffExtraActionLib/DebugPeCoffExtraActionLib.c#L31.
I don't think it is right assumption is that those hook is just used for printing. Printing is just one implementation choice. But a platform hook allows you do more than that. E.g. StackCheck fail may record the result to UEFI variable. PeCoffExtactAction may let you do more fix up or clean up. This is designed in day one. Not something added later just to dump something out.
DynamicTablesPkg has almost exactly the same paradigm as we are adding here: https://github.com/tianocore/edk2/blob/master/DynamicTablesPkg/Library/Common/TableHelperLib/ConfigurationManagerObjectParser.c, where a library is used to parse the large data structures and print them out.
I cannot comment DynamicTablesPkg. It is a new pkg I am not involved before. After a quick look, I think it also does more than dump, than main function is https://github.com/tianocore/edk2/blob/master/DynamicTablesPkg/Library/Common/TableHelperLib/TableHelper.c.
I am not yet convinced a dedicated lib to dump something is a clean architecture, based on the fact that we already have debug architecture and report_status architecture. If there is any feature request those architecture cannot satisfy, we should enhance the architecture to support broader use case.
@jyao1 I clearly have not convinced you :)
I don't want to spend more cycles here, so I will:
- Drop the library and add a DEBUG_TPM bit in PcdDebugErrorLevel
- Move all of the code added to the lib to log at DEBUG_TPM level and be wrapped in DEBUG_CODE()
I believe this aligns with what you want, please let me know if this is acceptable.
Drop the library and add a DEBUG_TPM bit in PcdDebugErrorLevel Move all of the code added to the lib to log at DEBUG_TPM level and be wrapped in DEBUG_CODE()
Yes. I think that is OK. If you want to move the debug code to a standalone file understand same dir, it is also fine. That will only change INF file in the package. No other platform change is required. If you want to use REPORT_STATUS_CODE_WITH_EXTENDED_DATA as a hook to dump more information, that is also fine.
⚠ WARNING: Cannot add some reviewers: A user specified as a reviewer for this PR is not a collaborator of the repository. Please add them as a collaborator to the repository so they can be requested in the future.
Non-collaborators requested:
- @osteffenrh
Attn Admins:
- @jljusten
- @ardbiesheuvel
- @lgao4
- @mdkinney
- @makubacki
- @jkmathews
- @miki-intel-work
Admin Instructions:
- Add the non-collaborators as collaborators to the appropriate team(s) listed in teams
- If they are no longer needed as reviewers, remove them from
Maintainers.txt
@jyao1 @rahul1-kumar I have updated this PR to move to DEBUG_TPM as discussed (and updated title/description), please re-review.
@mdkinney @lgao4 I have updated this PR to make changes to MdePkg to add the DEBUG_TPM bit as requested by @jyao1 . Please review.
@mdkinney @jyao1 @rahul1-kumar @lgao4 please review
@mdkinney @jyao1 @rahul1-kumar @lgao4 please review
The change in MdePkg is good to me.
@jyao1 @rahul1-kumar can you please review from the SecurityPkg side? MdePkg maintainers have approved the DEBUG_SECURITY bit.