tpm2-tss icon indicating copy to clipboard operation
tpm2-tss copied to clipboard

[RFC]: add a logging handler callback

Open williamcroberts opened this issue 3 years ago • 21 comments

[RFC]: add a logging handler callback

Just some code to guide a discussion on how to add a logging callback
routine to make applications and bindings uses the various apis easier
to integrate into different logging mechanisms. This would be nice for
the Python bindings.

Changed:
- Updated the API to call with various params over just a single string.
- Added SYS and ESYS routines.

TODO:
1. Add FAPI support
2. Test
3. Fiddle with names, style and locations in the code base

williamcroberts avatar Jun 07 '21 17:06 williamcroberts

@joholl @AndreasFuchsSIT something to guide discussion with

williamcroberts avatar Jun 07 '21 17:06 williamcroberts

This callback handler is too low-level, imo. It handles logging on string level, i.e. you cannot change the format. Also, on python level you would need to parse the logs again to extract file name, line number etc.

For reference, a python log record needs to know: module name, log level, file name, line number, function name and message (see class LogRecord).

Therefore, I'd suggest that the callback handler should look something like this. I guess all of those are null-terminated strings, including the message, so no size needed.

typedef void (*TSS2_LOG_HANDLER)(
            char      const *log_module,
            log_level        loglevel,
            char      const *file_name,
            uint32_t         line_no,
            char      const *function_name
            char      const *message);

In this case, we'd need to move the formatting from snprintf(NULL, 0, "%s:%s:%s:%d:%s() %s \n", log_strings[loglevel], module, file, line, func, msg); into the default log handler.

joholl avatar Jun 08 '21 08:06 joholl

Agreed with @joholl

The big problem I see here however is when you bind against two of the tss2 libraries at the same time. I.e. we cannot have a global function call, since linking will break this.

Options I see:

  1. Have a libtss2-log.so where the global loghandler logic is kept
  2. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for global state without a context parameter
  3. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for per context setting
  4. Allow a file-descriptor for log output via an environment variable
  5. other ideas ?

AndreasFuchsTPM avatar Jun 08 '21 08:06 AndreasFuchsTPM

@AndreasFuchsSIT Good point! Tbh, those are all somewhat ugly...

Btw, I already added support for logging to an fd (#1976). Additionally, I implemented a PoC on python side (tpm2-software/tpm2-pytss#164) but parsing logs on python level in a separate thread is really not the most elegant solution.

I have another suggestion: let's not add a callback mechanism but just a logging function as a weak symbol. I think we should be able to overwrite it, then.

joholl avatar Jun 08 '21 11:06 joholl

@joholl I was referring to a custom fd > stderr, e.g. fd = createPipe(); setenv("TSS2_LOGFILE", "fd:%i" % fd);

Could you explain the weak symbol thing and how to use it here to me ? Is it something like a dlsym(dlopen(NULL), "mysymbol"); ? And is it C99 compliant ?

AndreasFuchsTPM avatar Jun 08 '21 12:06 AndreasFuchsTPM

Agreed with @joholl

The big problem I see here however is when you bind against two of the tss2 libraries at the same time. I.e. we cannot have a global function call, since linking will break this.

Sorry I actually forgot to add some of the files (added now), but that I don't think is correct. Each library is statically linked to the logging code, which means each instance of the library has it's own global state. Ie, a call to changing to Sys will not modify Esys.

I did a test program just to confirm my understanding and we never see duplicate functions in the output, which I think confirms my understanding:

Code

#include <stdio.h>
#include <tss2/tss2_sys.h>
#include <tss2/tss2_esys.h>

void sys_handler(const char *msg, size_t size) {
	printf("SYS: %s\n", msg);
}

void esys_handler(const char *msg, size_t size) {
	printf("ESYS: %s\n", msg);
}

int main ()
{

    printf("Setting NEW Sys Handler: %p\n", sys_handler);
    TSS2_LOG_HANDLER old = Tss2_Sys_SetLogHandler(sys_handler);
    printf("SYS Old Handler: %p\n", old);
    
    printf("NEW ESys Handler: %p\n", esys_handler);
    old = Esys_SetLogHandler(esys_handler);
    printf("ESYS Old Handler: %p\n", old);

    return 0;
}

Building:

gcc -o go a.c `pkg-config --libs tss2-esys tss2-sys`

Output

./go
Setting NEW Sys Handler: 0x55f12dc3f189
SYS Old Handler: 0x60ef5c93
NEW ESys Handler: 0x55f12dc3f1b8
ESYS Old Handler: 0x7f9760f9a9e3

Options I see:

  1. Have a libtss2-log.so where the global loghandler logic is kept

That would then break it and the logic as we need a non-shared lib here IIUC.

  1. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for global state without a context parameter

Yep, I added in this update, at least for Esys and Sys so folks can play with it more.

  1. Have a Esys_SetLogHandler, Fapi_SetLogHandler, TctiLdr_SetLoghandler, etc call for per context setting

I would say no to this one. The global library logger would then get used for context creation logs, and then another one used after that. Seems confusing, and can be added later if a user requests it.

  1. Allow a file-descriptor for log output via an environment variable

fd's aren't portable, then we end up in windows land again.

  1. other ideas ?

I like exposing the the more raw logging callback, I just tried to keep it dead simple, but as @joholl mentions, puts us into grepping strings for level.

williamcroberts avatar Jun 08 '21 16:06 williamcroberts

This pull request introduces 1 alert when merging 8e8bd9cf470c8aedf243928ae91c57fd9b2f970b into 473f17def1fe650609263af48c0b49250cec6309 - view on LGTM.com

new alerts:

  • 1 for Implicit function declaration

lgtm-com[bot] avatar Jun 08 '21 16:06 lgtm-com[bot]

@joholl I was referring to a custom fd > stderr, e.g. fd = createPipe(); setenv("TSS2_LOGFILE", "fd:%i" % fd);

Could you explain the weak symbol thing and how to use it here to me ? Is it something like a dlsym(dlopen(NULL), "mysymbol"); ? And is it C99 compliant ?

A weak symbol allows the linker to override it at link time. Any library following the declared weak symbol will override it. Weak symbols can be construed as a security issues, however if your linking it, you should ensure its a legit library. But beacuse of this I generally don't use weak symbols for that reason and portability.

Weak symbols are compiler attribute, they are not covered in C99. The only linking visibility covered in C, AFAIK, is static and extern storage. They also require support of the linker and file format. I think Mac's DYN supports them, but I don't think windows PE does; obviously ELF does.

williamcroberts avatar Jun 08 '21 16:06 williamcroberts

@joholl I'd vote against weak symbols then, because we are C99 everywhere else and we want to stay compatible with all those old compilers for embedded.

@williamcroberts I'd favor the more complex callbacks as well

The thing about this option 2 is that if the tss is used from two different places (e.g. main application and a sub-library) they might be fighting about the loghandler setting. But I guess this case is negligible. So I like the current approach in principle.

The other thing is that we cannot set the loghandler for tcti modules at all using this approach if we go through the tctildr. I guess tctildr needs to forward it loghandler to the tcti modules upon loading them.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

Also Option 1 would allow a single setting of the logger instead of having to set it for each library; In case of pkcs11 that would mean setting the logger 4 times (for tcti(ldr), for mu, for esys and for fapi).

AndreasFuchsTPM avatar Jun 09 '21 08:06 AndreasFuchsTPM

I agree that weak symbols are not the ideal solution in terms of portability.

@williamcroberts

That [Option 1.] would then break it and the logic as we need a non-shared lib here IIUC.

Why do we need a non-shared lib, here? Similar to the mu lib, we could just link against libtss2-log.so, couldn't we? That would have the advantage of having a single logging handler and a single callback setter, as pointed out by @AndreasFuchsSIT.

@AndreasFuchsSIT With option 2.: since we also support linking against TCTI modules directly (as opposed to using TctiLdr), we'd also need a loghandler setter for the TCTI modules. As you said, the mu library (and maybe rc-decode?) needs logging, as well, and therefore their own loghandlers.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

No, I think if a TCTI module is dlopen-ed and depends on libtss2-logging.so, it loads its own instance of libtss2-logging.so, see the man pages:

If the library has dependencies on other shared libraries, then these are also automatically loaded by the dynamic linker using the same rules. (This process may occur recursively, if those libraries in turn have dependencies, and so on.) ... External references in the library are resolved using the libraries in that library's dependency list and any other libraries previously opened with the RTLD_GLOBAL flag. If the executable was linked with the flag "-rdynamic" (or, synonymously, "--export-dynamic"), then the global symbols in the executable will also be used to resolve references in a dynamically loaded library.

But maybe if we dlopen libtss2-logging.so, it could work? I think this needs to be experimented with.

joholl avatar Jun 09 '21 15:06 joholl

I agree that weak symbols are not the ideal solution in terms of portability.

@williamcroberts

That [Option 1.] would then break it and the logic as we need a non-shared lib here IIUC.

Why do we need a non-shared lib, here? Similar to the mu lib, we could just link against libtss2-log.so, couldn't we? That would have the advantage of having a single logging handler and a single callback setter, as pointed out by @AndreasFuchsSIT.

I see what he was saying now, but you would only have one logging call back for all libraries, or have to instrument that library for all libraries that want a dedicated logger.

@AndreasFuchsSIT With option 2.: since we also support linking against TCTI modules directly (as opposed to using TctiLdr), we'd also need a loghandler setter for the TCTI modules. As you said, the mu library (and maybe rc-decode?) needs logging, as well, and therefore their own loghandlers.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

Yes, the dynamic linker looks through the DT_NEEDED section (IIRC the section name correctly) and if not loaded, loads them.

No, I think if a TCTI module is dlopen-ed and depends on libtss2-logging.so, it loads its own instance of libtss2-logging.so, see the man pages:

If the library has dependencies on other shared libraries, then these are also automatically loaded by the dynamic linker using the same rules. (This process may occur recursively, if those libraries in turn have dependencies, and so on.) ... External references in the library are resolved using the libraries in that library's dependency list and any other libraries previously opened with the RTLD_GLOBAL flag. If the executable was linked with the flag "-rdynamic" (or, synonymously, "--export-dynamic"), then the global symbols in the executable will also be used to resolve references in a dynamically loaded library.

But maybe if we dlopen libtss2-logging.so, it could work? I think this needs to be experimented with.

Dealing with needeing multiple instances comes back to the trick you used to deal with libtpms in https://github.com/tpm2-software/tpm2-tss/pull/1992.

williamcroberts avatar Jun 09 '21 15:06 williamcroberts

@williamcroberts I'd favor the more complex callbacks as well

The thing about this option 2 is that if the tss is used from two different places (e.g. main application and a sub-library) they might be fighting about the loghandler setting. But I guess this case is negligible. So I like the current approach in principle.

This has never been an issue with libselinux, so I doubt it will be one in practice. If someone does need to work around this they can use the dlopen trick in https://github.com/tpm2-software/tpm2-tss/pull/1992

The other thing is that we cannot set the loghandler for tcti modules at all using this approach if we go through the tctildr. I guess tctildr needs to forward it loghandler to the tcti modules upon loading them.

Yes good point.

This might be a reason for Option 1 instead ? If the application already has libtss2-logging.so loaded, the loaded tcti modules would bind against that instead of loading a new one, right ?

I see what your saying now where we would just have one logger for all things, which is fine by me. I just assumed that we would want granularity here, but that's not a hard requirement and could be dropped. ~~I think we would want to ensure that liblog is not considered a public library.~~ disregard that statement.

Also Option 1 would allow a single setting of the logger instead of having to set it for each library; In case of pkcs11 that would mean setting the logger 4 times (for tcti(ldr), for mu, for esys and for fapi).

Yeah I'm starting to come around to option 1.

williamcroberts avatar Jun 09 '21 15:06 williamcroberts

But maybe if we dlopen libtss2-logging.so, it could work? I think this needs to be experimented with.

Generally its not wise to dlopen things because then explicit dependencies cannot be discovered by interrogating the libraries dependencies in the DT_NEEDED section. IIRC, folks from Gentoo often gripe about this, something to do with their package manager. So I try to avoid dlopen unless a specific need arises.

williamcroberts avatar Jun 09 '21 19:06 williamcroberts

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter. That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

AndreasFuchsTPM avatar Jun 15 '21 11:06 AndreasFuchsTPM

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter. That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

Yeah I thought about that as well, was unsure if we could tolerate a change like that. But it would be a separate API so all existing applications just behave the same and if anyone wants this behavior they can migrate.

williamcroberts avatar Jun 15 '21 13:06 williamcroberts

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter. That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

This seems not to work because a lot of the API that logs doesn't have a state associated with it, like libmu.

williamcroberts avatar Jul 07 '21 15:07 williamcroberts

A new idea from Peter: We could have a new Initialize method that takes the logHandler as a parameter. That should also prevent the problem with an application with 2 submodules (e.g. openssl engine and direct access) using the tss and each trying to claim the loghandler.

This seems not to work because a lot of the API that logs doesn't have a state associated with it, like libmu.

Agreed.

per-API global setters are fine with me.

P.S. Who shall override whom ? Function overrides Env-Parsing or other the other way ?

AndreasFuchsTPM avatar Aug 06 '21 08:08 AndreasFuchsTPM

I'd opt for env overwrites function because it is far easier to just change the env. var. than to change the code and recompile when debugging. The env mechanism should only be used for debugging, anyway, imo.

joholl avatar Aug 06 '21 14:08 joholl

The ENV var tweaks the level, and if the message is to be logged it goes to the callback registered by the application or to the default stderr stream.

williamcroberts avatar Aug 06 '21 18:08 williamcroberts

Sure. My expectation was that if we create a logging API with a setter for logging callbacks, we also add a call for setting the verbosity. I guess that was also what @AndreasFuchsSIT meant, right?

joholl avatar Aug 09 '21 09:08 joholl

Sure. My expectation was that if we create a logging API with a setter for logging callbacks, we also add a call for setting the verbosity. I guess that was also what @AndreasFuchsSIT meant, right?

Ahh " we also add a call for setting the verbosity", ahh ok. With the env var do we really need a set log level function? When would we use that over the env var? If we had something like a --debug command line option we could just add to env.

williamcroberts avatar Aug 09 '21 14:08 williamcroberts