tpm2-tss icon indicating copy to clipboard operation
tpm2-tss copied to clipboard

TCTI: Add generic platform independent SPI TCTI module

Open jpxd opened this issue 4 years ago • 31 comments

This PR adds a new SPI TCTI module for TPM communication over SPI e.g. in embedded systems like the ESP32.

The SPI TCTI uses user supplied methods for SPI and timing operations in order to be platform independent. These methods are supplied to Tss2_Tcti_Spi_Init via the TSS2_TCTI_SPI_PLATFORM struct.

Example code for using the SPI TCTI on an ESP32 can be found in doc/tcti-spi.md.

jpxd avatar Oct 01 '20 12:10 jpxd

Thanks for the feedback @AndreasFuchsSIT!

I will add more documentation as requested.

Where should I put the example code for the ESP32 platform funcs & wiring information? Should I force-push my combined changes as a single commit or should I add new commits to this PR?

jpxd avatar Oct 01 '20 14:10 jpxd

I'd just put some example code snippets into the documentation, nothing extra and no complete program needed.

Then you can just force-push.

AndreasFuchsTPM avatar Oct 01 '20 14:10 AndreasFuchsTPM

Awesome, that looks very good. I just noticed however that we could even go ahead and compile and unit-test this on regular x86 linux platforms, right ? I think we should do so. I guess adding the library as noinst_LIBRARIES to the makefile, guarded by a if TEST and then a unit-test that links against the library, so we know that it works in the future.

AndreasFuchsTPM avatar Oct 02 '20 10:10 AndreasFuchsTPM

Maybe in the future, we could make it a inst LIBRARY because a ftdi-usb-spi tcti would also be kind of cool... ;-)

AndreasFuchsTPM avatar Oct 02 '20 10:10 AndreasFuchsTPM

P.S. I'd add the makefile integration for the library in the current commit and the unit-test into an extra commit.

AndreasFuchsTPM avatar Oct 02 '20 11:10 AndreasFuchsTPM

Very cool proposal - i hope I can find time to review it soon.

PeterHuewe avatar Oct 06 '20 18:10 PeterHuewe

@AndreasFuchsSIT : Is it possible to run the device tests against this tcti ? This would be a nice way for testing

PeterHuewe avatar Oct 06 '20 18:10 PeterHuewe

There are some scan-build error in there (https://travis-ci.org/github/tpm2-software/tpm2-tss/jobs/733492007#L1697):

../../src/tss2-tcti/tcti-spi.c:215:5: warning: Undefined or garbage value returned to caller
    return status;
    ^~~~~~~~~~~~~
../../src/tss2-tcti/tcti-spi.c:233:5: warning: Undefined or garbage value returned to caller
    return access;
    ^~~~~~~~~~~~~
../../src/tss2-tcti/tcti-spi.c:565:21: warning: The left operand of '!=' is a garbage value
        if (did_vid != 0) break;
            ~~~~~~~ ^
../../src/tss2-tcti/tcti-spi.c:590:14: warning: Assigned value is garbage or undefined
    revision = rid;
             ^ ~~~

You can run make clean; scan-build make locally to get some more details.

AndreasFuchsTPM avatar Oct 13 '20 12:10 AndreasFuchsTPM

There are some scan-build error in there (https://travis-ci.org/github/tpm2-software/tpm2-tss/jobs/733492007#L1697):

@AndreasFuchsSIT These are fixed by now. Do you have any idea why the gcc-fedora builds are failing in the CI but not the clang-fedora builds?

jpxd avatar Oct 14 '20 21:10 jpxd

It's the addresssanitizer that's detecting something. No clue what it means though

ERROR: test/integration/fapi-quote-destructive
==============================================
Trying to start simulator swtpm
Starting simulator on port 62652
successfully started daemon: swtpm with PID: 31835
/workspace/tpm2-tss/build
simulator PID: 31835
LISTEN 0      1          127.0.0.1:62652      0.0.0.0:*    users:(("swtpm",pid=31835,fd=4))
LISTEN 0      1          127.0.0.1:62653      0.0.0.0:*    users:(("swtpm",pid=31835,fd=3))
Simulator with PID 31835 bound to port 62652 and  62653 successfully.
TPM20TEST_TCTI_NAME=swtpm
TPM20TEST_DEVICE_FILE=
TPM20TEST_SOCKET_ADDRESS=127.0.0.1
TPM20TEST_SOCKET_PORT=62652
TPM20TEST_TCTI=swtpm:host=127.0.0.1,port=62652
=================================================================
==2209==ERROR: AddressSanitizer: odr-violation (0x7fc910b8bd00):
  [1] size=40 'tss2_tcti_info' ../src/tss2-tcti/tcti-cmd.c:623:22
  [2] size=40 'tss2_tcti_info' ../src/tss2-tcti/tcti-spi.c:606:22
These globals were registered at these points:
  [1]:
    #0 0x7fc910f30cc8  (/lib64/libasan.so.6+0x37cc8)
    #1 0x7fc910b7bcae in _sub_I_00099_3 (/workspace/tpm2-tss/build/src/tss2-tcti/.libs/libtss2-tcti-cmd.so.0+0x1acae)
    #2 0x7fc9118e77e1 in call_init.part.0 (/lib64/ld-linux-x86-64.so.2+0x117e1)
  [2]:
    #0 0x7fc910f30cc8  (/lib64/libasan.so.6+0x37cc8)
    #1 0x7fc910b3cf90 in _sub_I_00099_3 (/workspace/tpm2-tss/build/src/tss2-tcti/.libs/libtss2-tcti-spi.so.0+0x18f90)
    #2 0x7fc9118e77e1 in call_init.part.0 (/lib64/ld-linux-x86-64.so.2+0x117e1)
==2209==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
SUMMARY: AddressSanitizer: odr-violation: global 'tss2_tcti_info' at ../src/tss2-tcti/tcti-cmd.c:623:22
==2209==ABORTING
TPM_StartUp failed
ERROR test/integration/fapi-quote-destructive.fint (exit status: 99)

AndreasFuchsTPM avatar Oct 15 '20 08:10 AndreasFuchsTPM

odr-violation - one definition rule - it complains that tcti-cmd and tcti-spi both define the same global tss2_tcti_info, (as do all other tcti's.) It might be the case (one would have to check) that for whatever reason tcti-spi pulls in tcti-cmd within the same compilation unit.

PeterHuewe avatar Oct 15 '20 08:10 PeterHuewe

the tss2_tcti_info should be static anyway.

PeterHuewe avatar Oct 15 '20 08:10 PeterHuewe

the tss2_tcti_info should be static anyway.

What do you mean with this @PeterHuewe ?

It might be the case (one would have to check) that for whatever reason tcti-spi pulls in tcti-cmd within the same compilation unit.

I could not find why this happens (and only on these two fedora builds?!)... Any hints?

jpxd avatar Oct 20 '20 12:10 jpxd

the tss2_tcti_info should be static anyway.

What do you mean with this @PeterHuewe ?

The variable can be declared as static because it is only ever used inside that same C-file.

It might be the case (one would have to check) that for whatever reason tcti-spi pulls in tcti-cmd within the same compilation unit.

I could not find why this happens (and only on these two fedora builds?!)... Any hints?

Not all builds run the ASAN analyzer or not even in the same version. That's why it fails on those two boxes.

AndreasFuchsTPM avatar Oct 20 '20 14:10 AndreasFuchsTPM

I wonder if this should handle being able to add in the TIS commands for communicating over SPI?

@iolivergithub any input?

williamcroberts avatar Oct 28 '20 18:10 williamcroberts

@jpxd Any progress on the unit test ?

AndreasFuchsTPM avatar Nov 18 '20 18:11 AndreasFuchsTPM

@AndreasFuchsSIT I already started writing unit tests for the SPI-TCTI using a fake-SPI interface with cmocka, but had to pause that because I currently have to finish my bachelor's thesis. After that I can continue the work on the unit tests.

A few weeks ago I sent @PeterHuewe some ESP32 example projects, so he can also test the SPI-TCTI on his side.

About the linker / address sanitizer problem in the CI: I just could not replicate the error on my local machine running the unit tests with the sanitizer enabled... Any help here is greatly appreciated. You said "the tss2_tcti_info should be static anyway", but I think changing the tss2_tcti_info to static in ALL TCTCIs is out of scope for this PR.

jpxd avatar Nov 23 '20 09:11 jpxd

If course, do your stuff. If you want, you can switch this PR to "draft" status and post your intermediate versions as well.

Regarding the static: I guess making your info file static is enough for this PR, I agree.

Regarding asan: best is to use the scripts from the .ci folder directly to test. Of course if you have an older version of asan (Fedora is pretty much latest usually), then there might be errors that you cannot detect locally. A docker container (as done by the CI scripts) helps here.

AndreasFuchsTPM avatar Nov 23 '20 12:11 AndreasFuchsTPM

Hmm, what if there is no way to hold the CS over multiple spi transfers ? (many spi masters have that limitation) and one does not want or cannot use a gpio. For certain tpms (e.g. ifx) it would be possible to ignore the waitstate handling as no waitstates are generated in normal circumstances (and thus avoid the limitiation of certain masters)

Would this be somehow possible (e.g. if the acquire function is NULL)?

PeterHuewe avatar Nov 28 '20 02:11 PeterHuewe

Hi, since we have a TSS2_TCTI_SPI_PLATFORM_FINALIZE finalize; does it make sense to also include a TSS2_TCTI_SPI_PLATFORM_INITIALIZE initialize function ? to include the posiblility for a one-time setup (a little bit like the initial part of your create_tcti_ctx function)

PeterHuewe avatar Nov 28 '20 23:11 PeterHuewe

@PeterHuewe

Hmm, what if there is no way to hold the CS over multiple spi transfers ? (many spi masters have that limitation) and one does not want or cannot use a gpio. For certain tpms (e.g. ifx) it would be possible to ignore the waitstate handling as no waitstates are generated in normal circumstances (and thus avoid the limitiation of certain masters)

Would this be somehow possible (e.g. if the acquire function is NULL)?

I see. The wolfTPM library by wolfSSL solves this by using only one SPI-callback function which is always responsible for whole transactions (see the wolfTPM example). In my SPI-TCTI the transaction / wait state logic is hidden from the implemented "dumb" IO-methods. But this way I need multiple calls to the transfer method without loosing the CS, which leads to the extra acquire and release functions.

Doing it the wolfTPM-way, the application needs to decide for themselves, if they need to implement wait state handing. This can be good, but is also added complexity.

If we used an approach with setting the acquire and release functions to NULL in some cases, what would the expected transaction logic be? Assume no waitstates at all? This would need to be documented very well.

jpxd avatar Feb 01 '21 14:02 jpxd

Codecov Report

Merging #1862 (a2c3275) into master (29578bb) will decrease coverage by 0.36%. The diff coverage is 28.80%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1862      +/-   ##
==========================================
- Coverage   83.80%   83.43%   -0.37%     
==========================================
  Files         343      344       +1     
  Lines       39657    40228     +571     
==========================================
+ Hits        33236    33566     +330     
- Misses       6421     6662     +241     
Impacted Files Coverage Δ
src/tss2-tcti/tcti-spi.c 0.00% <0.00%> (ø)
src/tss2-fapi/ifapi_policy_json_deserialize.c 83.18% <76.00%> (+1.89%) :arrow_up:
src/tss2-fapi/tpm_json_deserialize.c 81.94% <82.05%> (+0.42%) :arrow_up:
src/tss2-fapi/ifapi_json_deserialize.c 80.59% <89.36%> (+0.22%) :arrow_up:
src/tss2-fapi/ifapi_policy_execute.c 90.83% <0.00%> (-0.58%) :arrow_down:
src/tss2-fapi/api/Fapi_Provision.c 86.20% <0.00%> (-0.21%) :arrow_down:
src/tss2-fapi/ifapi_policy_callbacks.c 86.91% <0.00%> (-0.12%) :arrow_down:
src/tss2-fapi/fapi_crypto.c 76.47% <0.00%> (-0.07%) :arrow_down:
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 9ef80dd...a2c3275. Read the comment docs.

codecov[bot] avatar Feb 01 '21 15:02 codecov[bot]

@jpxd any updates on this one? I would really like to see this feature to be available.

PeterHuewe avatar May 27 '21 09:05 PeterHuewe

Hi, as far as I can see, there were three points still open:

  1. wait state handling
  2. unit tests
  3. ASAN error

To point 1 (wait states): Here I was still waiting for feedback on how to proceed. As written in my last post there are two or even three variants:

Variant 1 The TSS2_TCTI_SPI_PLATFORM struct remains as it is, but we allow to disable the wait-state handling by passing NULL for the spi_acquire and spi_release functions.

Variant 2a We follow wolfTPM's example: There would then be only one spi_transfer_transaction method in TSS2_TCTI_SPI_PLATFORM to be implemented by the user, which is called only once per register transaction. This method must take care of any wait states itself (e.g. with a GPIO-CS). In the examples I would include two sample codes with and without wait-state handling. The methods spi_acquire and spi_release are omitted from TSS2_TCTI_SPI_PLATFORM.

Variant 2b In addition to variant 2a, we could streamline the interface even further by omitting the user-implemented timing methods (analogous to wolfTPM) and removing TSS2_TCTI_SPI_PLATFORM completely. Timing is then controlled via fixed retry count and via optionally-definable sleep method (as in wolfTPM, see here and here). The Init method would then have the following signature without TSS2_TCTI_SPI_PLATFORM:

TSS2_RC Tss2_Tcti_Spi_Init (TSS2_TCTI_CONTEXT *tctiContext, size_t *size,
    TSS2_TCTI_SPI_PLATFORM_SPI_TRANSACTION_TRANSFER_FUNC spi_transaction_transfer, void *user_ctx);

I think variant 2a or 2b would be nicest. But in the end I don't care which of the variants it will be and therefore need feedback what is best / wanted here (@AndreasFuchsSIT @PeterHuewe). When this is decided, I can do the necessary changes.

To point 2 (unit tests): I had already started to write unit tests that fake the SPI interface with cmocka, but this has to wait for the final decision on the user implemented spy methods, which have to bee mocked (see point one). I'll try to continue working with the limited time I have for this and can commit my intermediate state to the PR. Would it be theoretically possible to accept the PR without unit tests and handle the unit tests in a separate PR @AndreasFuchsSIT ?

Regarding point 3 (ASAN error): Since I could not reproduce the ASAN problem from the Github-CI locally with Docker, I am dependent on assistance here. Could you maybe look into this @PeterHuewe ?

jpxd avatar Jun 01 '21 14:06 jpxd

With my VERY VERY basic understanding of the issue, I have a tendency for Variante 1. This serves the purpose of being a most flexible driver framework for the most use cases.

Maybe we could also allow the timeout-callbacks to be NULL and the implementation would then attempt a max-retries variant with e.g. 5 retries ? Or with timeout divided by 50ms or something.

AndreasFuchsTPM avatar Jun 07 '21 11:06 AndreasFuchsTPM

@PeterHuewe Any comments on the proposed variants? If not, I can start working on variant 1.

@AndreasFuchsSIT Yes, we can also allow disabling the timeout functions and fallback to a retry count.

jpxd avatar Jun 17 '21 11:06 jpxd

@jpxd Sorry I somehow missed that - variant 1 sounds fine.

PeterHuewe avatar Sep 10 '21 11:09 PeterHuewe

@jpxd Are you still planning to update this ? Anyone else who is / wants to be working on this ?

AndreasFuchsTPM avatar Jun 14 '22 10:06 AndreasFuchsTPM

Sorry for the reference spam, not a good idea to add PR reference in commit message. I'll pick up what @jpxd had left off, a new PR is created https://github.com/tpm2-software/tpm2-tss/pull/2397

wxleong avatar Jul 25 '22 08:07 wxleong

Hi all, unfortunately I currently don't have enough time to work on this project and get the PR over the finish line.

@wxleong Thank you a lot for continuing the work on this!!!

jpxd avatar Jul 26 '22 05:07 jpxd