secp256k1 icon indicating copy to clipboard operation
secp256k1 copied to clipboard

Add `contrib/symbol-check.py`

Open hebasto opened this issue 3 years ago • 4 comments

The new contrib/symbol-check.py aims to ensure that only expected symbols are exported in the secp256k1 shared library.

The script itself stems from Bitcoin Core's contrib/devtools/symbol-check.py and bitcoin/bitcoin#25020. It uses the LIEF library.

Useful to ensure that build system changes (including bitcoin-core/secp256k1#1113) will not introduce any unexpected regressions.

hebasto avatar Aug 22 '22 14:08 hebasto

I think checking that not too many symbols are exported could be interesting for us.

Checking libraries not so much, we anyway don't have dependencies, so all linkage should be introduced by the compiler. Or what do you think?

real-or-random avatar Aug 22 '22 17:08 real-or-random

Checking libraries not so much, we anyway don't have dependencies

Cleaned up, improved, undrafted.

hebasto avatar Aug 23 '22 13:08 hebasto

Now that I think about this, I wonder why that would be necessary. I've never seen any libraries doing this. I think in our case it could be helpful to prevent mistakes where we forget to make an internal function static and then it would be exported?

But on the other hand, it's yet another file that needs to be maintained. In particular the list of function must be kept in sync with the include files. This violates the DRY principle. Do you think the list could be generated on-the-fly, maybe just using some clever grep SECP256K1_API magic? A more elaborate version would be to use clang-query or similar, see #833, but I think this would be overkill for now.

Weak Concept ACK because it has potential to prevent mistakes where we forget static

real-or-random avatar Aug 23 '22 20:08 real-or-random

I wonder why that would be necessary.

Because quality matters :) For example:

However, generally it is not recommended to export all the symbols in modules. Programmers can just export symbols as needed. This does not only benefit library security, but also benefits dynamic linking time.

I've never seen any libraries doing this.

https://github.com/bitcoin/bitcoin/pull/25020#issuecomment-1142151675

I think in our case it could be helpful to prevent mistakes where we forget to make an internal function static and then it would be exported?

The secp256k1_pre_g is not static (in most cases). Nevertheless, its export must be avoided.

Other use cases:

  • it has helped to avoid regression in #1113
  • a sanity check of linker flags when building without any build system
  • when a build system being changed or modified
  • with future use of CMake, its new version could potentially introduce a regression into its linking code

maybe just using some clever grep SECP256K1_API magic?

:+1:

hebasto avatar Aug 23 '22 21:08 hebasto

maybe just using some clever grep SECP256K1_API magic?

That would be neat.

sipa avatar Nov 21 '22 21:11 sipa

Updated 8ae0d514b481a2532ed86e272d6da492bee6b78a -> 766964f023a0fad52f23df742f8d5802b3890e70 (pr1135.02 -> pr1135.03, diff).

maybe just using some clever grep SECP256K1_API magic?

That would be neat.

Done.

hebasto avatar Nov 30 '22 15:11 hebasto

This fails for me with

> ./contrib/symbol-check.py ./.libs/libsecp256k1.so
./.libs/libsecp256k1.so: symbol memcpy from unsupported version GLIBC_2.14(4)
./.libs/libsecp256k1.so: failed IMPORTED_SYMBOLS

real-or-random avatar Dec 02 '22 13:12 real-or-random

This fails for me with

What system/compiler on?

UPD. I assume, it is a clang...

hebasto avatar Dec 02 '22 14:12 hebasto

llibsecp256k1, unlike Bitcoin Core, has no affordances to make sure the produced builds are compatible with certain, specific, version of glibc on the user's system.

I think the max version checks should just be dropped here.

sipa avatar Dec 02 '22 14:12 sipa

Updated 766964f023a0fad52f23df742f8d5802b3890e70 -> 6a97f7296342d1840b360ac18a922f6351fe06a0 (pr1135.03 -> pr1135.04, diff).

I think the max version checks should just be dropped here.

Done.

hebasto avatar Dec 02 '22 14:12 hebasto

llibsecp256k1, unlike Bitcoin Core, has no affordances to make sure the produced builds are compatible with certain, specific, version of glibc on the user's system.

Ok yes, I was wondering what the purpose of the max version is... :)

real-or-random avatar Dec 02 '22 14:12 real-or-random

Bikeshedding: I believe contrib is the wrong directory for this, see for example this attempt at a definition: https://softwareengineering.stackexchange.com/questions/252053/whats-in-the-contrib-folder#339925.

I know, Bitcoin Core uses a different definition. My point is that the current contents of the folder fit the definition on stackexchange, so we shouldn't mix up the two.

I think util or tools is good a name. I wouldn't mind putting it in the root either.

real-or-random avatar Dec 02 '22 15:12 real-or-random

Updated 6a97f7296342d1840b360ac18a922f6351fe06a0 -> 6813bee4aacf5fee228ae8a0eaae5a5272099c03 (pr1135.04 -> pr1135.05, diff).

I think util or tools is good a name. I wouldn't mind putting it in the root either.

Moved into the tools directory.

hebasto avatar Dec 02 '22 15:12 hebasto

It's not clear to me that adding this script is worth it (especially as it continue to shrink). We seem to be adding a new language dependency, and third party library dependency, to run grep + objdump? I might have thought a bash script would be a better fit for libsecp256k1, especially since this is basically just a sanity check that will live in the CI (as the API rarely changes, and I don't think anyone will run this locally)?

Is there a reason this doesn't support macOS shared libs? LIEF supports them as far as I'm aware, and it seems odd to add a platform agnostic binary parser as a dependency, and then not support all libraries that are being built.

fanquake avatar Dec 02 '22 15:12 fanquake

It's not clear to me that adding this script is worth it (especially as it continue to shrink). We seem to be adding a new language dependency, and third party library dependency, to run grep + objdump? I might have thought a bash script would be a better fit for libsecp256k1, especially since this is basically just a sanity check that will live in the CI

I think @hebasto convinced me above that this tool has >0 benefit, but indeed, it introduces a maintenance burden and doing the same with objdump instead of Python+LIEF might reduce this burden (but I don't know -- for example I don't know if objdump exists on macOS).

Note that we already have python as a CI dependency for our sage scripts.

real-or-random avatar Dec 02 '22 15:12 real-or-random

(but I don't know -- for example I don't know if objdump exists on macOS).

It does, but at the moment, I guess that doesn't matter, given the script doesn't check macOS libs.

Would you expect an API issue to be present in one shared library, but not all others? We might be able to get away with just sanity checking an ELF lib, as I assume if there's an issue that you'd want to catch, present there, it'll be present everywhere.

fanquake avatar Dec 02 '22 15:12 fanquake

Is there a reason this doesn't support macOS shared libs?

Updated to support macOS libs. LIEF v0.13.0 (not released yet) is required.

Also: https://lief-project.github.io/blog/2022-05-08-macho/

hebasto avatar Dec 04 '22 09:12 hebasto

LIEF v0.13.0 (not released yet) is required.

Looks like you could do something like this to check that the expected exports are present? Works with LIEF 0.12.3 (latest release):

diff --git a/tools/symbol-check.py b/tools/symbol-check.py
index 8206b95..7237158 100755
--- a/tools/symbol-check.py
+++ b/tools/symbol-check.py
@@ -59,12 +59,10 @@ def check_MACHO_exported_functions(library, expected_exports) -> bool:
     ok: bool = True
     macho_lib: lief.MACHO.Binary = library.concrete
 
-    for function in macho_lib.exported_functions:
-        name: str = function.name[1:]
-        if name in expected_exports:
-            continue
-        print(f'{filename}: export of function {name} not expected')
-        ok = False
+    for name in expected_exports:
+        if not macho_lib.has_symbol(name):
+            print(f'{filename}: export of function {name} missing')
+            ok = False

fanquake avatar Dec 04 '22 15:12 fanquake

Updated 6e6aa4915eb5e984928bbb8b675cf55fd5e67879 -> cc7933d47d70e93ef578b01f7a3336cf0aeaaaa7 (pr1135.06 -> pr1135.07):

  • rebased
  • addressed @real-or-random's comment

hebasto avatar Dec 13 '22 15:12 hebasto

We've discussed this at length in the meeting today, see https://gnusha.org/secp256k1/2022-12-19.log for details.

Conclusion:

  • [ ] @hebasto will try to incorporate the suggestion from @fanquake to make this work on current LIEF functions
  • [ ] @hebasto will add commits that call the script in CI, and maybe add builds with -fvisibility=default

real-or-random avatar Dec 19 '22 16:12 real-or-random

  • @hebasto will try to incorporate the suggestion from @fanquake to make this work on current LIEF functions

LIEF v0.13.0 (not released yet) is required.

Looks like you could do something like this to check that the expected exports are present? Works with LIEF 0.12.3 (latest release):

diff --git a/tools/symbol-check.py b/tools/symbol-check.py
index 8206b95..7237158 100755
--- a/tools/symbol-check.py
+++ b/tools/symbol-check.py
@@ -59,12 +59,10 @@ def check_MACHO_exported_functions(library, expected_exports) -> bool:
     ok: bool = True
     macho_lib: lief.MACHO.Binary = library.concrete
 
-    for function in macho_lib.exported_functions:
-        name: str = function.name[1:]
-        if name in expected_exports:
-            continue
-        print(f'{filename}: export of function {name} not expected')
-        ok = False
+    for name in expected_exports:
+        if not macho_lib.has_symbol(name):
+            print(f'{filename}: export of function {name} missing')
+            ok = False

As the suggested approach changes the check logic, the script fails if any of modules are disabled. For example, the "recovery" module is disabled by default, and script ends with:

.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recoverable_signature_parse_compact missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recoverable_signature_convert missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recoverable_signature_serialize_compact missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_sign_recoverable missing
.libs/libsecp256k1.1.dylib: export of function secp256k1_ecdsa_recover missing
.libs/libsecp256k1.1.dylib: failed EXPORTED_FUNCTIONS

Making this PR a draft for now.

hebasto avatar Dec 20 '22 12:12 hebasto