
Improve support for macros

Open ndessart opened this issue 4 years ago • 6 comments

This change improves support for float and int literals in macros and, most importantly, adds support for function-like macros!

This is probably a breaking change.

Things seem to work pretty well (at least according to my expectations) with clang 11, but with clang 6 test_macro.py seems to enter an infinite recursion.

ndessart avatar Mar 15 '21 16:03 ndessart

I'm getting some infinite recursion on test_callback too

trolldbois avatar Mar 15 '21 22:03 trolldbois

Dynamic evaluation of function-like macros has a lot of implications, and I don't have enough hindsight to determine: a) what could reasonably be supported by ctypeslib, and b) what the actual use cases could be beyond bitwise operations and function parameter binding/reordering.

Initially, I just wanted to support macros that expand to constant literals (int and float). Currently something as simple as

    #include <stdint.h>
    #define FOO UINT64_MAX

fails, because the macro UINT64_MAX defined by glibc's stdint.h involves intermediate function-like macros.
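
For reference, glibc defines UINT64_MAX via a function-like helper, roughly `#define UINT64_MAX (__UINT64_C(18446744073709551615))` with `#define __UINT64_C(c) c ## UL` (paraphrased, exact definitions vary by platform); that intermediate step is what trips up a plain object-macro translation. The output that constant folding aims for is simply the folded value, something like:

    # Sketch only, not actual ctypeslib output: the whole macro chain behind
    # UINT64_MAX collapses into one Python literal.
    UINT64_MAX = 18446744073709551615  # 2**64 - 1
    FOO = UINT64_MAX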

Are you proposing to handle C functions next? Emulation of C code in Python?

No, my objective was to follow the principle of least surprise, so I'm trying my best to support macros that look "simple enough". I don't plan to open Pandora's box, nor do I plan to implement a C interpreter in Python.

My only real-world use case (so far) for dynamic evaluation of function-like macros is debugging ctypeslib macro expansion (while debugging ctypeslib itself or the Python bindings it generates). Other than that, I don't think my bindings would need anything more than working build-time constant folding of macros.

While implementing function-like macro support, I switched back and forth multiple times between partial macro expansion and full constant folding, until I realized that partial macro expansion (i.e. dynamic macros) is easier to debug.

I should probably add a command-line option to activate partial macro expansion; what do you think?

That being said, I have to admit I was also a bit curious to see what could reasonably be achieved with dynamic function-like macros. So far my answer would be: not very much beyond simple bit-mask/bitwise operations

            #define FOO(foo) (foo & 0x0FFFF)

and function parameter binding/reordering:

            #define FOO(...) ("foo", __VA_ARGS__, "bar")
            #define BAR(a, b, c) FOO(c, b, a)

(See test/test_macro_advanced.py)
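
To make the idea concrete, here is a rough sketch (an illustration of the concept, not the code this PR actually generates) of how the binding/reordering example could surface on the Python side:

    # Illustration only: the variadic binding/reordering macros expressed as
    # plain Python callables.
    def FOO(*args):
        # #define FOO(...) ("foo", __VA_ARGS__, "bar")
        return ("foo", *args, "bar")

    def BAR(a, b, c):
        # #define BAR(a, b, c) FOO(c, b, a)
        return FOO(c, b, a)

    assert BAR(1, 2, 3) == ("foo", 3, 2, 1, "bar")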

ndessart avatar Mar 16 '21 10:03 ndessart

Note: my head hurts just thinking about the ramifications.

So here is what I'm comfortable seeing / my guidelines for ctypeslib for now. I'm still thinking about what that means.

  1. Default behavior: generated code should be simple and easily readable, like a simple C-to-py translation of that file. One C input, one py module.

  2. Generated code should contain Python variables, variable values, and record type definitions that are compatible with C.

  3. (thanks to your previous pull request) Generated code should contain C function bindings that are easily usable

  4. Generated code should try to support easily translated values for object-like macros, just because they are easy enough and practical.

  5. Generated code should not try to translate/emulate all code, or try too hard to be a preprocessor.

  6. Generated code should not contain "too much" supporting code

  7. Generated code should only contain the definitions required by the C input (and not all the includes' definitions) (as the default behavior).

Point 1 potentially opens the door to a more complex framework, a bit like CFFI, where ctypeslib has to be present to execute the generated code? The stdint.h headers could get specialized treatment if one absolutely wants to use the macros from those headers in some Python code. Something like:

    # normal stuff ...
    generator = codegen.Generator(input_io)
    generator.generate(parser, items)

    # but we absolutely want a macro value
    from ctypeslib.utils import macro_advanced_support
    macro_namespace = macro_advanced_support('stdint.h')
    # or macro_namespace = macro_advanced_support('my_file.c')
    print(macro_namespace.UINT64_MAX)  # 18446744073709551615

So yeah, in short, I think I'm coming to like your code here, but let's take it in another direction.

  1. Make it super optional for default behavior with a CLI arg
  2. Make it pass all the tests? I will activate the pull request github actions

trolldbois avatar Mar 16 '21 22:03 trolldbois

  1. Make it super optional for default behavior with a CLI arg
  2. Make it pass all the tests? I will activate the pull request github actions

I've updated and reworked this pull request; things should be better now.

Bonus:

  • [x] I think I've fixed some unrelated bugs and performance issues

ndessart avatar Mar 24 '21 19:03 ndessart

I've just rebased my pull request and fixed a libclang version parsing issue.

  1. Make it super optional for default behavior with a CLI arg

I've added an "--advanced-macro" CLI flag that activates the generation of function-like macros. However, the default behavior of the -m flag has changed with this PR: ctypeslib will now try its best to constant-fold every function-like macro appearing in the definition of a to-be-generated (non-function-like) macro. In that case the generated output doesn't require codegen/preprocess.py, whereas with advanced macros activated, that file is inlined in the generated output.
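
To illustrate the difference (my own sketch of the two shapes of output, not the literal code emitted by the generator):

    # Default (constant folding): function-like helpers are folded away
    # and only plain values remain in the output.
    FOO = 18446744073709551615

    # With --advanced-macro: function-like macros stay callable at runtime,
    # backed by the supporting code inlined from codegen/preprocess.py.
    def MASK(foo):  # hypothetical name for a generated function-like macro
        return foo & 0x0FFFF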

  2. Make it pass all the tests? I will activate the pull request github actions

I know the test results look bad right now, but locally, with the latest changes, I pass every test with clang 11 / Python 3.9.1 in my Linux virtual environment. I don't know why the GitHub Actions get cancelled and/or fail because test-callbacks.so is not found... I'll try to investigate.

I'm open to any advice or pointers you may have on this PR or on how to fix the GitHub Actions.

I'd also be happy to answer any questions you may have on this PR, but first I think the "[DEV] improve performances by using some caching" commit deserves some explanation.

When I was debugging my code, I encountered performance issues and/or what appeared to be "infinite recursions", so I profiled ctypeslib and tried to cache/memoize the functions that were called the most and/or appeared to be libclang bottlenecks. After some trial and error, I ended up with the following cached functions (see ctypeslib/codegen/cache.py:_cache_functions):

    "ctypeslib.codegen.cindex.Cursor.get_tokens",
    "ctypeslib.codegen.cindex.SourceLocation.__contains__",
    "ctypeslib.codegen.cindex.Token.cursor",

Other functions of ctypeslib or libclang didn't seem to be worth caching (the codegen.cache.cached* decorators don't enable caching unless the function name is in the ctypeslib/codegen/cache.py:_cache_functions list).
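
As a rough sketch of that gating idea (simplified, with hypothetical helper names; the real cache.py may differ), the decorator memoizes a function only when its qualified name appears in the allowlist:

    import functools

    # Allowlist mirroring the three functions listed above.
    _cache_functions = {
        "ctypeslib.codegen.cindex.Cursor.get_tokens",
        "ctypeslib.codegen.cindex.SourceLocation.__contains__",
        "ctypeslib.codegen.cindex.Token.cursor",
    }

    def cached(func):
        # Only enable memoization for explicitly listed functions;
        # everything else is returned unchanged.
        qualname = f"{func.__module__}.{func.__qualname__}"
        if qualname not in _cache_functions:
            return func
        return functools.lru_cache(maxsize=None)(func)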

While I was implementing the cache feature, I ended up replacing every "print" call in codegenerator, so I decided to kill two birds with one stone and also replace the old-style % string formatting with (more performant) f-strings.
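
For reference, that formatting change is purely mechanical; something along these lines (illustrative, not an actual diff from the PR):

    import logging
    log = logging.getLogger(__name__)

    name, size = "FOO", 8
    # Before: old-style % formatting.
    log.debug("macro %s has size %d" % (name, size))
    # After: an f-string, which is faster and easier to read.
    log.debug(f"macro {name} has size {size}")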

Finally, in this cache/performance commit, I also reworked the ClangParser.all and ClangParser.all_set attributes because they caused a caching bug.

ndessart avatar Mar 29 '21 13:03 ndessart

thinking about it ...

trolldbois avatar Apr 07 '21 03:04 trolldbois