Update or replace ffigen4

Open xrme opened this issue 8 years ago • 80 comments

The interface databases that CCL uses are generated by a program called ffigen4. It is a set of patches to gcc-4.0.0 (see http://svn.clozure.com/publicsvn/ffigen4/)

These patches should be brought up-to-date. Alternatively, it might be an option to replace ffigen4 with some other tool. https://github.com/rpav/c2ffi might be suitable.

xrme avatar Feb 18 '17 18:02 xrme

What about just using CFFI? Or would that be insufficient?

eschaton avatar Feb 18 '17 21:02 eschaton

The interface databases make the #_ and #$ reader macros work. These reader macros are used extensively in the implementation of CCL itself.

I'm not a CFFI user, so I'm not really qualified to say whether it is nicer than CCL's native FFI, but I can say that I think that CCL's native FFI is a great feature.

xrme avatar Feb 18 '17 21:02 xrme

I know that CFFI is only a portability layer, like what bordeaux-threads does for CCL's multiprocessing. So it may not be appropriate to use CFFI here, since we don't need to use the interface database in other CL implementations, and Clozure's FFI also provides more functionality. I suggest updating ffigen, because writing a new backend for c2ffi may only be of interest for developing CCL itself; library authors usually use CFFI rather than platform-specific FFIs.

ailisp avatar Mar 11 '17 14:03 ailisp

I agree that CCL's native FFI is a great feature. Unfortunately, few projects build upon it; they build upon CFFI instead. So I made a little project, ccl-cffi, that exposes the same function interface as CFFI but implements it on CCL more efficiently. This way my programs run more efficiently, and I can still use other packages that depend on CFFI.

ghost avatar May 30 '17 11:05 ghost

I would like to work on this. Today I took a look at ffigen and CCL's FFI doc. It doesn't look good to always have to patch gcc to build ffigen. Using https://github.com/rpav/c2ffi is a good idea. Here I can either

  1. add a driver for c2ffi to generate the ffi format, or
  2. replace lib/parse-ffi.lisp with a new Lisp program that takes c2ffi's sexp output as input and outputs cdb.

@xrme Which way do you prefer? Approach 2 uses Lisp, which is of course more fun than C++ :) Thanks!

ailisp avatar Dec 27 '17 02:12 ailisp

You are welcome to work on this if you want to, but I worry that it is a rather big project.

I agree that we should try out https://github.com/rpav/c2ffi. If we can parse a simple header file with c2ffi and convince ourselves that the output matches (well, is isomorphic to) the current ffigen output, then that will give us some confidence that we can make it work.

I don't think we can completely replace parse-ffi.lisp. But I see no problem with writing (in Lisp or whatever) some program that will reformat c2ffi's output (either json or sexp) into the s-expression style ffigen format that parse-ffi.lisp knows how to process. If we find that c2ffi is working for us, we can consider writing a c2ffi driver in C++ at a later time.

My only reservation about c2ffi is that it uses an unstable (if not private) API to clang. There is a library called libclang. It provides a stable, C-based API. When I last looked at it, I didn't see how libclang dealt with C preprocessor content.

It would be great if we could use libclang for the interface translator, but maybe this is either not possible, or too much work.

If you are feeling up to the task of investigating this, then great! Thank you and good luck to you. I'll help you any way I can. If you spend some time on it and decide that it is too much trouble, I will certainly understand that, too.

xrme avatar Dec 27 '17 05:12 xrme

@xrme Thanks for the detailed observations. I need to study whether c2ffi generates output isomorphic to ffigen's, once I get ffigen4 working and can compare their output. It's a bit difficult to get a working gcc 4.0 in a current environment, but it's easier to do that in an old VM. But I will first try to patch a current gcc (7.2); if that works, at least we'll have a modern ffigen4 and can compare its output with c2ffi's.

c2ffi looks "relatively" stable, as it is only updated for new LLVM versions and hasn't changed the example JSON output for 4 years: https://github.com/rpav/c2ffi/blame/llvm-5.0.0/README.md But I'll contact Ryan Pavlik to see whether its API is stable (after making sure the output is isomorphic).

As for libclang, I did some searching, and there's a new flag for getting at C preprocessor content: https://stackoverflow.com/questions/13881506/retrieve-information-about-pre-processor-directives I agree with you that it would be a large amount of work to use libclang. libclang is interesting and I would like to learn it, but it will take some time.

ailisp avatar Dec 29 '17 06:12 ailisp

If you haven't already, it may be helpful to consult https://trac.clozure.com/ccl/wiki/BuildFFIGEN and also https://trac.clozure.com/ccl/wiki/CustomFramework

In particular, there's a Mac-specific ffigen branch. I don't know if it builds on an up-to-date system. I have an ffigen binary that works.

Also see http://svn.clozure.com/publicsvn/ffigen4/ (in particular the branches/ directory)

xrme avatar Dec 29 '17 07:12 xrme

Thanks for these guides. Today I tried to build it on Arch Linux, but gcc-4.0's makefile doesn't work with gcc-7.2. So I built it in a Fedora 4 VM, which has exactly gcc-4.0.0. The build is almost automatic, except that I need to give it the location of objc-act.c to patch it. Then I compared its output with c2ffi's. Input:

#define FOO (1 << 2)

const int BAR = FOO + 10;

typedef struct my_point {
    int x;
    int y;
    int odd_value[BAR + 1];
} my_point_t;

enum some_values {
    a_value,
    another_value,
    yet_another_value
};

void do_something(my_point_t *p, int x, int y);

c2ffi's output

[
{ "tag": "const", "name": "BAR", "location": "/home/rpav/test.h:3:11", "type": { "tag": ":int" }, "value": 14 },
{ "tag": "struct", "name": "my_point", "id": 0, "location": "/home/rpav/test.h:5:16", "bit-size": 544, "bit-alignment": 32, "fields": [{ "tag": "field", "name": "x", "bit-offset": 0, "bit-size": 32, "bit-alignment": 32, "type": { "tag": ":int" } }, { "tag": "field", "name": "y", "bit-offset": 32, "bit-size": 32, "bit-alignment": 32, "type": { "tag": ":int" } }, { "tag": "field", "name": "odd_value", "bit-offset": 64, "bit-size": 480, "bit-alignment": 32, "type": { "tag": ":array", "type": { "tag": ":int" }, "size": 15 } }] },
{ "tag": "typedef", "name": "my_point_t", "location": "/home/rpav/test.h:9:3", "type": { "tag": ":struct", "name": "my_point", "id": 0 } },
{ "tag": "enum", "name": "some_values", "id": 0, "location": "/home/rpav/test.h:11:6", "fields": [{ "tag": "field", "name": "a_value", "value": 0 }, { "tag": "field", "name": "another_value", "value": 1 }, { "tag": "field", "name": "yet_another_value", "value": 2 }] },
{ "tag": "function", "name": "do_something", "location": "/home/rpav/test.h:17:6", "variadic": false, "parameters": [{ "tag": "parameter", "name": "p", "type": { "tag": ":pointer", "type": { "tag": "my_point_t" } } }, { "tag": "parameter", "name": "x", "type": { "tag": ":int" } }, { "tag": "parameter", "name": "y", "type": { "tag": ":int" } }], "return-type": { "tag": ":void" } }
]

ffigen's output. I modified the struct definition slightly for ANSI C; otherwise ffigen complains "struct size is variant". A lot of (macro ...) lines are also omitted here.

(macro ("test.h" 1) "FOO" "(1 << 2)")
(var ("test.h" 3)
 "BAR"
 (int ()) (static))
(struct ("" 0)
 "my_point"
 (("x" (field (int ()) 0 4))
  ("y" (field (int ()) 4 4))
  ("odd_value" (field (array 5 (int ())) 8 20))))
(type ("test.h" 9)
 "my_point_t"
 (struct-ref "my_point"))
(enum ("" 0)
 "some_values"(("a_value" 0)("another_value" 1)("yet_another_value" 2)))
(enum-ident ("" 0)
 "a_value" 0)
(enum-ident ("" 0)
 "another_value" 1)
(enum-ident ("" 0)
 "yet_another_value" 2)
(function ("test.h" 17)
 "do_something"
 (function
  ((pointer (typedef "my_point_t")) (int ()) (int ()) )
  (void ())) (extern))
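To make the comparison concrete, here is a minimal sketch of the kind of reformatter discussed earlier: it turns one c2ffi "struct" JSON entry into the ffigen-style form. The function names and the tag table are hypothetical, it only covers the tags that appear in the two outputs above, and it assumes every field is byte-aligned (no bit-fields). Note that c2ffi reports offsets and sizes in bits while ffigen uses bytes.

```python
import json

# Hypothetical mapping from c2ffi primitive tags to ffigen type forms;
# only the tags seen in the example outputs are covered.
PRIMITIVES = {":int": "(int ())", ":void": "(void ())"}


def ffigen_type(c2ffi_type):
    """Render a c2ffi type object in the ffigen style shown above."""
    tag = c2ffi_type["tag"]
    if tag in PRIMITIVES:
        return PRIMITIVES[tag]
    if tag == ":array":
        return "(array %d %s)" % (c2ffi_type["size"],
                                  ffigen_type(c2ffi_type["type"]))
    if tag == ":pointer":
        return "(pointer %s)" % ffigen_type(c2ffi_type["type"])
    if tag == ":struct":
        return '(struct-ref "%s")' % c2ffi_type["name"]
    # Assumption: any other tag is a typedef name, as with "my_point_t".
    return '(typedef "%s")' % tag


def ffigen_struct(entry):
    """Convert one c2ffi "struct" entry; bits become bytes."""
    fields = []
    for f in entry["fields"]:
        fields.append('("%s" (field %s %d %d))' % (
            f["name"], ffigen_type(f["type"]),
            f["bit-offset"] // 8, f["bit-size"] // 8))
    return '(struct ("" 0)\n "%s"\n (%s))' % (entry["name"],
                                             "\n  ".join(fields))


# Trimmed copy of the c2ffi struct entry from the output above.
SAMPLE = json.loads("""
{ "tag": "struct", "name": "my_point", "fields": [
  { "tag": "field", "name": "x", "bit-offset": 0, "bit-size": 32,
    "type": { "tag": ":int" } },
  { "tag": "field", "name": "y", "bit-offset": 32, "bit-size": 32,
    "type": { "tag": ":int" } },
  { "tag": "field", "name": "odd_value", "bit-offset": 64, "bit-size": 480,
    "type": { "tag": ":array", "type": { "tag": ":int" }, "size": 15 } } ] }
""")

print(ffigen_struct(SAMPLE))
```

The array comes out as (array 15 ...) with size 60 here rather than the (array 5 ...) and 20 in the ffigen output, because the struct given to ffigen was modified as described above.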

For top-level variable, struct, typedef, enum, and function definitions, c2ffi's output contains enough information to build an FFI definition. What ffigen has but c2ffi doesn't is macro definitions, though c2ffi has an option -M to dump macro definitions to a separate file:

const long __c2ffi_FOO = FOO;

It doesn't really parse the macro definition, but this is a clever workaround: clang compiles this snippet, so c2ffi then knows the value of FOO and can convert it into a defconst. But to generate an ffigen-style (macro ("test.h" 1) "FOO" "(1 << 2)") I would need to patch c2ffi :-( I wondered how c2ffi would deal with macros like #define max(a,b) ((a)>(b)?(a):(b)), so I tried that too, and c2ffi simply outputs nothing for it. ffigen leaves a raw (macro ...) line, as expected. I also found that c2ffi needs to be updated for each new version of clang. So, based on your suggestions, my final plan is:

  1. Maintain ffigen's patches for the current gcc and maybe future gcc versions;
  2. Study libclang and c2ffi's source, and build a slightly different version that includes raw macro lines and outputs the ffi format; ideally it would also use libclang's new support for C preprocessor content.
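The -M workaround described above can be imitated in a few lines. This is only a sketch of the idea, not how c2ffi itself is implemented (c2ffi goes through clang's preprocessor, not a regular expression): for every object-like macro it emits a constant declaration so that a C compiler ends up evaluating the macro body. The function name is made up, and function-like macros such as max(a,b) are skipped, which matches the empty output observed above.

```python
import re

# Object-like macros only: the name must be followed by whitespace and a
# body. In "#define max(a,b) ..." the name is followed directly by "(",
# so the pattern does not match and the macro is skipped.
OBJECT_MACRO = re.compile(r"^\s*#\s*define\s+([A-Za-z_]\w*)\s+(\S.*)$")


def macro_constants(header_text):
    """Return one 'const long __c2ffi_NAME = NAME;' line per object-like macro."""
    lines = []
    for line in header_text.splitlines():
        match = OBJECT_MACRO.match(line)
        if match:
            name = match.group(1)
            lines.append("const long __c2ffi_%s = %s;" % (name, name))
    return lines


HEADER = """\
#define FOO (1 << 2)
#define max(a,b) ((a)>(b)?(a):(b))
"""

for decl in macro_constants(HEADER):
    print(decl)
```

A macro whose body is not an integer expression (a string, for example) would make the generated line fail to compile as written, so a real tool has to pick the declared type per macro; this sketch ignores that.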

ailisp avatar Dec 29 '17 22:12 ailisp

Thanks for that research. Your planned approach seems good.

xrme avatar Dec 30 '17 01:12 xrme

Hi @xrme. I made a little progress today. Also, who is gb in the svn log? I would rebase and use his name in git. Thanks! I made some minor changes to Makefile.in. Now it builds with a recent gcc, but you still need to download the gcc-4.0.0 source (in the gcc-4.0.0 branch). I tested building with gcc version 7.2.1 20171128 (GCC): https://github.com/ailisp/ffigen I am also trying to patch gcc-7.2.0 in the gcc-7.2.0 branch. However, building an unpatched gcc 7.2 took me ~2 hours, so progress is slow. If there is still no progress, I'll study libclang and c2ffi and work on a new ffigen.

ailisp avatar Dec 31 '17 04:12 ailisp

@ailisp: gb is Gary Byers. He doesn't have a GitHub id.

Don't feel pressured to get this done because I mentioned this issue from that FreeBSD 12 bug. I can always build an ffigen on an older system and copy it to a FreeBSD 12 system if I need to.

xrme avatar Dec 31 '17 04:12 xrme

@xrme Thanks. Recent progress: after reading ffigen.c, I found that its structure is a bit difficult to fit to libclang. libclang gives you the AST and you walk it, whereas the current ffigen.c is patched into gcc and runs during gcc's parsing step. Building a new version on libclang will be simpler than adapting the current ffigen.c. Sorry for this; the previous work from Gary, Helmut, and others is still quite helpful, and I'll attribute most of the contributions to them. libclang's current support for preprocessing information is still incomplete. As we know, even for an empty .h file there are hundreds of lines of #define __GNUC__ 4, #define __linux__ 1, etc. libclang can only get __GNUC__ but not 4, and the filename for these macros is NULL. For macros in specific files, such as /usr/include/stdio.h or a foo.h, it's not a problem: libclang can get the start/end locations of the macro definition and I can read it from the file myself. So I can get:

(macro ("test.h" 1) "FOO" "(1 << 2)") 

but not:

(macro ("" 1) "__GNUC__" ???)

??? is not accessible (because there is no file to read it from). After a long struggle, I'm a little ashamed to say that I can simply get these from clang -dM -E -x c /dev/null > predefined.h :-) Also, I'm delighted to find that I only need to produce the raw visible macro lines; parse-ffi.lisp takes care of recursive replacement, macros with arguments, and parsing and evaluating C expressions. It's really great work.
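As a rough illustration of the predefined-macro step, the sketch below turns the text of such a clang -dM -E dump into raw (macro ...) lines. It is a sketch under assumptions, not the ffigen5 code: the function names are made up, the ("" 1) location is copied from the __GNUC__ example above, quotes and backslashes are assumed to be backslash-escaped for the Lisp reader, and function-like macros are skipped because their ffigen form is not shown in this thread.

```python
def lisp_string(text):
    """Quote text as a Lisp string (assumed backslash escaping)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def macro_lines(dump_text, filename="", line=1):
    """Turn '#define NAME BODY' lines into ffigen-style raw macro lines."""
    out = []
    for raw in dump_text.splitlines():
        parts = raw.split(None, 2)
        if len(parts) < 2 or parts[0] != "#define":
            continue
        name = parts[1]
        if "(" in name:
            # Function-like macro; its ffigen form is not shown here, skip it.
            continue
        body = parts[2].strip() if len(parts) == 3 else ""
        out.append("(macro (%s %d) %s %s)" % (
            lisp_string(filename), line, lisp_string(name), lisp_string(body)))
    return out


# A two-line excerpt standing in for the contents of predefined.h.
DUMP = """\
#define __GNUC__ 4
#define __linux__ 1
"""

for form in macro_lines(DUMP):
    print(form)
```

The bodies are passed through as raw text on purpose, since parse-ffi.lisp is the part that expands and evaluates them.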

ailisp avatar Jan 03 '18 07:01 ailisp

Progress report: I have finished macros, enums, references to primitive types, part of references to pointer types, and variable definitions of primitive type: https://github.com/ailisp/ffigen5 While testing various kinds of pointer types, I found some very bad news about function pointers. When parsing void (*f)(void);, the original ffigen produces

(var ("test.h" 32)
 "f"
 (pointer (function
  ()
  (void ()))) (static))

But with libclang, I can first recognize that f is a pointer, but then clang_getPointeeType on this type returns a CXType_Unexposed, which means this information (the prototype of the function that f points to) is not exported by libclang. It can only be accessed through clang's C++ library LibTooling (which is also what c2ffi uses). But its introduction says: https://clang.llvm.org/docs/Tooling.html

Do not use LibTooling when you…: want a stable interface so you don’t need to change your code when the AST API changes

What else I can get from libclang is the raw string of f's type: void (*)(void) I'm thinking about 3 ways to handle this (all have some disadvantages):

  1. Isolate and wrap the required C++ part in a separate small library alongside libclang, which would need updating as clang updates. This requires additional maintenance in the future but is more general, and it's possible there are other features available only in LibTooling.
  2. Though libclang doesn't expose the function pointer's type, it does expose function declarations. Add a temporary line that replaces (*) with an internal name like ___g1234_, so I'm able to produce something like (function () (void ())). This also works if there are parameters. But if there are function pointer parameters, well, it gets a little messy.
  3. Ignore it and simply treat it as a void * pointer. From what I read in parse-ffi.lisp, doing this looks safe (maybe I lose something?). But I want the output to be at least as complete as the original ffigen's, so I don't like this way.

Any good ideas about this? Thanks!

ailisp avatar Jan 08 '18 05:01 ailisp

Thank you, @ailisp, for investigating this.

I really want to use the stable libclang interface if we possibly can.

Let's try your approach number 3. The C ABIs don't distinguish between a function pointer and any other generic pointer. Writing something like:

(#_qsort :address base :size_t nel :size_t width :address comp)

where comp is defined via defcallback seems fine to me. There's no way anyone is going to write out the type of the comp function in CCL's FFI notation (even if the notation supports function pointers, which I'm not sure it even does).
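The same point can be tried outside CCL. The following is a minimal sketch using Python's ctypes, assuming a POSIX system where the C library can be loaded this way; the helper names are made up. qsort is declared with the comparison argument as a plain address, just like the :address comp above, and the callback is cast to that address before the call.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# The full C type of the comparison function, needed only to build the callback.
COMPARE = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

# qsort declared the way the #_qsort call treats it: comp is just an address.
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, ctypes.c_void_p]
libc.qsort.restype = None


def compare_ints(a, b):
    """Three-way comparison of the ints that a and b point to."""
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_ints(values):
    """Sort a list of ints with libc's qsort and a Python callback."""
    array = (ctypes.c_int * len(values))(*values)
    callback = COMPARE(compare_ints)  # must stay referenced while qsort runs
    comp = ctypes.cast(callback, ctypes.c_void_p)  # function pointer as address
    libc.qsort(array, len(array), ctypes.sizeof(ctypes.c_int), comp)
    return list(array)


print(sort_ints([5, 1, 7, 33, 99]))
```

The function-pointer type is needed on the side that creates the callback, not in the declaration of qsort itself, which is the situation with defcallback as well.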

xrme avatar Jan 09 '18 00:01 xrme

Thank you! Sounds good since it doesn't affect how we use such callback in lisp. I also prefer a stable interface.

ailisp avatar Jan 09 '18 04:01 ailisp

Please also file a bug against clang if you can, I think they’d want to know that this information isn’t exposed.

eschaton avatar Jan 10 '18 20:01 eschaton

@eschaton Hi, thanks, and sorry for the late response. I was busy with an interview in San Francisco and just got back home. I posted a message to the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2018-January/056566.html. I haven't heard any replies, though. @xrme I'm mostly done with type references. I have a question about transparent unions. Does a transparent union mean something like:

struct {
    int a;
    union {
        int b;
        float c;
    };
};

or the gcc extension __attribute__((__transparent_union__))?

ailisp avatar Jan 18 '18 16:01 ailisp

Today I finished almost all of the C part. What's left is ObjC classes and categories. I have a question about function definitions. Around line 460 of ffigen.c:

      /* struct ffi_typeinfo *arg_type_info; */
        /*
          It seems like functions that take a fixed number of arguments
          have a "void" argument after the fixed arguments, while those
          that take an indefinite number don't.
          That's perfectly sane, but the opposite of what LCC does.
          So, if there's a "void" argument at the end of the arglist,
          don't emit it; if there wasn't, emit one.
          Sheesh.
        */

But what I tested seems to be the opposite. Given:

int af(int a, ...);
int bf(int a);

ffigen gives:

(function ("test.h" 62)
 "af"
 (function
  ((int ()) (void ()))
  (int ())) (extern))
(function ("test.h" 63)
 "bf"
 (function
  ((int ()) )
  (int ())) (extern))

Is this comment obsolete? I use the same behavior as ffigen.
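For reference, the observed behavior can be written down as a small formatter. This is a sketch, not the ffigen5 code: it takes a c2ffi-style "function" entry (the JSON shape shown earlier in this thread), appends a trailing (void ()) argument when the function is variadic, as in the af output above, and assumes the location should be printed as the file's base name plus the line number. All helper names are hypothetical.

```python
import os

# Only the primitive tags that appear in this thread are covered.
PRIMITIVES = {":int": "(int ())", ":void": "(void ())"}


def ffigen_type(c2ffi_type):
    """Render a c2ffi type object in ffigen style."""
    tag = c2ffi_type["tag"]
    if tag in PRIMITIVES:
        return PRIMITIVES[tag]
    if tag == ":pointer":
        return "(pointer %s)" % ffigen_type(c2ffi_type["type"])
    # Assumption: any other tag is a typedef name.
    return '(typedef "%s")' % tag


def ffigen_location(location):
    """Turn c2ffi's 'path:line:column' into ("basename" line)."""
    path, line, _column = location.rsplit(":", 2)
    return '("%s" %d)' % (os.path.basename(path), int(line))


def ffigen_function(entry):
    """Format a function; variadic ones get a trailing (void ()) argument."""
    args = [ffigen_type(p["type"]) for p in entry["parameters"]]
    if entry.get("variadic"):
        args.append("(void ())")
    return '(function %s\n "%s"\n (function\n  (%s)\n  %s) (extern))' % (
        ffigen_location(entry["location"]), entry["name"],
        " ".join(args), ffigen_type(entry["return-type"]))


# Hand-written entry for "int af(int a, ...);" in the c2ffi JSON shape.
AF = {"tag": "function", "name": "af", "location": "/tmp/test.h:62:5",
      "variadic": True,
      "parameters": [{"tag": "parameter", "name": "a",
                      "type": {"tag": ":int"}}],
      "return-type": {"tag": ":int"}}

print(ffigen_function(AF))
```

For bf the argument list comes out as ((int ())) without the space that ffigen prints before the closing parenthesis; the Lisp reader treats both the same.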

ailisp avatar Jan 19 '18 23:01 ailisp

@ailisp I've had a chance to experiment with your code, and it looks very promising. It is so helpful that you figured out so much of the libclang API. Thank you very much.

I need to generate a new set of interface databases from FreeBSD 12 header files. I spent part of today hacking on and using (my private fork of) your libclang-based ffigen, and I think it's going to work. I'm planning to spend the next two days on this and see how far I get. Starting Thursday, I'll be away for two weeks and probably won't have a chance to do very much hacking on CCL, but I am hoping that two days will be enough time to get it done.

FreeBSD will be a good start because we won't have to worry about dealing with Objective-C.

xrme avatar Feb 20 '18 05:02 xrme

Hi @xrme, I'm so glad you found it useful. I'm not sure if you forked my most recent version; it now supports system include paths and works with h-to-ffi.sh. I manually compared its output for elf.h with ffigen4's. It mostly looks good, except that some macro definitions come out as random Latin-1 characters; I guess that's caused by encoding. Another issue is that it's not aware of __attribute__((__transparent_union__)) (I couldn't find anything in libclang to detect that). Neither is ffigen4. Maybe ffigen4/gcc4 uses a different syntax for that? Thanks for continuing to work on it. I plan to add the ObjC part after all of the C part works, so I'll try to do that after your upcoming days of work. Also, sorry for the delay. These days I was busy preparing for and taking interviews, and I just got my first job after graduation.

ailisp avatar Feb 20 '18 14:02 ailisp

Any further progress on this?

GOFAI avatar Sep 11 '18 05:09 GOFAI

Hi @GOFAI, @xrme has a further update in https://github.com/xrme/ffigen5; I'm not sure whether it is fully working.

ailisp avatar Sep 11 '18 15:09 ailisp

I have gotten far enough with a new ffigen to be able to generate working headers for FreeBSD. I have been meaning to track down that code and check it in, but I haven't done that yet. I will try to do that soon.

xrme avatar Sep 11 '18 15:09 xrme

I'm particularly interested in generating interface files for the newer macOS frameworks like SceneKit. How complete is the ObjC functionality?

GOFAI avatar Sep 12 '18 02:09 GOFAI

Unfortunately I don't know much Obj-C, so the Obj-C part is not even started. You'll probably want to look at https://github.com/rpav/cl-autowrap and https://github.com/rpav/c2ffi, and at ffigen4 if it works.

ailisp avatar Sep 12 '18 20:09 ailisp

Has anyone gotten ffigen4 to compile on macOS using a recent Xcode? The ObjC blocks version (ffigen-apple-gcc-5646/ffigen4) aborts compilation with the following errors:

../../gcc-5646/gcc/toplev.c:564:1: error: redefinition of a 'extern inline'
      function 'floor_log2' is not supported in C99 mode
floor_log2 (unsigned HOST_WIDE_INT x)
^
../../gcc-5646/gcc/toplev.h:174:1: note: previous definition is here
floor_log2 (unsigned HOST_WIDE_INT x)
^
../../gcc-5646/gcc/toplev.c:599:1: error: redefinition of a 'extern inline'
      function 'exact_log2' is not supported in C99 mode
exact_log2 (unsigned HOST_WIDE_INT x)
^
../../gcc-5646/gcc/toplev.h:180:1: note: previous definition is here
exact_log2 (unsigned HOST_WIDE_INT x)
^

I'd try compiling it using the Homebrew formula that provides Apple's gcc 4.2.1-5666.3, but it only works on OS X 10.9 or older.

GOFAI avatar Sep 14 '18 07:09 GOFAI

It seems there are a lot of errors about "not supported in C99 mode". What about trying clang's -std=c89 or -std=gnu89 flag? https://clang.llvm.org/docs/UsersManual.html#differences-between-various-standard-modes

ailisp avatar Sep 14 '18 13:09 ailisp

I've managed to compile ffigen4 under macOS 10.13 using the gcc provided by the [email protected] brew formula. I'm not sure, however, whether the h-to-ffi.sh is broken or if I need to change something else in the populate.sh file to get it to work. Once I point it at the current SDK, it seems to always choke on the following (many previous lines omitted):

Need to create info for type:
 <real_type 0x1034ec370 NSTimeInterval sizes-gimplified DF
    size <integer_cst 0x141801d80 type <integer_type 0x1418130b0 bit_size_type> constant invariant 64>
    unit size <integer_cst 0x141801db0 type <integer_type 0x141813000 long unsigned int> constant invariant 8>
    align 64 symtab 8117 alias set -1 precision 64>
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/Foundation.framework/Headers/NSDate.h:26: confused by earlier errors, bailing out

Any ideas about how to interpret this? I seem to get stuck on this same error both for newer frameworks like SceneKit as well as older ones like OpenGL that ffigen4 should obviously be able to handle.

GOFAI avatar Sep 16 '18 09:09 GOFAI

Newer SDKs include constructs in the header files that the old ffigen/gcc doesn't understand.

xrme avatar Sep 16 '18 15:09 xrme