irutil: add stdlib package for C standard library declarations

Open mewmew opened this issue 4 years ago • 10 comments

As suggested in https://github.com/llir/llvm/pull/187#issuecomment-860148771:

I would say the biggest difficulty was the C standard library - it would be super cool to have a stdlib package, with bindings to the standard library. In the compiler/builtins.go file I wrote out the function signatures for a bit of the stdlib, but it would be awesome to automatically have bindings to the whole C stdlib.

Add a new irutil/stdlib package containing function (and global variable) declarations for interacting with the C standard library.

We should consider generating these automatically, perhaps by using the LLVM compiler to parse the C standard library headers and emit LLVM IR function (and global variable) declarations.

Then, we could parse the LLVM IR output using llir/llvm/asm to get the llir/llvm/ir representation to interact with.

This will require some experimentation to find an approach that works well and is easy to work with.

Edit: related issues #22, #178, #180.

mewmew avatar Jun 13 '21 11:06 mewmew

@mewmew Would you like to create the issue in llir/irutil?

dannypsnl avatar Jun 13 '21 17:06 dannypsnl

Maybe we will also need LLVM built-in variables/functions? The C API is a bit dangerous: its functions are added by the linker and will not always be there.

dannypsnl avatar Jun 13 '21 17:06 dannypsnl

@mewmew Would you like to create the issue in llir/irutil?

Let's keep all issues in one tracker for now (in llir/llvm/issues). It's easier to get an overview of the entire llir project that way.

Maybe we will also need LLVM built-in variables/functions? The C API is a bit dangerous: its functions are added by the linker and will not always be there.

We can add two packages: irutil/stdlibc for C function declarations (e.g. printf) and irutil/intrinsic for official LLVM intrinsic functions (e.g. @llvm.pow.f32).

mewmew avatar Jun 13 '21 17:06 mewmew

I have implemented some of the most used stdlibc functions at https://github.com/Nv7-GitHub/bpp/blob/ff2d32542a2b493cc5eaa7e75f349371d9e99111/old/compiler/builtins.go#L46 if that would be any help

Nv7-GitHub avatar Sep 30 '21 00:09 Nv7-GitHub

I have implemented some of the most used stdlibc functions at https://github.com/Nv7-GitHub/bpp/blob/ff2d32542a2b493cc5eaa7e75f349371d9e99111/old/compiler/builtins.go#L46 if that would be any help

Definitely helpful. I think we can evaluate a few different approaches before settling on the one to use. I would also like us to consider how to maintain stdlibc as those headers are updated and, e.g., new functions are added. It may be possible to automatically generate LLVM IR code by parsing the standard header files. Or perhaps that would be a crazy idea. Still, it is worth investigating, as that would help with maintenance as well.

Cheers, Robin

mewmew avatar Sep 30 '21 02:09 mewmew

Can clang produce LLVM from headers?

Nv7-GitHub avatar Sep 30 '21 05:09 Nv7-GitHub

Can clang produce LLVM from headers?

At least using a tiny dummy C file including the headers.

Input (dummy) C file:

#include <stdio.h>
#include <string.h>

void *foo1 = printf;
void *foo2 = memcpy;

Run:

clang -S -emit-llvm -o a.ll a.c

Output LLVM IR:

; ModuleID = 'a.c'
source_filename = "a.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@foo1 = dso_local global i8* bitcast (i32 (i8*, ...)* @printf to i8*), align 8
@foo2 = dso_local global i8* bitcast (i8* (i8*, i8*, i64)* @memcpy to i8*), align 8

declare i32 @printf(i8*, ...) #0

; Function Attrs: nounwind
declare i8* @memcpy(i8*, i8*, i64) #1

attributes #0 = { "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{!"clang version 12.0.1"}

The output LLVM IR contains the function declarations for memcpy and printf.
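
The dummy-file trick above could be scaled to a longer list of functions by generating the C file. What follows is only a minimal sketch in Go using the standard library; `GenDummyC` is a hypothetical helper name, not something provided by llir/llvm or irutil, and the `fooN` variable names simply mirror the hand-written file.

```go
package main

import (
	"fmt"
	"strings"
)

// GenDummyC returns the source of a throwaway C file that includes the given
// headers and takes the address of each listed function, so that clang keeps
// a declaration for every one of them in the emitted LLVM IR.
//
// GenDummyC is a hypothetical helper for illustration only; it is not part of
// llir/llvm or irutil.
func GenDummyC(headers, funcs []string) string {
	var sb strings.Builder
	for _, h := range headers {
		fmt.Fprintf(&sb, "#include <%s>\n", h)
	}
	sb.WriteString("\n")
	for i, f := range funcs {
		// Mirrors `void *foo1 = printf;` from the hand-written dummy file.
		fmt.Fprintf(&sb, "void *foo%d = %s;\n", i+1, f)
	}
	return sb.String()
}

func main() {
	src := GenDummyC(
		[]string{"stdio.h", "string.h"},
		[]string{"printf", "memcpy"},
	)
	// Save the result as a.c, then run: clang -S -emit-llvm -o a.ll a.c
	fmt.Print(src)
}
```

For the two headers and two functions shown, this reproduces the dummy C file from the comment. The list of function names still has to come from somewhere, which this sketch does not address.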

mewmew avatar Sep 30 '21 08:09 mewmew

Perhaps by getting a list of stdlib functions and then doing this for the whole stdlib, we could get a list of function declarations, which we could then parse using this library to get the output.
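
As a rough illustration of the "get a list of function declarations" step, the sketch below pulls the `declare` lines out of clang's textual output. It is a plain line filter built on the Go standard library, standing in for a real parse with llir/llvm/asm, and `Declarations` is a hypothetical name. It also drops trailing attribute group references such as `#0`, on the assumption that the `attributes` lines are not kept.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleIR is an excerpt of the clang output shown earlier in the thread.
const sampleIR = `@foo1 = dso_local global i8* bitcast (i32 (i8*, ...)* @printf to i8*), align 8

declare i32 @printf(i8*, ...) #0

; Function Attrs: nounwind
declare i8* @memcpy(i8*, i8*, i64) #1
`

// isDigits reports whether s is a non-empty string of ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// Declarations returns the `declare` lines found in LLVM IR assembly text,
// with any trailing attribute group reference (e.g. " #0") removed.
//
// This is a hypothetical line filter for illustration, not an IR parser; it
// assumes each declaration sits on a single line, as in clang's output.
func Declarations(llvmIR string) []string {
	var decls []string
	sc := bufio.NewScanner(strings.NewReader(llvmIR))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "declare ") {
			continue
		}
		if i := strings.LastIndex(line, " #"); i >= 0 && isDigits(line[i+2:]) {
			line = line[:i]
		}
		decls = append(decls, line)
	}
	return decls
}

func main() {
	for _, d := range Declarations(sampleIR) {
		fmt.Println(d)
	}
}
```

The filtered lines form a small, self-contained module of declarations that could then be handed to a proper parser.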

But how would the API look? A map of function name to function? A bunch of global variables with the functions? A function to get a stdlib fn by name?

Nv7-GitHub avatar Sep 30 '21 13:09 Nv7-GitHub

But how would the API look? A map of function name to function? A bunch of global variables with the functions? A function to get a stdlib fn by name?

Good question. I'm not quite sure what the best API would be. How would you envision yourself wishing to use it @Nv7-GitHub? Perhaps that could help guide API design.

Cheers, Robin

mewmew avatar Sep 30 '21 13:09 mewmew

I'm pretty sure a map of name to function would be best; we can simply modify Module to do so.
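
To make the map-based option concrete, here is one possible shape, sketched with the Go standard library only. Everything in it is an assumption for illustration: `FuncDecl` is a hypothetical stand-in for whatever function type the real package would expose (presumably the function type from llir/llvm/ir), and `Funcs` and `Lookup` are made-up names. It does not show how Module itself would be modified.

```go
package main

import (
	"fmt"
	"strings"
)

// FuncDecl is a hypothetical stand-in for a declared C function. A real
// irutil package would presumably hand out llir/llvm/ir values instead.
type FuncDecl struct {
	Name     string
	RetType  string
	Params   []string
	Variadic bool
}

// String renders the declaration in LLVM IR assembly syntax.
func (f *FuncDecl) String() string {
	params := append([]string(nil), f.Params...)
	if f.Variadic {
		params = append(params, "...")
	}
	return fmt.Sprintf("declare %s @%s(%s)", f.RetType, f.Name, strings.Join(params, ", "))
}

// Funcs maps C function names to their declarations (the "map of name to
// function" option). Only the two functions from the thread are listed.
var Funcs = map[string]*FuncDecl{
	"printf": {Name: "printf", RetType: "i32", Params: []string{"i8*"}, Variadic: true},
	"memcpy": {Name: "memcpy", RetType: "i8*", Params: []string{"i8*", "i8*", "i64"}},
}

// Lookup covers the "function to get a stdlib fn by name" option on top of
// the same map; ok is false when the name is unknown.
func Lookup(name string) (decl *FuncDecl, ok bool) {
	decl, ok = Funcs[name]
	return decl, ok
}

func main() {
	if f, ok := Lookup("printf"); ok {
		fmt.Println(f)
	}
	if _, ok := Lookup("no_such_function"); !ok {
		fmt.Println("no_such_function: not declared")
	}
}
```

A map keeps the surface small and lets callers iterate over everything, while a lookup function leaves room to create declarations lazily. Which of the two fits better depends on how callers end up using the package.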

dannypsnl avatar Sep 30 '21 17:09 dannypsnl