
Compiler for Carp

Open firebolt55439 opened this issue 10 years ago • 3 comments

I have taken a look at the CARP instruction set and the basic implementation. I consider myself to be proficient in C and C++, and am volunteering to create a compiler fork/branch.

Suggestions:

  1. Write the compiler in C++ using LLVM as a backend, allowing for maximum flexibility.
  2. Compile CARP to C and leverage existing C compilers.

A note: When compiling to another language, for example LLVM IR or C, constant folding is trivial at compile time due to the inherent nature of a stack VM. The compiler will have to make tradeoffs between how many operations (e.g. addition, bitwise operations, etc.) it performs at compile time and how many it relegates to the backend for code generation.

firebolt55439, Aug 04 '14 18:08

Hi firebolt55439,

Excellent, though I would prefer C to C++. Is there not an API for C?

Your note makes sense.

tekknolagi


tekknolagi, Aug 04 '14 18:08

Hi tekknolagi,

There is indeed an API for C, but there are drawbacks to using it. Its intended purpose was to expose the API to other programming languages that have a hard time interoperating with C++ (name mangling, namespace relocation, etc.) and find it easier to call C functions. Basically, it takes all of the C++ classes provided and exposes to C a "reference" to each of them, or rather a data structure that is almost a serialization of the class and can be converted to and from a C++ class (see below).

It provides C++ functions called 'wrap' and 'unwrap': 'unwrap' takes a C-compatible reference and returns a pointer to the C++ object, and 'wrap' does the reverse. For a compiler for CARP, my recommendation would be to write a wrapper in C++ and write the rest in C. The wrapper should "abstract away" certain details (e.g. llvm::BasicBlock*'s for labels, "stack-based" evaluation, etc.).

Another thing (or two): LLVM IR, the language which all LLVM code from front-ends (e.g. clang, clang++, etc.) gets compiled down to, is register-based. The problem arises with the question: "How do you write a program that takes a language describing a stack machine and translate that into instructions for a register machine?" Once again, the question is how much you want to make the compiler evaluate at compile-time or leave to run-time. LLVM does offer powerful constant folding and incredibly refined optimization passes to use, and they do things like peephole optimizations, strength reduction, and more. They should definitely be taken advantage of.
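One common answer to the stack-to-register question is to keep a stack at compile time that holds virtual register names instead of values. The sketch below illustrates that idea in plain C. The instruction set (`OP_PUSH`, `OP_ADD`, `OP_MUL`, `OP_PRINT`), the `Instr` struct, and the textual output format are all hypothetical stand-ins, not CARP's real encoding and not LLVM IR syntax. It only handles straight-line code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini instruction set, not CARP's actual opcodes. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT } OpCode;

typedef struct {
    OpCode op;
    long   arg; /* only meaningful for OP_PUSH */
} Instr;

#define MAX_STACK 64

/*
 * Translate stack code into three-address text, giving every produced
 * value a fresh virtual register. The "stack" here exists only at
 * compile time and holds register numbers, not runtime values.
 * Returns the number of characters written, or -1 on stack underflow,
 * stack overflow, an unknown opcode, or a full output buffer.
 */
int stack_to_registers(const Instr *code, size_t n, char *out, size_t cap)
{
    int stack[MAX_STACK];
    size_t sp = 0;
    int next_reg = 0;
    size_t len = 0;
    char line[64];

    assert(out != NULL && cap > 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_PUSH:
            if (sp == MAX_STACK) return -1;
            snprintf(line, sizeof line, "r%d = const %ld\n",
                     next_reg, code[i].arg);
            stack[sp++] = next_reg++;
            break;
        case OP_ADD:
        case OP_MUL: {
            if (sp < 2) return -1;
            int rhs = stack[--sp]; /* top of stack is the right operand */
            int lhs = stack[--sp];
            snprintf(line, sizeof line, "r%d = %s r%d, r%d\n", next_reg,
                     code[i].op == OP_ADD ? "add" : "mul", lhs, rhs);
            stack[sp++] = next_reg++;
            break;
        }
        case OP_PRINT:
            if (sp < 1) return -1;
            /* print peeks at the top of the stack without popping */
            snprintf(line, sizeof line, "print r%d\n", stack[sp - 1]);
            break;
        default:
            return -1;
        }
        size_t w = strlen(line);
        if (len + w + 1 > cap) return -1;
        memcpy(out + len, line, w + 1); /* copy including the terminator */
        len += w;
    }
    return (int)len;
}
```

Because each value gets its own register and is assigned exactly once, the output is already close to the single-assignment form that a register-based IR expects. Handling jumps and labels would need more work, since the compile-time stack has to agree at every point where control flow merges.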

P.S. At the moment, most of CARP's instruction set can be evaluated at compile time. The only parts that "really" need to be compiled down to LLVM IR or similar would be prints and registers (both of which are easy with LLVM).

So the compilation process would look something like this (in pseudo-code):

Input:

    push 1
    push 2
    add
    print top of stack

Compiler:

    1st pass:
        push 3
        print top of stack
    2nd pass: (generates AST - though that may not be needed)
    3rd pass: (goes to LLVM backend and generates code)

Output:

    call puts (or printf) with input: "3"
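The first pass above can be sketched as a small peephole fold in C. This is only an illustration under assumptions: the `FoldInstr` struct and the `F_PUSH`/`F_ADD`/`F_MUL`/`F_PRINT` opcodes are hypothetical, not CARP's real instruction encoding, the code is straight-line (no jumps or labels), and integer overflow is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini instruction set, not CARP's actual opcodes. */
typedef enum { F_PUSH, F_ADD, F_MUL, F_PRINT } FoldOp;

typedef struct {
    FoldOp op;
    long   arg; /* only meaningful for F_PUSH */
} FoldInstr;

/*
 * Fold "push a; push b; add|mul" into a single push of the result.
 * `out` must have room for n instructions. Returns how many
 * instructions were written to `out`.
 *
 * If the last two instructions emitted so far are both pushes, they
 * are exactly the top two stack values when the binary operation runs,
 * so the operation can be evaluated at compile time.
 */
size_t fold_constants(const FoldInstr *in, size_t n, FoldInstr *out)
{
    size_t m = 0;

    assert(out != NULL);

    for (size_t i = 0; i < n; i++) {
        int binary = (in[i].op == F_ADD || in[i].op == F_MUL);

        if (binary && m >= 2 &&
            out[m - 1].op == F_PUSH && out[m - 2].op == F_PUSH) {
            long a = out[m - 2].arg;
            long b = out[m - 1].arg;
            /* Assumption: the result fits in a long (no overflow check). */
            out[m - 2].arg = (in[i].op == F_ADD) ? a + b : a * b;
            m--; /* two pushes collapse into one */
        } else {
            out[m++] = in[i]; /* nothing to fold, copy through */
        }
    }
    return m;
}
```

Run on the input above, `push 1; push 2; add; print` comes out as `push 3; print`. Since a folded result is itself a push, chains such as `push 2; push 3; push 4; mul; add` collapse in a single sweep.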

firebolt55439, Aug 05 '14 18:08

Perhaps, then, LLVM is not the project for this.

On Tue, Aug 5, 2014 at 11:27 AM, firebolt55439 [email protected] wrote:

Hi tekknolagi,

There is indeed an API for C, but there are drawbacks to using it - its intended purpose was to expose the API to other programming languages which have a hard time inter-oping with C++ (name mangling, namespace relocation, etc.), and find it easier to call C functions.

It provides C++ functions called 'wrap' and 'unwrap' which take a C-compatible reference and return a class, and vice versa. For a compiler for CARP, my recommendation would be to write a wrapper in C++ and write the rest in C. The wrapper should "abstract away" certain details (e.g. llvm::BasicBlock*'s for labels, "stack-based" evaluation, etc.).

Another thing (or two): LLVM, formerly Low Level Virtual Machine, became an umbrella project - but it retained aspects of its previous incarnation. For example, LLVM IR, the language which all LLVM code from front-ends (e.g. clang, clang++, etc.) gets compiled down to, is register-based. The problem arises with the question: "How do you write a program that takes a language describing a stack machine and translate that into instructions for a register machine?" Once again, the question is how much you want to make the compiler evaluate at compile-time or leave to run-time. LLVM does offer powerful constant folding and incredibly refined optimization passes to use, and they do things like peephole optimizations, strength reduction, and more. They should definitely be taken advantage of.

— Reply to this email directly or view it on GitHub https://github.com/tekknolagi/carp/issues/12#issuecomment-51239016.

tekknolagi, Aug 05 '14 18:08