Improve the Language Guide

Open eriksvedang opened this issue 4 years ago • 13 comments

  • [ ] Needs an index at the top
  • [ ] A more gentle intro on how to call and define functions would be good, early on the page
  • [ ] More links to sub pages with more detailed information (like the Macros doc, etc)
  • [ ] Some sections are very sparse compared to the nice descriptions in other headings
  • [ ] Lambdas are barely mentioned

eriksvedang avatar Nov 28 '20 12:11 eriksvedang

Here are some things I miss as a newbie:

  • a "Hello Carp" either in Manual.md or in Install.md to test that the installation is working
  • a short example using (fmt) and (fstr)
  • for each standard library (in "Core Docs") a short summary what it is about (e.g. why/when to use an Array vs. StaticArray ("stack allocated arrays... are called StaticArray") )
  • a polite note saying: "Do not be afraid of the type signatures, they're really useful and not that hard." (e.g. if you try to filter an Array with a simple function, the parameter will be borrowed)
  • a polite nudge to go read the documentation. e.g. point out in LanguageGuide.md that (info) does not only work for dynamic functions but for libraries & functions as well.
  • adjust example for structs using a string (e.g. (deftype Person [name String, age Int]))
  • state that the "void" type is: () (e.g. when describing function signatures)
  • a short statement WHERE definitions can be put (e.g. I found no way to nest one defn inside another defn)
  • a short note which forms return a value or (if that is easier) list forms returning () (e.g. set!)
  • a brief note that (and) and (or) "Special forms during evaluation of dynamic code" take only 2 arguments (e.g. in contrast to Scheme) (and refer to the documentation of the Bool library for the same reason)
  • expand explanation on "named holes" or point to implementation (if that should be self explanatory) e.g. does it always have to be "?woot" or what else can be used as a named hole?
  • affirm that in order to access the value behind a reference you need to copy it (e.g. use the @ reader macro)
  • a simpler way to use Carp on Windows (for me Zig does the trick and was pretty easy to set up)
  • a complete list of reader macros (e.g. quasiquoting is missing even from description of macros)
  • a list of builtins (e.g. what is not included from carp files)
  • an example of how to change the REPL prompt
  • give an example of where to put profile.carp on Linux (for occasional Linux users like myself)
  • describe the order in which modules are loaded (thereby answering which forms can be used in profile.carp)
  • an example of how to process the "single file with C code". One use case might be to use gcc, which insists on having the command line argument for libraries AFTER the source file (to be compiled). This is an issue since we always need to include libm for the trigonometric functions.

edit - stuff that needs to be documented:

  • there's a macro to ignore return values: (ignore ...) (e.g. inside a (do))
  • explain print*, println* and build-str macros (e.g. in HelloWorld.carp)
  • from (Main) we can either return () which means an exit code of 0, or explicitly return an exit code as an Int.
  • if you get => in front of the result, it's output from a dynamic function (e.g. Dynamic.String.slice)
  • advocate TCC to speed up the REPL

guberathome avatar May 26 '21 11:05 guberathome

@guberathome Thanks a lot for this list – great stuff!

eriksvedang avatar May 26 '21 11:05 eriksvedang

My Carp installation on Windows 10 was set up pretty quickly:

  1. Downloaded Zig for Windows (version 0.8.0) and extracted it to: D:\Tools\p\zig
  2. Downloaded Carp (version 0.5.0) and extracted it to D:\Tools\p\carp
  3. created a batch file which starts a command shell with the required settings
SET PATH=D:\Tools\p\carp\bin;D:\Tools\p\zig;%PATH%
SET CARP_DIR=D:\Tools\p\carp
cmd
  4. started the shell and verified that Zig is working as a C compiler
// content of hello.c
#include <stdio.h>
int main()
{
	printf("Hello Windows from Zig :-)\n");
	return 0;
}
zig cc --target=x86_64-windows-gnu -o hello.exe hello.c
hello.exe
  5. created %USERPROFILE%\AppData\Roaming\carp\profile.carp
(Project.config "compiler" "zig cc --target=x86_64-windows-gnu")
(Project.config "title" "Untitled.exe")
  6. verified that Carp is working
; content of hello.carp
(defn main []
	(IO.println &(fmt "Hello there: it took us only %d iterations to approximate e=%f" 123 2.71828 )) )
carp -x hello.carp

Note: I am aware of how to change environment variables globally or per user, but that was intentionally avoided.

Note: This should be a nice setup to try out Carp with minimal investment (e.g. no global changes to System, only <250MBytes on disk). If you need functionality from external dll files it might be easier to use the "standard solution" (e.g. Clang + MS linker).

guberathome avatar May 26 '21 12:05 guberathome

Carp 0.5.0 seems to work fine with GCC under Windows.

The command file to start the shell looks like:

SET CARP_DIR=D:\Tools\p\carp
SET GCC_DIR=D:\Tools\p\nim\mingw\bin
SET PATH=%CARP_DIR%\bin;%GCC_DIR%;%PATH%

Obviously profile.carp needs to be changed to:

(Project.config "compiler" "gcc")
; we no longer need (Project.config "title" "Untitled.exe")
;    since gcc outputs an Untitled.exe when supplied with "-o out\Untitled" option for MinGw builds (apparently)

So we don't need to link against libm (or libmath) like we do on Linux.

The GCC version used was taken from a Nim installation (which seems to install a MinGw build):

gcc (x86_64-win32-seh-rev1, Built by MinGW-W64 project) 6.3.0

At this point you are probably not interested in the fact that a regular build produced an executable, which omits part of the output, whereas everything's fine if the --optimize option is supplied to carp (which translates to -O3 for the gcc command)...

guberathome avatar May 26 '21 17:05 guberathome

Here are some points that I'd like to document. Comments are welcome.

open points

  1. give example of working in the REPL
    • e.g. read content from a file and test what Pattern.split returns
  2. IO.carp
    • explain difference between
      • (read-file): e.g. read entire content of a file ("That's probably what you want...")
      • (read->EOF): e.g. read until EOF char which won't work for binary files (does it work for texts cross-platform?)
    • (fread)
      • read into "buffer of (what?)" not into a "pointer"!
      • give a short example (either here or in Carp implementation). Or migrate (IO.read-file) from C to Carp
      • if possible implement (write-file)!
  3. Pointer.carp
    • (unsafe-alloc): point out how to safely allocate memory in Carp (since Pointer.alloc does not exist!)
  4. explain how to use the value behind a reference without copying it (read that in the sources somewhere)
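
The difference between reading a whole file and stopping at an end-of-file character can be illustrated at the C level, which is what Carp compiles to. The following is only a sketch and not Carp's actual implementation of (read-file), (read->EOF) or (fread); `read_entire_file` is a hypothetical helper name. It opens the file in binary mode and relies on the byte count returned by fread, so bytes such as 0x00 or 0x1A (Ctrl-Z) inside the data do not end the read early.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Carp): reads a whole file into a
 * heap-allocated buffer and stores the number of bytes in *out_len.
 * Returns NULL on failure; the caller must free() the result. */
unsigned char *read_entire_file(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    /* "rb" = binary mode: no newline or Ctrl-Z translation on Windows */
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    size_t cap = 4096, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL) { fclose(f); return NULL; }

    for (;;) {
        size_t n = fread(buf + len, 1, cap - len, f);
        len += n;
        if (n == 0) break;              /* end of file or read error */
        if (len == cap) {               /* buffer full: double it */
            unsigned char *grown = realloc(buf, cap * 2);
            if (grown == NULL) { free(buf); fclose(f); return NULL; }
            buf = grown;
            cap *= 2;
        }
    }

    if (ferror(f)) { free(buf); fclose(f); return NULL; }
    fclose(f);
    *out_len = len;
    return buf;
}
```

The caller receives a buffer plus a length instead of a NUL-terminated string, which is what makes this kind of read usable for binary files.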

explanations about to be documented

(none so far)

guberathome avatar May 30 '21 06:05 guberathome

@guberathome Looks like good things to clarify. Did you intend to extend the docstrings for those functions then?

eriksvedang avatar May 31 '21 08:05 eriksvedang

That would probably be a good place to put it. Is there a suspected limit of how long they may get?

guberathome avatar May 31 '21 16:05 guberathome

  • updated:
    • fiddling with compiler options shows significant effect :-/
    • include size of running process to indicate size of linked shared libraries: fs=size of file, rp=size of running process in memory

There are 2 good reasons (apart from installation size and (really) easy cross-compilation) to support compilation of Carp with ZIG: executable size and ease of static linking against MUSL:

file sizes on Windows (shared libraries as listed by 'dumpbin /dependents'):

  • fs=323k: main.c
  • fs=203k, rp=384k: test2-clang.exe (KERNEL32.dll)
  • fs=203k, rp=384k: test2-msvc.exe (KERNEL32.dll)
  • fs=046k, rp=440k: test2-zig-glibc.exe (KERNEL32.dll, msvcrt.dll)
  • fs=038k, rp=640k: test2-gcc.exe (KERNEL32.dll, msvcrt.dll, USER32.dll)

file sizes on Linux after using strip command (shared libraries as listed by ldd):

  • fs=323k: main.c (smaller on Linux due to line breaks)
  • fs=036k, rp=004k: test2-zig-musl.exe () => none!
  • fs=014k, rp=582k: test2-gcc.exe (libc.so.6, libm.so.6, linux-vdso.so.1, /lib64/ld-linux-x86-64...)
  • fs=009k, rp=848k: test2-clang.exe (libc.so.6, libm.so.6, linux-vdso.so.1, /lib64/ld-linux-x86-64...)
  • fs=009k, rp=936k: test2-zig-glibc.exe (libc.so.6, libdl.so.2, libm.so.6, libpthread.so.0, librt.so.1, libutil.so.1, linux-vdso.so.1, /lib64/ld-linux-x86-64...)

The compilation speed however seems significantly faster with gcc (including all its quirks). Runtime performance has to be determined but so far seems on par with gcc/Clang.

The source for above test was simply this:

(Project.config "title" "test2.exe")
(Project.config "compiler" "zig cc --target=x86_64-linux-musl") ; or "zig cc --target=x86_64-linux-gnu") ; or...
; C compiler flags were adjusted per compiler for minimum file size, see below
(Project.config "echo-compiler-cmd" true)

(defn user-input [prompt]
  (do
    (IO.print prompt)
    (IO.fflush NULL)  ; needed to flush STDOUT on MUSL
    (IO.get-char) ))

; copy file content from args[1] to args[2]
(defn main []
  (let [ dummy (user-input "press <ENTER> to start...")
         args  (Int.- (StaticArray.length &System.args) 1) ]    ; 1st argument is executable!
    (if (not (Int.= 2 args))
      (IO.errorln &(fmt "expecting exactly 2 arguments (got %d):\n\ttest2 input-file output-file" args))
      (let [  inFile  (StaticArray.unsafe-nth &System.args 1)
              outFile (StaticArray.unsafe-nth &System.args 2)
              sInput? (NewIO.safe-read-file inFile)  ]
        (if (Result.error? &sInput?)
          (IO.errorln &(fmt "error='%s' reading input from file='%s'" &(Result.unsafe-from-error sInput?) inFile))
          (let [  sInput  &(Result.unsafe-from-success sInput?)
              written (NewIO.write-file sInput outFile) ]
            (if (Result.error? &written)
              (IO.errorln &(fmt "error='%s' writing output to file='%s'" &(Result.unsafe-from-error written) outFile))
              (IO.println &(fmt "copied '%s' to '%s'" inFile outFile))  )))))))

compiler options:

  • clang: (Project.config "cflag" "-Os -flto -fuse-ld=lld") => on Linux I had to change -Os to -O2
  • MSVC: (Project.config "cflag" "/O1") ; Windows only!
  • zig+glibc: (Project.config "cflag" "-Os") ; note: adding -flto or -s has no effect
  • gcc: (Project.config "cflag" "-Os -flto -s")
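
The (IO.fflush NULL) call in the test source above presumably maps to C's fflush. C libraries differ in whether a prompt without a trailing newline becomes visible before the program blocks on input, and the comment in the source reports that the explicit flush was needed with musl. Below is a minimal C sketch of the same prompt-then-read pattern. `prompt_char` is a hypothetical helper name, and the streams are passed in as parameters (an assumption made here so the function can be exercised without a terminal).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes a prompt without a trailing newline,
 * flushes the output stream, then reads a single character.
 * Returns the character read, or EOF if no input is available. */
int prompt_char(const char *prompt, FILE *out, FILE *in)
{
    assert(prompt != NULL && out != NULL && in != NULL);

    fputs(prompt, out);
    /* Without an explicit flush, a buffered output stream may hold the
     * prompt back until later, so the user would wait at a blank line. */
    fflush(out);
    return fgetc(in);
}
```

In a real program the call would be `prompt_char("press <ENTER> to start...", stdout, stdin)`. Flushing only the stream that carries the prompt is narrower than fflush(NULL), which flushes every open output stream.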

guberathome avatar Jun 03 '21 21:06 guberathome

It would be great if these numbers were compared to those of Clang, since that is the compiler currently suggested in the docs.

eriksvedang avatar Jun 04 '21 00:06 eriksvedang

I would repeat the file size tests using LTO, as suggested in https://github.com/carp-lang/Carp/issues/922

jacereda avatar Jun 04 '21 13:06 jacereda

@jacereda: thanks for the pointer, the numbers for Windows have been updated (edit: Linux was updated too). :-)

This is where things became messy:

  • clang
    • needs its own linker for lto: -fuse-ld=lld
    • but this parameter does not work on Ubuntu with llvm installed by 'apt install...'
    • on Windows requires the libC from the MS compiler thus increasing disk space
  • gcc: provides an easy option to 'strip' symbols, no idea how to do that with msvc/clang
  • msvc produces the smallest binary with -O1 (and complained that -o is deprecated)
  • zig seems totally unaffected by the additional compiler options

guberathome avatar Jun 05 '21 14:06 guberathome

@eriksvedang:

Edit: The test is done now. Here are my key takeaways:

  1. file size can be deceptive, depending on which functionality from shared libs is linked in. So it is not an indication for the (in)efficiency of a particular compiler.
  2. minimizing file size is possible if
    • you are giving the linker better data to work on. For that the compiler + linker must match, e.g. using Clang + Gnu ld will not produce the best results.
    • you optimize the compiler and linker flags for your C compiler on your platform.
    • you strip the resulting executable (strip command on Linux)
  3. Minimizing file size by linking to shared libraries has a price which is paid as soon as the running process loads all the libraries, effectively canceling the reduced executable size.
  4. So what counts is the size of the running process. The biggest effect on it seems to be linking statically which includes only the required functions from the standard library.
    • The only setup which supports this out of the box was Zig+Musl (the latter seems to be explicitly intended for static linking).
    • Anyone who could fetch the corresponding options for Clang/Gcc from the nightmare of documentation (or possibly from memory) is welcome to provide them as a comment, so they can be added to the documentation
  5. The entire discussion should be most relevant under severely limited conditions. So my preferred place to put this information would be Embedded.md. It would certainly be beneficial to put a hint about the "-s" option into it.

guberathome avatar Jun 06 '21 06:06 guberathome

I believe it would be beneficial to outline when Carp's strengths can be utilized and when other options might be preferred. Not quite sure where to put this yet.

Here's my take, please discuss:

Like every programming language, Carp has its strengths and weaknesses. Using Carp makes a lot of sense if and when:

  • you prefer runtime performance over code flexibility, so static/inferred typing is desirable.
  • you want/need a low memory footprint, so dynamic typing will appeal less.
  • you want the comfort of a garbage collector but also deterministic behaviour at runtime. Using and talking to the borrow checker will then pay off.
  • you want to use functionality from an external C library without having to actually program in C or to resort to advanced concepts (like Erlang's ports/port drivers).
  • you want to experiment and build programs up from existing libraries, in short you want a REPL.
  • you need macros, e.g. in order to alter how expressions are evaluated (unit tests being a prominent example).
  • your programs exist in a restricted environment, where huge VMs (Java, .NET) and their comprehensive libraries are not an option.
  • the time it takes to write the program is less than the time it takes to run it. For scripting you might prefer another option (like CLisp, Clojure, Python, ...).

Please note that I omitted the following points on purpose, which might be more appropriately discussed elsewhere (e.g. in CInterop.md):

  1. Like port drivers Carp favours speed over stability: it's perfectly possible to crash your Carp application with a buggy C lib.
  2. Unlike ChezScheme/Python, which can portably load shared libraries, Carp targets source compatibility thus avoiding some hairy issues with binary compatibility. This is in the best Unix tradition, however it will require sources for all external libraries.

Also I don't quite know how to phrase the following without scaring away potential Carp programmers:

  • you are happy to shape a rapidly developing language to your liking. If you really need a rock solid system (possibly battle tested by major projects), you might be happier elsewhere (Erlang/Elixir being prominent examples) until we've reached v1.0.

guberathome avatar Jun 22 '21 08:06 guberathome