elks
C compiler
Maybe I'm wrong (I just tried to install ELKS on an original IBM PC XT), but it seems to me that no C compiler is included in ELKS (I did find the BASIC interpreter, though), while I think a C compiler is one of the basic things of every Linux system.
If that's the case, I understand that a mammoth GCC might not really fit the project, but maybe the small and powerful tcc might be an option!
Hello @stevexyz,
Thanks for your interest in ELKS. A few years back, the project switched from using bcc to ia16-elf-gcc to build the kernel and all the applications. One of the big reasons was that bcc (and tcc) lack the linker and compiler options needed for the 8086 segmented architecture. Unfortunately, gcc is way too large to be included in the ELKS runtime, so there isn't a way for ELKS to be self-compiled, but this has been viewed as a reasonable tradeoff, all things considered.
I'm a fan of tcc, but the question is, given the limitations of segmented-mode 8086, what purpose would it serve, doing the work to get a self-hosted compiler running, when none available contain the features required to build the current kernel and all of the applications?
Thank you!
To me a C compiler is the base of a complete Linux (Unix) system... I don't know what the current priorities of the project are, but it seems to me this one should be relevant. And if tcc is not ready or, worse, not suitable, maybe there are other options: back in the day there were many compilers, and all ran on the basic x86 segmented architecture. I just wanted to add that it was amazing to see Linux booting on the XT; this addition would for sure keep the bar very high! :) For now, compliments to all developers!!
Hi @stevexyz - let me add my 2 cents to this discussion.
To me a C compiler is the base of a complete Linux (Unix) system...
Here's the thing: ELKS is not a complete Linux or Unix system. Like the name says, it's intended for embedded systems. Embedded systems have limited resources and are rarely if ever self-compiling. Of course our vintage PCs aren't embedded systems, but they have very limited resources. And - as @ghaerr also alluded to - it just doesn't make sense to have that ambition. Possibly fun, but not useful.
Think about it - what we have today is a cross development environment with gcc and tools that today's developers are familiar with and expect. And on a decent cross host you can grep the entire source tree for something in a couple of seconds. Try to do something similar on your XT - you'd probably have to come back the next day.
I have Venix running on one of my machines, a 286/12. It's a complete Unix system. It can compile itself if I had full sources, and I've done a lot of development on it. In the mid 80s and recently. It has make, cc, sccs and the most basic (early) PWB tools (basically a V7 system - anno ca. 1980). And there are no cross tools, so everything is local. I can assure you, it's not a development environment you'd want to live in. It's extremely slow - and I keep asking myself how I could possibly stay sane while working in this environment in the 80s. The truth is that expectations were different then. Having your own 'complete' Unix system just qualified the endless waits - and coffee breaks :-) .
And if tcc is not ready or, worse, not suitable, maybe there are other options: back in the day there were many compilers, and all ran on the basic x86 segmented architecture.
Yes, the compilers you're referring to RUN on the segmented architecture, but they support only parts of it - the small, maybe medium memory model, that's all. They have very limited options and support-tools (like cpp supporting only a few of your favorite #preprocessor commands, no objdump etc etc). Try porting any modern piece of open source code to such a system and you'll start pulling your hair out right away. I spent a lot of time trying to get rz/sz running on Venix a while back - it ended up being a BIG project because of the limitations in the compiler - not to mention in make. And because many of the source files were too big for vi.
I just wanted to add that it was amazing to see Linux booting on the XT; this addition would for sure keep the bar very high! :) For now, compliments to all developers!!
ELKS has come a long way, the last few years in particular. Your contributions would be very welcome. Even a native C compiler. It's your time and your choice - and you'll get plenty of support from the group regardless of whether the target tool/application is for the few or the many.
--M
Not for self-compiling, but for making small debug programs, it would be nice to have a small compiler on ELKS. I now use BASIC to peek memory or read ports on the real PC from the background, but sometimes I want to do something a little more complicated.
As already mentioned by @tyama501, it was not meant to be used for self-compiling, even if that would have been a nice thing. And especially for a start, if there is something ready to be used, it doesn't really matter if it supports just a limited memory model; at least you could compile and run some programs on the system without always needing access to another computer.
For now I have other (unfortunately too many) projects going on, and I'll watch from the sidelines how ELKS grows, but in the future, if this still has not been developed, maybe I'll give it a try!
In the meantime keep up the good work and happy hacking!
I thought more deeply about exactly what is entailed when someone says "I'd like a C compiler" to run native on ELKS. As @tyama501 and @stevexyz mentioned, it would be nice to be able to at least just compile some programs from within ELKS.
In order to do that, we'd need the following:
- Compilation of the ELKS C library into .o (object) and .a (archive) formats. The host-based .o and .a formats are incompatible, as they're oriented around the host ia16-elf-gcc toolchain. That means we'd need to compile all the C library on ELKS itself (not cross-compiled, unless the target ELKS compiler also runs on Linux and macOS), and that would have to be done on a VM running ELKS, in order to be kept current.
- We'll need make on the target, as well as any commands used in the C library Makefiles.
- We'll need easy ways to make (large) target disks that contain all the library sources, for compilation. We've pretty much got that, but need scripts to make it happen.
- The C library sources would need to be ported to the new C compiler, as will all .s/.S (assembly language) source files. Although most of the library is portable, any uses of far pointers, asm() directives, or incompatible gcc preprocessor directives need changing. Of course, any of these files could be left out, but that might inhibit desired C library routines.
- All the header files would need to be copied to the target (development) filesystem. I have not checked whether they'd all fit on a floppy, or whether the ELKS C compiler would require an HD. Most of our users now use ELKS on floppy, and we don't yet have an easy way to update hard drive images, but we're working on it.
- The C compiler chosen will have to be modified to output the special ELKS a.out file format (which is almost but not quite MINIX v1). This will likely take some work, after porting the C compiler, preprocessor, assembler, linker and front-end to ELKS. None of the tools can use any more memory than 64K to compile any programs. The ia16-elf-gcc compiler actually produces Linux ELF-format output, and uses a separate elf2elks executable to convert the ELF format to a.out format. We could use the same method, but that program has to be written from scratch, not just ported.
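To give a rough picture of what "output the ELKS a.out format" involves, here is a minimal sketch of writing a MINIX-v1-style header in front of the text and data segments. Since the ELKS format is described above as almost but not quite MINIX v1, everything in it is an assumption: the 32-byte layout and the magic, flag and CPU constants follow the classic MINIX header as I understand it, and the real values must be taken from the ELKS sources. `struct image_info` and `write_aout_header` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 32  /* assumed: 8 id bytes followed by six 32-bit fields */

/* Hypothetical description of a linked program image. */
struct image_info {
    uint32_t text_size;  /* bytes of code */
    uint32_t data_size;  /* bytes of initialized data */
    uint32_t bss_size;   /* bytes of zero-filled data */
    uint32_t entry;      /* entry point offset */
    uint32_t total;      /* total memory requested for data, heap and stack */
};

/* Store a 32-bit value in little-endian byte order, as the 8086 expects. */
static void put32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Fill `out` with a MINIX-style header for a small-model program. */
void write_aout_header(uint8_t out[HDR_LEN], const struct image_info *img)
{
    /* Small model: code and data must each fit in one 64K segment. */
    assert(img->text_size <= 0x10000UL);
    assert(img->data_size + img->bss_size <= 0x10000UL);

    memset(out, 0, HDR_LEN);
    out[0] = 0x01;        /* magic, first byte (assumed MINIX value) */
    out[1] = 0x03;        /* magic, second byte (assumed MINIX value) */
    out[2] = 0x20;        /* flags: separate code and data (assumed) */
    out[3] = 0x04;        /* CPU id: 8086 (assumed) */
    out[4] = HDR_LEN;     /* header length */
    /* out[5] unused, out[6..7] version: left as zero in this sketch */
    put32le(out + 8,  img->text_size);
    put32le(out + 12, img->data_size);
    put32le(out + 16, img->bss_size);
    put32le(out + 20, img->entry);
    put32le(out + 24, img->total);
    /* out[28..31] symbol table size: zero, no symbols emitted */
}
```

A converter in the style of elf2elks would compute these sizes from the ELF sections and then copy the text and data bytes after the header; that part is omitted here.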
After all this, there are all the issues that @Mellvik brings up, which include problems associated with having no objdump, a different ASM language, possibly problems with vi or other editors not being able to edit a file in limited memory, and slow compilation times.
All in all - I have to agree with @Mellvik that such a project is not really what people have in mind when they think of "having a C compiler" for ELKS.
On another note, I was thinking about some C interpreters that might be able to provide fast execution of simple C programs, such as the C in 4 functions compiler. It is very cool with a small code size, and allows for calling out of various functions like printf into the host-based C library. However, it is definitely not clear whether such a thing would run on ELKS. For instance, C4 uses 5 allocations of 256K each (hugely over our 64K data limit) by default. I might take a pass at allocating much smaller sizes for the source and produced symbol tables, object files and data, but I imagine this will likely greatly limit what C files can be compiled, and we still won't have the normal ELKS C library (nor multiple file compilation nor any linking).
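To put numbers on the C4 memory problem, here is a small sketch of the arithmetic. The "5 allocations of 256K" and the 64K data limit are the figures quoted above; the idea of reserving part of the segment for stack and other data, and the helper names `c4_default_total` and `pool_size_for_budget`, are assumptions made for illustration only.

```c
#include <assert.h>

/* Figures quoted in the thread: C4 makes 5 allocations of 256K each. */
#define C4_POOL_COUNT 5
#define C4_POOL_BYTES (256UL * 1024UL)

/* The ELKS data segment limit mentioned in the thread. */
#define DATA_SEGMENT_BYTES (64UL * 1024UL)

/* Total memory the default pools would request. */
unsigned long c4_default_total(void)
{
    return C4_POOL_COUNT * C4_POOL_BYTES;
}

/*
 * Largest equal pool size (in bytes) such that `npools` pools plus
 * `reserved` bytes for stack and other data fit inside `budget` bytes.
 * Returns 0 if the reservation already uses up the whole budget.
 */
unsigned long pool_size_for_budget(unsigned long budget,
                                   unsigned long reserved,
                                   unsigned int npools)
{
    assert(npools > 0);
    if (reserved >= budget)
        return 0;
    return (budget - reserved) / npools;
}
```

With a hypothetical 16K reserve, `pool_size_for_budget(DATA_SEGMENT_BYTES, 16384UL, C4_POOL_COUNT)` leaves a little under 10K per pool, compared with the 1280K total that `c4_default_total()` reports for the defaults. That gap is why shrinking the pools would sharply limit which source files can be handled.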
Thank you!
Hello @stevexyz, @ghaerr, @tyama501,
I suspect that the Amsterdam Compiler Kit might be a good candidate for an ELKS-hosted C compiler, though I have not really got around to working on such a thing, and it probably needs a fair amount of effort. I believe ACK used to be the standard toolchain for Minix — including Minix/8086 — and besides, it is written to be able to run on small systems.
Thank you!
It seems to me that https://github.com/alexfru/SmallerC would be a very good start: it seems simple enough, it already produces 16-bit x86 code in various memory models, and the ported compiler could compile itself once it produces the ELKS binary format. Maybe the author himself would adapt it if asked and given the specification of the binary file format; if it is considered good, we can try to ask.
PS: @ghaerr I had a look at c in 4 functions, and while being an amazing exercise of minimization, seems really not easy to port to minimal memory systems for the way it has been designed
just as comment, I tried to play with old "ACK for Minix" from https://web.archive.org/web/20070910201015/http://www.laurasia.com.au/ack/index.html#download on Minix i86 (not i386) qemu VM.
Well, it ran out of memory :) trying to compile itself under existing 'cc' compiler there.
Also, Portable C Compiler website seems to be down (and web archive does not have latest copy) so here I found slightly updated (2021) copy of code
https://github.com/matijaskala/pcc but originally it was at https://github.com/IanHarvey/pcc
There seems to be some code related to i86 code generation by Alan Cox.
Also, someone (Eric J. Korpela) looked at lcc-8086 but did not get very far https://setiathome.berkeley.edu/~korpela/
Hello @Randrianasulu and @tkchia,
Thank you @Randrianasulu for the links to PCC, I'll take a look at it. Same for LCC-8086, that work looks extremely old but could be worthwhile. Of course, it would probably be a good idea to consider only using ANSI-capable (vs K&R) compilers, given where most C code is at today.
I tried to play with old "ACK for Minix" from https://web.archive.org/web/20070910201015/http://www.laurasia.com.au/ack/index.html#download on Minix i86 (not i386) qemu VM.
Well, it ran out of memory :) trying to compile itself under existing 'cc' compiler there.
It's probably not necessary for the compiler to be able to compile itself under ELKS (or MINIX), so that's OK. I am not sure to what degree ACK has been updated to any ANSI standards, and/or long, long long and float support etc.
In the case of running on ELKS, we now have the issue that some portions of the C library may be using some ia16-elf-gcc-specific features. This would have to be looked into.
@tkchia, you had mentioned you're possibly somewhat familiar with ACK, would that be a version similar to that used for MINIX as described above, or has there been more work done updating it, to your knowledge?
Thank you!
@ghaerr I found a little something that is supposed to help with translating ANSI C back to the older dialect: https://github.com/udo-munk/unproto
Also, maybe the Xenix (286) a.out variant can be used to get some idea of how multiple segments were supported.
https://ibcs-us.sourceforge.io/ look for sources - x.out
Hello @ghaerr,
@tkchia, you had mentioned you're possibly somewhat familiar with ACK, would that be a version similar to that used for MINIX as described above, or has there been more work done updating it, to your knowledge?
I have not yet done a comparison of the "laurasia" copy of ACK, and David Given's current ACK tree — I hope to do that soon.
At the moment I am more familiar with Mr. Given's source tree (since I have been working on it a bit). Some impressions:
- It is an extremely bog-standard C89 compiler.
  - By that I mean, it has close to no syntactical extensions whatsoever. Inline assembly is simply not a thing. And, there is a #pragma directive, but there are no pragmas (yet).
  - Some machine targets support long long, but not all. In particular the 8086 back-end does not have long long support yet.
  - There are also vestiges of a K&R compiler, but it is most likely dead code.
- The runtime libraries do implement some POSIX functionality in addition to the C89 stuff though. Plus of course we can add our own extension libraries.
- In terms of 8086 support:
  - The compiler front-end can currently produce 8086 PC booter programs (-mpc86) and MS-DOS .com files (-mmsdos86). Unfortunately for some reason it no longer supports MINIX a.out output - I guess MINIX support fell by the wayside while the code was updated (?).
  - There is an out-of-line assembler which mostly works. It can use some improvement (https://github.com/davidgiven/ack/issues/271).
  - Currently 8086 floating-point support assumes that there is an 8087. (I believe that for some other target platforms, e.g. CP/M for 8080, it can actually do software floating point.)
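As an illustration of what the missing long long support means in practice, code that needs 64-bit arithmetic on such a back-end would have to carry between two 32-bit halves by hand. The sketch below is plain C89-style code and is not taken from ACK or ELKS; `struct u64` and `u64_add` are hypothetical names.

```c
#include <assert.h>

#define MASK32 0xFFFFFFFFUL

/* A 64-bit unsigned value kept as two 32-bit halves. */
struct u64 {
    unsigned long hi;  /* upper 32 bits */
    unsigned long lo;  /* lower 32 bits */
};

/* Add two 64-bit values, propagating the carry out of the low half. */
struct u64 u64_add(struct u64 a, struct u64 b)
{
    struct u64 r;
    unsigned long carry;

    /* Each half must hold at most 32 bits, even if long is wider. */
    assert(a.lo <= MASK32 && b.lo <= MASK32);
    assert(a.hi <= MASK32 && b.hi <= MASK32);

    r.lo = (a.lo + b.lo) & MASK32;
    /* The low half wrapped around exactly when the result is smaller. */
    carry = (r.lo < a.lo) ? 1UL : 0UL;
    r.hi = (a.hi + b.hi + carry) & MASK32;
    return r;
}
```

The masking keeps the result correct both where unsigned long is 32 bits wide (as on a 16-bit 8086 target) and on hosts where it is 64 bits.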
Thank you!
slightly newer ackpack for minix (1.1.2) https://web.archive.org/web/20060405092432/http://packages.minix3.org/software/ackpack.tar.bz2
weirdly it comes as tar.tar. I only get file by downloading it via browser, not via wget. Same source should still be in Minix3 git, but a bit obscured because it was deleted years ago ...
info from https://sourceforge.net/p/tack/mailman/message/36608143/
so, there was another compiler (c86 ?) but its license prohibits commercial use.
https://github.com/plusk01/8086-toolchain/tree/master/compiler
https://web.archive.org/web/20150908032106/http://homepage.ntlworld.com/itimpi/compsrc.htm - so it was named c68, too ...
ah, it was not complete compiler, just c to asm (nasm in this version) compiler. It needed cc (main driver), ld86 (linker), c preprocessor (it seems for Psion 3 they tried Decus cpp, available in X11R3 distribution - not tried to build it yet). So, some sources are newer in C68 (for QDOS - mk68k / Sinclair QL system) but part of older EPOC sources still live at older site:
http://web.archive.org/web/20010414060410/http://www.itimpi.freeserve.co.uk/cpocdown.htm#SOURCE Only ld86 and cc survived ... and even then, this ld86 outputs by default Psion's IMG format. Still, something!
https://web.archive.org/web/20150908032106/http://homepage.ntlworld.com/itimpi/compsrc.htm - so it was named c68, too ...
It seems to already have a lot of options, among them these 8086-specific ones:
...
INTEL 8086 OPTIONS
The options listed in this section apply when generating
16-bit code for use on Intel processors. They will only
be available if support for the Intel 8086 processor type
was specified at the time the compiler was built.
-fpu=yes|no
Specify whether operations involving floating
point variables should generate in-line calls to
the a hardware floating point unit, or whether
calls are made instead to library support
routines. Using library support routines allows
floating point operations to be carried out
purely in software.
Default: -fpu=yes
-pointer=16|32
Specifies that the code should be generated to
conform to the small memory model (64K data +
64K code segments) which uses 16 bit pointers or
the large model which uses 32 bit pointers.
Default: -pointer=16
If support for multiple processors and/or assemblers was
configured when the compiler was built, then you can
specify the target to be a 8086 processor and a specific
assembler can be specified using the following options:
-bas86 Generate 8086 code. Use the syntax for Bruce
Evan's 16 bit 8086 assembler for the output.
-gas86 Generate 8086 code. Use the GNU assembler
syntax for the output.
-masm86 Generate 8086 code. Use the Microsoft MASM
assembler syntax for the output
-sysv86 Generate 8086 code. Use the Unix SVR4 assembler
syntax for the output.
...
so, I have something horribly broken, but it makes .o files!
https://github.com/Randrianasulu/c86
make on linux/termux should give you some binaries.
@Randrianasulu : it is almost certainly still not GPL-compatible though.
https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=6f433c5ba2f3dacbf1b3f2859197505dc440b4d9
seems to be very detailed document about c68 by author (I tried to send email to him, but no idea if old email address still works)
https://qlforum.co.uk/viewtopic.php?t=2112 - may be he has new email, forwarded to it too
@Randrianasulu : it is almost certainly still not GPL-compatible though.
just at the beginning of the C68 QL manual it says that it is Public Domain (even with capitals):
INTRODUCTION The C68 Compilation System provides a Public Domain C compiler for use under the QDOS operating system.
Hello @stevexyz,
I mentioned this because @Randrianasulu stated that the source files themselves seem to prohibit commercial use. And I see that system.c states both "All commercial rights reserved" and "The C68 compiler forms part of a complete Public Domain development system for all these [QDOS] environments" (in the same file). So the licensing status of the C68 source code seems to me to be murky at best.
Thank you!
little aside (feel free to hide) but MAME got Psion 3 emulation in a more working state lately
https://forums.bannister.org/ubbthreads.php?ubb=showflat&Number=121869&page=3 (not a very big fan of a 2 GB gcc process during compilation, but I can probably let it happen once a year or so)
also faucc (286 & 386 codegen only?)
https://gitlab.cs.fau.de/faumachine/faucc/-/commits/master (not really changed since 2012 ..?)
EDIT: sadly it does not compile faumachine's new bios :( also, no FP. so, not very useful?
ah, it was not complete compiler, just c to asm (nasm in this version) compiler.
Just a sidenote, but I think NASM does support ELKS a.out format. LCC port might use NASM too.
Anyways, last time I looked into ELKS binary format it had some limits on DATA and CODE segment sizes similar to Minix 1. Is that still the same? What are the limits?
Is it now possible to make something like large memory model executables in MS-DOS with GCC-iA16?
@bocke:
I think NASM does support ELKS a.out format. LCC port might use NASM too.
I'm not sure about whether NASM supports a 16-bit MINIX a.out format or not. Does NASM support ELF output? If so, the binary could likely be converted to ELKS a.out format using our own elf2elks tool, which is also used to post-process ia16-elf-gcc ELF output for ELKS.
The ELKS toolchain and kernel currently offer the ability to create and run small (64K code, 64K data) and medium (128K code, 64K data) model programs. Access to a larger data segment is possible through C __far pointers and using the fmemalloc system call to allocate main system memory, but that's all got to be done manually by the programmer. Thus, large model (> 64K data) isn't explicitly supported, but can be done with some effort.
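For readers less familiar with why data beyond 64K needs far pointers: a near pointer is only a 16-bit offset into the current data segment, while a far pointer carries a segment as well, and the 8086 forms the physical address as segment * 16 + offset. The sketch below models that arithmetic in portable C so it can be tried on any host. It does not use the real __far keyword or fmemalloc, and `struct far_addr`, `far_to_linear` and `far_add` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* A real-mode 8086 far address: 16-bit segment plus 16-bit offset. */
struct far_addr {
    uint16_t seg;
    uint16_t off;
};

/* Physical address = segment * 16 + offset, wrapping at 1 MB (20 bits). */
uint32_t far_to_linear(struct far_addr a)
{
    return (((uint32_t)a.seg << 4) + a.off) & 0xFFFFFUL;
}

/*
 * Advance a far address by n bytes and return it in normalized form
 * (offset 0..15). A plain 16-bit offset would wrap at 64K instead;
 * moving the overflow into the segment is the extra bookkeeping that
 * far pointers require.
 */
struct far_addr far_add(struct far_addr a, uint32_t n)
{
    struct far_addr r;
    uint32_t lin;

    assert(n <= 0xFFFFFUL);  /* stay within the 1 MB address space */
    lin = (far_to_linear(a) + n) & 0xFFFFFUL;
    r.seg = (uint16_t)(lin >> 4);
    r.off = (uint16_t)(lin & 0xF);
    return r;
}
```

For example, stepping one byte past 1000:FFFF with `far_add` yields 2000:0000, a location that no offset within segment 1000 can reach.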
I took a peek at NASM. It doesn't support Minix/ELKS directly. It supports as86 obj files though. They could be then linked with ld86 into elks binary, I think. My memory is a bit murky on this.
Thanx for the info on memory model support ghaerr. Much appreciated.
One option for a compiler that will definitely fit into 64K/64K would be the one I've been working on for Fuzix. It's mostly ANSI and it'll run in 48K (total). I've not tackled x86 yet, but it's designed so that it's very easy to get working, if not pretty, code out of it, and then to optimize the backend once you have it working. Right now it's doing Z80, 8080, 8085 solidly, passable Z8 and Super8, close to doing 65C816 and 6809 usefully, and with other targets being worked on.
In 48K it can build startrek.c as one file on an 8080 so on a 64/64K system it should be pretty much unlimited what you can build natively. It can also build itself, though not optimized yet (copt doesn't fit in 48K and I need to write a replacement). copt should fit in 64K/64K.
https://github.com/EtchedPixels/Fuzix-Compiler-Kit