PowerPC architecture

Open neosaldina opened this issue 5 years ago • 24 comments

First of all, congratulations on the wonderful work! I've been looking at your compilers, both OrangeC and cc386, and I wondered how complex it would be to implement support for ppc64 and ppc64el?

neosaldina avatar Dec 16 '19 01:12 neosaldina

It would be highly complex, as the entire infrastructure is currently built for x86. If you're willing to put in the work to help create the thing in the first place and help genericize the compiler in the split branch, I'd (personally) be more than happy to help out with that. But adding support for an entirely new arch like that will not be easy and will require quite a bit of work.

chuggafan avatar Dec 16 '19 04:12 chuggafan

If I see it correctly this would mean:

  • add option for cross-compilation (which is useful in any case, the default would be the "native" codegen)
  • allow the last codegen phase to compile to ppc64 / ppc64el
  • translate all runtime libraries for ppc64 / ppc64el
  • test

Correct? Is there any reasonable paper which details the necessary adjustments for x86->ppc64? Would it be reasonable to add x64 beforehand?

GitMensch avatar Dec 16 '19 08:12 GitMensch

@neosaldina thanks for the question. Technically speaking a port isn't too hard other than the runtime library; I'm going to elaborate on that statement quite a bit in my answers to the others.

@gitmensch pretty much so, although the cross compilation option now exists in the 'split' branch (we needed it to tell the difference between MSIL/x86).

@chuggafan actually occ is mostly friendly to porting these days, at least at the architecture level. Even more so than CC386, I think... The porting works in three parts: writing the structures which define the data formats and register layout (e.g. is int 32 bit or 64, and which registers exist and/or depend on other registers); then translating the intermediate code statements into assembly language for whichever architecture; then writing a peephole optimizer to further clean up the generated code.
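
To make the first of those three parts concrete, here is a minimal sketch of what a table of data formats and register counts could look like. The struct, its field names, and `find_target` are hypothetical illustrations and are not taken from the OrangeC sources; the sizes assume the usual 32-bit x86 and 64-bit Linux PowerPC data models.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical target description: none of these names come from OrangeC. */
typedef struct {
    const char *name;   /* target name as it might be given on a command line */
    int int_bits;       /* width of 'int' */
    int long_bits;      /* width of 'long' */
    int pointer_bits;   /* width of a data pointer */
    int big_endian;     /* 1 if the most significant byte is stored first */
    int gpr_count;      /* number of general purpose registers */
} TargetDesc;

/* Assumed values: 32-bit x86, and the LP64 model used by 64-bit PowerPC Linux. */
static const TargetDesc targets[] = {
    { "x86",     32, 32, 32, 0, 8  },
    { "ppc64",   32, 64, 64, 1, 32 },
    { "ppc64le", 32, 64, 64, 0, 32 },
};

/* Look a target up by name; returns NULL when the name is unknown. */
const TargetDesc *find_target(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof targets / sizeof targets[0]; i++) {
        if (strcmp(targets[i].name, name) == 0)
            return &targets[i];
    }
    return NULL;
}
```

A front end could call something like `find_target("ppc64")` once and consult the result wherever it currently assumes x86 sizes.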

If I were to be serious about it I would probably take some of the basic helper functions out of the 386 backend and make a library lol!

As well, the entire package is already designed with cross-compiling in mind; e.g. with recent changes to the compiler design you can write an assembler, then use the output of that work as the part of the compiler that generates object code.

As well, olink is designed with cross-linking in mind (there is magic in the linker configuration files that hides this fact on Windows, wink wink). For cases where a target needs more than a simple flashed binary (e.g. to work with an OS loader), the DL*.exe architecture allows a method to translate the linker output, for example into ELF format. The main update to the tools aside from the compiler/assembler would be to update the linker to handle 64-bit address constants, which is something I kinda hedged on in the original design.
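
As a rough illustration of what an ELF translation step would have to emit first, here is a sketch that fills in the 16 identification bytes at the start of a 64-bit ELF file. The function name `fill_elf_ident` is made up for this example and says nothing about how the DL*.exe tools are actually structured; the constants use the names and values from the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Indices and values from the ELF specification. */
enum { EI_NIDENT = 16, EI_CLASS = 4, EI_DATA = 5, EI_VERSION = 6 };
enum { ELFCLASS64 = 2, ELFDATA2LSB = 1, ELFDATA2MSB = 2, EV_CURRENT = 1 };

/* Hypothetical helper: write the e_ident bytes for a 64-bit ELF object.
   big_endian selects between a ppc64-style and a ppc64le-style file. */
void fill_elf_ident(unsigned char ident[EI_NIDENT], int big_endian)
{
    assert(ident != NULL);
    memset(ident, 0, EI_NIDENT);      /* OS/ABI and padding bytes stay zero */
    ident[0] = 0x7f;                  /* ELF magic: 0x7f 'E' 'L' 'F' */
    ident[1] = 'E';
    ident[2] = 'L';
    ident[3] = 'F';
    ident[EI_CLASS] = ELFCLASS64;     /* 64-bit objects */
    ident[EI_DATA] = big_endian ? ELFDATA2MSB : ELFDATA2LSB;
    ident[EI_VERSION] = EV_CURRENT;
}
```

The rest of the header (machine type, entry point, section tables) would follow these bytes and is where 64-bit address constants start to matter.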

All that said I love porting between architectures, it is probably one of my favorite parts of this. Which is why MSIL became so interesting...

Back in the day there was a 68K backend of CC386 which was received fairly well at the time...

Reasons I haven't gone down that path with orange C -

  1. lots to keep busy on x86
  2. The number of people who want ports has historically been low
  3. testing is a little harder
  4. I have it in the back of my mind that what I really want is to make the ADL files specify the compiler backend as well...
  5. I started down the road of doing #4 with the ARM architecture a few years ago but immediately got sidetracked into MSIL and then came back to finish C++!
  6. the number one reason is porting the runtime isn't so easy, sigh. It would be about time to start rewriting all the assembly language code into C I guess...

@gitmensch back to your final question - there is no documentation on this at this point. But I would write some if someone were interested in working on this...

LADSoft avatar Dec 16 '19 17:12 LADSoft

the number one reason is porting the runtime isn't so easy, sigh. It would be about time to start rewriting all the assembly language code into C I guess...

Probably should be done either way, as well as removing the old K&R C code that, while technically standards compliant, is deprecated in its syntax. I understand that it's a lot of files, but cleanup here is important. During this we can also rip out all of the extraneous, non-compliant files and set up clang-format for the libs as well, because apparently it just isn't there. While we're at it we can update our support for Libcxx in terms of removing files that we shouldn't be supporting and properly updating the ones we do have for C++14 libcxx support.

In all, the RTL looks and acts like an absolute mess. Probably deserves its own milestone just like Libcxx 8 (and now 9) support deserves its own milestone.

chuggafan avatar Dec 16 '19 17:12 chuggafan

Interesting ... I am now only working with IBM Power servers, ppc64 and ppc64el. It's an incredible architecture and really "Power"!!

The architecture is growing a lot these days, especially after the release of IBM Power9 with AI support, which has intricacy in CPU instructions for AI... Another point I see for OrangeC would be to port the linker to generate ELF and shared object files, and also to compile the compiler and other tools for Linux and AIX!

The Power architecture, because it is RISC, has many different things, including stack alignment, which is negative, unlike intel architecture, which is positive.

@LADSoft I don't remember exactly, but I think you even programmed in assembler on mainframes, correct? So it's not much different ...

I'll show you some of the differences ... between ppc64

Let's build these 3 simple functions:

int t1() {}

int t2(int a, int b, int c) {}

int t3(int a, int b, int c, int d, int e, int f, int g, int h, int i) {}

The generated code for each function is:

t1:
        std 31,-8(1)
        stdu 1,-48(1)
        mr 31,1
        nop
        mr 3,9
        addi 1,31,48
        ld 31,-8(1)
        blr
        .long 0
        .byte 0,0,0,0,128,1,0,1
t2:
        std 31,-8(1)
        stdu 1,-64(1)
        mr 31,1
        mr 8,3
        mr 10,4
        mr 9,5
        stw 8,32(31)
        stw 10,36(31)
        stw 9,40(31)
        nop
        mr 3,9
        addi 1,31,64
        ld 31,-8(1)
        blr
        .long 0
        .byte 0,0,0,0,128,1,0,1
t3:
        std 31,-8(1)
        stdu 1,-48(1)
        mr 31,1
        mr 11,3
        mr 3,4
        mr 4,5
        mr 5,6
        mr 6,7
        mr 7,8
        mr 8,9
        mr 9,10
        stw 11,80(31)
        stw 3,88(31)
        stw 4,96(31)
        stw 5,104(31)
        stw 6,112(31)
        stw 7,120(31)
        stw 8,128(31)
        stw 9,136(31)
        nop
        mr 3,9
        addi 1,31,48
        ld 31,-8(1)
        blr
        .long 0
        .byte 0,0,0,0,128,1,0,1

Looking at the top of each function, we see this:

stdu 1,-48(1)
stdu 1,-64(1)
stdu 1,-48(1)

This instruction sets up the stack frame for each function, keeping it aligned: it stores the old stack pointer and moves r1 down by the frame size ... If more variables are declared within the function, that offset changes ... Since something similar already exists for generating the .NET assemblies, this should not be much of a problem: just calculate the offset from the size of each variable ...
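
The offset calculation could be sketched roughly as below. The three constants are assumptions inferred from the listing above (a 32-byte fixed header, the 8-byte r31 save slot written by `std 31,-8(1)`, and 16-byte alignment); they are not quoted from an ABI document, and `frame_size` is a made-up name.

```c
#include <assert.h>

/* Assumed values, inferred from the generated code shown above. */
enum {
    FRAME_HEADER_BYTES = 32,  /* fixed area at the bottom of every frame */
    SAVED_R31_BYTES    = 8,   /* slot used by "std 31,-8(1)" */
    STACK_ALIGN        = 16   /* frame sizes are rounded up to this */
};

/* Return the frame size, i.e. the N in "stdu 1,-N(1)", for a function
   that needs local_bytes of space for its own variables. */
long frame_size(long local_bytes)
{
    assert(local_bytes >= 0);
    long raw = FRAME_HEADER_BYTES + local_bytes + SAVED_R31_BYTES;
    /* Round up to the next multiple of STACK_ALIGN. */
    return (raw + STACK_ALIGN - 1) / STACK_ALIGN * STACK_ALIGN;
}
```

Under these assumptions `frame_size(0)` gives 48, matching t1, and `frame_size(12)` gives 64, matching t2 with its three 4-byte stores. t3 also gets 48 because its arguments are stored at offsets 80 and up, which lie beyond its own 48-byte frame.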

Anyway, removing as much of the runtime assembly code as possible is a big help, as porting to other architectures such as ARM, Sparc, or s390x, ppc, ppc64, or ppc64el is much simpler.

I remember when I first gave David the idea of implementing a backend to generate MSIL code, we discussed it ... and started implementing the entire C runtime in C#. But again, I apologize to David for not being able to give the support I wanted to give to the project due to lack of time.

Currently, I could help David and anyone else interested in running the PowerPC backend project by providing access to my IBM Power servers for both AIX and Linux ( maybe, OS/400 too... ). And in the case of s390x architecture, I can IPL the mainframe that I have here at home, install Linux on it and leave it for testing ...

bencz avatar Dec 16 '19 22:12 bencz

And, for the ppc64el, it's possible to use this architecture for free on Minicloud:

https://minicloud.parqtec.unicamp.br/

Just register, wait for approval and use! These servers are maintained by the University of Campinas, which is about 100km from the city I live.

bencz avatar Dec 16 '19 22:12 bencz

Also, there is something VERY important to consider: the endianness of each CPU.

  • ppc and ppc64 are big endian
  • ppc64el is little endian

So to build and run OrangeC on a ppc or ppc64 Linux, it may be necessary to make some corrections to the compiler source to handle the endianness correctly.
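
One common way to keep tool sources independent of host endianness is to read and write multi-byte values one byte at a time instead of casting pointers. The sketch below shows that idea; the function names are invented for this example and do not refer to anything in the OrangeC code base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Report whether the machine running this code is big endian. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    /* On a big-endian host the first byte in memory is the high byte. */
    return *(const unsigned char *)&probe == 0x01;
}

/* Read a 32-bit big-endian value; gives the same result on any host. */
uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value in big-endian byte order; host independent. */
void write_be32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Code written this way does not need to call `host_is_big_endian` at all; it is included only to show how the two byte orders differ in memory.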

bencz avatar Dec 17 '19 00:12 bencz

@chuggafan, yeah the RTL is a complete mess, I agree. The basic parts of it were written when I was much younger and are probably now the oldest part of the package. At that time I was just starting on CC386 and had a very small code base as well as not much industry experience with software... and it kinda shows...

Also if LIBC++ is going to version 9 maybe we should skip version 8 in whichever milestone that got slated for and go directly to 9?

@bencz well maybe PPC would be good at some point but there are already a bunch of milestones and I don't want to add more just yet... although I might consider bumping the one about the IDE for something like this lol! Still, thanks for the offer of letting us 'borrow' a development platform for ppc.

Yeah I dealt with the big endian thing when doing the 68K years ago. As I recall the only practical change to the compiler frontend for it had to do with properly calculating the offsets for parameters (because, for example, a 'char' gets pushed as an 'int', but reversing the endianness effectively forces you to add three to the place where the 'char' is stored, on a 32 bit architecture). That is, if storing it as a 'char'... Now with the current intermediate code such things could probably be done in the back end on an as-needed basis. But even so, little things like that are easy to accomplish in a portable way in the front end, much easier than all the 'patches' I had to make to the front end for msil lol!
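
The "add three" adjustment can be written down as a small helper. This is only a sketch of the arithmetic described above, with a hypothetical function name; it is not the code from the 68K backend.

```c
#include <assert.h>

/* Return the address offset at which a value of value_bytes can be read
   when it was pushed into a wider slot of slot_bytes starting at
   slot_offset. On a big-endian target the low-order bytes sit at the
   end of the slot, so the offset has to be bumped. */
long param_offset(long slot_offset, int slot_bytes, int value_bytes,
                  int big_endian)
{
    assert(value_bytes > 0 && value_bytes <= slot_bytes);
    if (big_endian)
        return slot_offset + (slot_bytes - value_bytes);
    return slot_offset;
}
```

For a 'char' pushed as a 4-byte 'int' at offset 8, `param_offset(8, 4, 1, 1)` gives 11 on a big-endian target, while the little-endian result stays at 8.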

Hm that brings up an interesting question though... did I write olink to be endian-friendly? I believe the object file format has an endianness specifier (I think?) but I can't remember that detail. Oh well, maybe something else to do...

LADSoft avatar Dec 17 '19 02:12 LADSoft

@chuggafan @GitMensch @LADSoft and @bencz

Thanks for the feedback! As I understand it, Bencz came up with the idea of developing the backend for MSIL, thank you for the idea and David, thank you so much for developing this backend!

With regard to PowerPC, there are currently only 2 open source compilers that I know of that generate code for PowerPC, which are clang and gcc. As I understand it, it would be necessary to make some changes to the linker and the compiler itself, and especially to the C runtime, which is currently only compatible with Windows. That's a few months of work ahead!

Well, thank you all for the explanations. If the compiler ever supports the Power architecture, I will be willing to compile my projects on my servers for testing!

neosaldina avatar Dec 20 '19 08:12 neosaldina

not sure how this got dragged into the current milestone, wasn't my intention to do it immediately... I'm taking it back out...

LADSoft avatar Apr 26 '20 18:04 LADSoft

I'm starting to think about this as part of milestone 6... here is an embedded board I found that might help me with testing:

https://www.embeddedplanet.com/products_list/rpx-lite-computer-board-freescale-powerquicc-for-mpc823e-or-mpc850/

wondering if anyone else has any suggestions? I think it might be good to get something with LINUX on it to make the backend generic...

LADSoft avatar May 08 '20 12:05 LADSoft

@LADSoft This is great news!!!

I think it would be better to buy a PowerMac G5 or G4... and install the Debian port for ppc64: https://cdimage.debian.org/cdimage/ports/2020-04-19/

The price of these old computers in the USA is very low ...

Now, for ppc64le, I recommend you to use the MiniCloud: https://openpower.ic.unicamp.br/minicloud/

If you want to make some tests, I can power on my PowerMac G5, with Debian, and share the SSH connection with you

bencz avatar May 08 '20 13:05 bencz

A free alternative is to use QEMU and install Debian ppc64 in a VM. Here is the qemu command line that I use:

REM ----------- INSTALLATION -------------
qemu-system-ppc64.exe -boot d -hda debian_ppc64.qcow2 --cdrom debian-10.0-ppc64-NETINST-1.iso -m 2048 -cpu power7 -smp 4 -nic user,model=virtio,hostfwd=tcp::2222-:22

REM ----------- AFTER INSTALLATION -------------
qemu-system-ppc64.exe -hda debian_ppc64.qcow2 -m 2048 -cpu power7 -smp 4 -nic user,model=virtio,hostfwd=tcp::2222-:22

bencz avatar May 08 '20 20:05 bencz

@bencz

I looked at the G4/G5... seems as if the G4 is 32 bit so it isn't viable for this? But I liked the price point...

hm how good is qemu? I know the x86 VMs sometimes skip a lot of the access checks so you can run things on the VM that don't work in the real world... probably only an issue for raw OS like DOS though.

LADSoft avatar May 09 '20 03:05 LADSoft

QEMU apparently is pretty good, I know it's currently used for compiling mario64 (because IRIX isn't a thing) in the romhacking community.

I wouldn't consider it awesome for cycle-accuracy however, all I know is that it's "decent", not "perfect".

chuggafan avatar May 09 '20 11:05 chuggafan

@LADSoft I had forgotten that the G4 is 32 bit ... :/ The biggest problem with the Mac G5 is that it drains energy!!! ahahhah :/

I never had any problems with QEMU, in relation to that ... the only problem I had was in relation to opcodes that exist in newer CPUs but do not exist in older CPUs ...

bencz avatar May 09 '20 12:05 bencz

Just want to throw this out there: apparently IBM is allowing open-source usage for their ppc64le arch, so CI builds, if this ever gets done, will be theoretically possible without buying a Z mainframe or finding a PPC chip that meets real-world expectations. So that's one problem that could have blocked this knocked down, in theory.

chuggafan avatar Nov 12 '20 20:11 chuggafan

Other than Travis-CI via GitHub you can also directly access Power, for details see https://power-developer.mybluemix.net/#hardware

GitMensch avatar Nov 12 '20 22:11 GitMensch

So I have most of the pieces for this now; it's just a matter of getting to it.

One thing I'm puzzled about on the technical side is that millicode is used for division/some multiplication... I don't know anything about millicode. Does it come originally on the chip? Or is it something maintained by the OS? And how do you get access to it?

LADSoft avatar Nov 13 '20 18:11 LADSoft

You don't need to worry about the millicode... there are instructions for this. Look:

https://godbolt.org/z/do373K

bencz avatar Nov 13 '20 18:11 bencz

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/assembler/idalangref_mul_multiply_instrs.html

There are some multiplication instructions...
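
For reference, a pair of plain C functions like the ones below is an easy way to check this with a compiler explorer. As far as I know, compilers targeting 64-bit PowerPC typically turn them into single `mulld` and `divd` instructions rather than calls to helper routines, but that is stated from general knowledge of the instruction set and not from the pages linked above. The function names are just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit multiply: expected to map to one hardware multiply instruction
   on a 64-bit PowerPC target (an assumption, see the note above). */
int64_t mul64(int64_t a, int64_t b)
{
    return a * b;
}

/* 64-bit signed divide; the caller must not pass a zero divisor. */
int64_t div64(int64_t a, int64_t b)
{
    assert(b != 0);
    return a / b;
}
```

Compiling these for a ppc64 target and reading the assembly output shows directly whether any helper call is involved.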

bencz avatar Nov 13 '20 18:11 bencz

I'm good with that personally, but the reason I asked is I was reading an AIX assembler manual for PPC and it stated that use of the MUL and DIV instructions was much less efficient than using the millicode, and you should use the millicode?

LADSoft avatar Nov 13 '20 19:11 LADSoft

Which manual are you reading? As far as I know (and I asked a friend too), it's not possible to generate the millicode... so you need to use the normal opcodes... mul, mulw ...

bencz avatar Nov 13 '20 19:11 bencz

it was an AIX manual... I can dig it up if you want. Maybe the comment was specific to AIX! It did say something about the code being accessed through the kernel.

LADSoft avatar Nov 13 '20 19:11 LADSoft