gcc-ia16

General discussion regarding this platform?

Open ladmanj opened this issue 4 years ago • 39 comments

Hello

I know this isn't a bug report, and I don't want to be annoying. Is there any other way to contact the community involved in ia16 asm and gcc topic?

Thanks, everyone 😊

ladmanj avatar Nov 23 '20 11:11 ladmanj

Hello @ladmanj

Looks like we are both hacking a non-PC IA16 device 😉

mfld-fr avatar Nov 23 '20 11:11 mfld-fr

Hello @ladmanj,

Is there any other way to contact the community involved in ia16 asm and gcc topic?

I am not sure, but perhaps VOGONS might be a good place? It seems to be mostly geared towards IBM PC-compatibles (with MS-DOS or old Windows versions), and old Macs --- but you might have some luck there. I have not yet joined myself, but I may in the future.

Generic, platform-independent issues about GCC can probably use one of the official GCC mailing lists, e.g. gcc-help. Issues specific to gcc-ia16 can come here, for lack of a better place. 😑

There is of course also Stack Overflow, but you probably already know that.


Hello @mfld-fr,

Looks like we are both hacking a non-PC IA16 device 😉

This to me brings up a question: do you think there is utility in having gcc-ia16 directly support writing programs to be burnt onto EPROMs? To me this seems to be an extremely niche area (even more so than writing PC booter programs), but maybe I am missing something.

Thank you!

tkchia avatar Nov 23 '20 13:11 tkchia

Hello @ladmanj,

By the way, @mfld-fr wrote an emu86 emulator which helps debug x86-16 code meant for EPROMs. You may find it helpful.

Thank you!

tkchia avatar Nov 23 '20 13:11 tkchia

@tkchia : I think you are right, hard bare metal on IA16 is only for a very few. GCC-IA16 already does the job, at the cost of a little, but must-have, assembly code for the glue...

mfld-fr avatar Nov 23 '20 16:11 mfld-fr

Hello @tkchia and @mfld-fr,

do you think there is utility in having gcc-ia16 directly support writing programs to be burnt onto EPROMs?

When the ELKS kernel is configured for far text, the gcc-ia16 output is not usable when directly burned in ROM because of the required text relocations. In non-ROM scenarios, this is handled by setup.S when copying from the disk image. A nice addition would be having the ability to specify a final text segment with the linker or elf2elks, which would relocate the .text and .fartext sections and emit a short-form a.out header. Without it, a separate utility will have to be written to allow the ELKS fartext kernel to be ROMable. This could be somewhat easily tested by setting the final text segment to be REL_SYS, in which case the normal ELKS boot/setup.S should load it successfully without modification. I am much less concerned with data segment relocations, as there haven't been any so far, and they could likely be worked around if generated.

With @mfld-fr's EMU86 now able to show that the current ELKS kernel will run in ROM, this would be nice to have.

Thank you!

ghaerr avatar Nov 23 '20 17:11 ghaerr

Hello @tkchia,

By the way, @mfld-fr wrote an emu86 emulator which helps debug x86-16 code meant for EPROMs. You may find it helpful.

I will definitely try it

Hello @mfld-fr

Looks like we are both hacking a non-PC IA16 device 😉

Welcome my brother :-)

ladmanj avatar Nov 23 '20 18:11 ladmanj

Hello all

I'm trying to use the technique from here https://sourceware.org/binutils/docs/ld/Output-Section-LMA.html#Output-Section-LMA

But I'm getting these errors:

(.text._init_mem+0x5): undefined reference to `_data!'
(.text._init_mem+0x8): undefined reference to `_etext!'
(.text._init_mem+0xc): undefined reference to `_data!'
(.text._init_mem+0xf): undefined reference to `_edata!'
(.text._init_mem+0x14): undefined reference to `_bstart!'

I don't understand why the linker doesn't behave as the documentation describes. Perhaps different versions of the GNU linker behave differently.

Thanks

ladmanj avatar Nov 23 '20 21:11 ladmanj

OMG, there is the same example with different syntax - i will try it immediately, but I feel I should warn you. https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_3.html

Thanks

ladmanj avatar Nov 23 '20 22:11 ladmanj

OMG, there is the same example with different syntax - i will try it immediately, but I feel I should warn you. https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_3.html

Thanks

But the result doesn't differ :-(

ladmanj avatar Nov 23 '20 22:11 ladmanj

Hello @ghaerr,

A nice addition would be having the ability to specify a final text segment with the linker or elf2elks, which would relocate the .text and .fartext sections and emit a short-form a.out header. Without it, a separate utility will have to be written to allow the ELKS fartext kernel to be ROMable.

Let me look into that. Thanks!

tkchia avatar Nov 24 '20 05:11 tkchia

Hello @tkchia

Let me look into that. Thanks!

Please also kindly look at the linker code. I'm afraid that there may be a bug.

For example, the function SIZEOF(.text) in the linkerscript returns wrong value. It gives 0x85, while the actual .text code size is around 0xcb0 (not exactly, I'm not looking at it right now, but the order of magnitude is correct).

Thank you 😊

ladmanj avatar Nov 24 '20 09:11 ladmanj

Hello @ladmanj,

But I'm getting these errors:

For example, the function SIZEOF(.text) in the linkerscript returns wrong value.

Please give more information about what you were trying to do. Because, unfortunately, I cannot read minds, and I especially cannot read minds from a distance. I cannot guess what exactly you were doing, based only on what you just said.

What would be useful is if you give me your complete input(s) and your complete command line(s) that led to the problems --- enough information so that the rest of us can reproduce the problem on our setups.

If for whatever reason you cannot send your entire inputs, try to come up with smaller inputs that exhibit the same problems you are facing, and send those.

Thank you!

tkchia avatar Nov 25 '20 10:11 tkchia

OK, this is the whole project. Please read the few comments in gdbstub.ld.in first. The archive doesn't contain all the attempts I've made; those are lost now.

ia16_ld_problem.zip

Thank you

ladmanj avatar Nov 25 '20 15:11 ladmanj

Hello @ladmanj,

I have uploaded a modified Makefile and a modified gdbstub.ld.in for your code, which should help you get going a bit further.

A few things:

  • The -msegelf scheme is unfortunately still not supported by nasm. (segelf was actually proposed by none other than the nasm maintainer himself, Mr. Anvin, but he has not yet implemented it there.) So if you want to link with nasm code, you have to use -mno-segelf for now.
  • When specifying memory areas, you need to always explicitly think about the offsets (in the segment:offset addressing scheme) that you want your code to use, even if the offsets will simply be 0. You cannot just say 0xfa700 and expect the linker to magically guess whether it means 0xf000:0xa700, or 0xfa70:0x0000, or something else. (The linker does not know how to read minds either...)
    • See the MEMORY command in the modified linker script to see how to represent segment:offset's for the -mno-segelf scheme.
  • If you use the "small" memory model (-mcmodel=small), you can have your data segment separate from your text segment (mostly this means ds = ss ≠ cs). However, the read-only data (.rodata) should be addressable from the same data segment. Again, as always, you must think explicitly about offsets within segments when figuring out how to lay out your code and data segments.
  • Also, if you are using nasm, you probably also have the disassembler ndisasm installed. Use it! Run it on your output file to get an idea of what code is generated.
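
To make the ambiguity of a bare 0xfa700 concrete, here is a minimal C sketch. It is not part of the linker script or the toolchain; the helper name seg_off_to_phys is made up for illustration, and the 20-bit wrap assumes an 8086-class address bus. It only shows that several segment:offset pairs land on the same physical address, which is why the linker needs to be told the offset explicitly.

```c
#include <stdint.h>

/* Hypothetical helper: real-mode physical address = segment * 16 + offset.
   The result is wrapped to 20 bits, assuming an 8086-style 1 MiB bus. */
static uint32_t seg_off_to_phys(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}
```

Both 0xf000:0xa700 and 0xfa70:0x0000 give 0xfa700 here, yet code linked for one pair uses different offsets than code linked for the other.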

Thank you!

tkchia avatar Nov 25 '20 17:11 tkchia

Thank you @tkchia.

You mentioned that nasm is not fully compatible with the ia16 toolchain; can you recommend which other assembler to use? I have no preference; I'm using nasm because the author of the gdbstub I'm forking chose it. Only the AT&T syntax is not my cup of tea, but I can get familiar with that too.

Best regards

ladmanj avatar Nov 25 '20 17:11 ladmanj

Hello @ladmanj,

You can use nasm, just not with -msegelf for now, because this option is implemented only in the gcc-ia16 toolchain.

Thank you!

tkchia avatar Nov 25 '20 17:11 tkchia

Hello @tkchia,

Regarding the Makefile you modified for me - you have deleted the long paths to CC and other tools - of course, to be able to use your own installation.

I have the long paths because I was unable to 'make install' the tools I built with your build-ia16 script on Ubuntu 20.10, so I'm pointing to the folder where they were built.

If you know what the issue can be, please help. But I don't remember the particular step which failed, and you also don't have a crystal ball, as you mentioned a few times :-)

Thanks

ladmanj avatar Nov 25 '20 17:11 ladmanj

Hello @ladmanj,

You can use nasm, just not with -msegelf for now, because this option is implemented only in the gcc-ia16 toolchain.

Thank you!

Hi, I did understand it the first time, but the question was meant as "do you think I'd better use some other assembler and stay with -msegelf, and which one would you recommend?".

EDIT: Oh I see now, only the as contained in ia16 can be used with -msegelf, because ... this option is implemented only in the gcc-ia16 toolchain.

I'm sorry.

Thank you

ladmanj avatar Nov 25 '20 17:11 ladmanj

Hello @ladmanj,

If you know what the issue can be, please help.

No, there is no problem on your end. You can directly add .../prefix/bin --- which is where build-ia16 "installs" everything --- to your $PATH.

Alternatively, if you are using Ubuntu 18.04 or 20.04, I would recommend that you simply install the packages in my Ubuntu PPA. These should be much more stable. Also, these will be installed in the usual places (/usr/bin/ etc.).

Thank you!

tkchia avatar Nov 25 '20 17:11 tkchia

Alternatively, if you are using Ubuntu 18.04 or 20.04, I would recommend that you simply install the packages in my Ubuntu PPA. These should be much more stable. Also, these will be installed in the usual places (/usr/bin/ etc.).

Unfortunately not, I'm usually an early adopter and have already switched to Ubuntu 20.10.

I will do the manual PATH modification - good point.

Thanks

ladmanj avatar Nov 25 '20 17:11 ladmanj

Hi @tkchia,

Oh this is great!!! I had no idea about these linker capabilities, but I was looking for documentation on how to define the offsets correctly.

MEMORY
{
	// These are for absolute physical addresses
	ram (!RX) : ORIGIN = DATA_SEG * 0x10 + DATA_OFFSET,
		    LENGTH = DATA_LENGTH
	rom (RX) : ORIGIN = BASE_SEG * 0x10 + BASE_OFFSET,
		   LENGTH = BASE_LENGTH 

	// These are for offsets relative to the respective segments
	ramoff (!RX) : ORIGIN = DATA_OFFSET, LENGTH = DATA_LENGTH
	romoff (RX) : ORIGIN = BASE_OFFSET, LENGTH = BASE_LENGTH 
}

Thank you!

ladmanj avatar Nov 25 '20 18:11 ladmanj

Hello @tkchia,

Your Makefile and linker script seem to be great, but there is a problem left.


; Initialize rw data
initmem:
    mov		di,_data
    mov		cx,_edata
    sub		cx,_data
    mov		si,_etext
    rep		movsb

; Clear bss
    mov		di,_bstart
    mov		cx,_bend
    sub		cx,_bstart
    mov		al,0
    rep		stosb
    retn

Leads to:


00000035 <initmem>:
  35:   bf 00 14                mov    di,0x1400
  38:   b9 1d 14                mov    cx,0x141d
  3b:   81 e9 00 14             sub    cx,0x1400
  3f:   be 54 00                mov    si,0x54
  42:   f3 a4                   rep movs BYTE PTR es:[di],BYTE PTR ds:[si]
  44:   bf 1e 14                mov    di,0x141e
  47:   b9 60 14                mov    cx,0x1460
  4a:   81 e9 1e 14             sub    cx,0x141e
  4e:   b0 00                   mov    al,0x0
  50:   f3 aa                   rep stos BYTE PTR es:[di],al
  52:   c3                      ret    
  53:   90                      nop

This is almost OK; only 3f: be 54 00 mov si,0x54 is wrong. 0x54 isn't the LMA of the .data section, which is where the linker-script symbol _etext should point.

An interesting fact is that in the binary file I get from ia16-elf-objcopy -j .text* -j .rodata* -j .data* --output-target=binary gdbstub.elf gdbstub.bin, the initialization data starts right after the .text and .rodata content.

The linker's computation of _etext = . ; seems to be wrong, as I already suspected yesterday.

Thank you

ladmanj avatar Nov 25 '20 23:11 ladmanj

Hi @tkchia

I have partly succeeded with

SECTIONS {

		.text :
		{
		*(.text) ; *(.text*) ;
		_etext = . ;

		} >romoff AT>rom

		.data : 
		{
		_data = . ;
		*(.rodata) ; *(.rodata*) ;
		*(.data) ; *(.data*) ;
		_edata = . ;
		} >ramoff AT>rom

		.bss :
		{
		_bstart = . ;
		*(.bss) *(COMMON)
		_bend = . ;
		} >ramoff AT>ram
}

The addresses in the disassembly look really good, and the initialization data is where it should be in the ROM image. But this function still doesn't work and I can't see any error:

#define EOF (-1)
const char digits[] = "0123456789abcdef";

char dbg_get_digit(int val)
{
	if ((val >= 0) && (val <= 0xf)) {
	  return digits[val];
	} else {
		return EOF;
	}
}

Compiled as:

00000133 <dbg_get_digit>:
 133:   89 e3                   mov    bx,sp
 135:   8b 5f 02                mov    bx,WORD PTR [bx+0x2]
 138:   b0 ff                   mov    al,0xff
 13a:   83 fb 0f                cmp    bx,0xf
 13d:   77 04                   ja     143 <dbg_get_digit+0x10>
 13f:   8a 87 00 14             mov    al,BYTE PTR [bx+0x1400]
 143:   c3                      ret    

The string is at this address: 00001400 g O .data 00000011 digits

EDIT: SS, DS, and ES are all set to 0xff00.

I can't understand what's wrong. I will run it in simulator next evening, but now it's almost 3 AM and I must go to sleep immediately.

Thank you.

ladmanj avatar Nov 26 '20 01:11 ladmanj

Uff, this one is solved. Originally I didn't realize that the init data placed right after .text is reachable through cs, not the default ds.

initmem:
; Initialize rw data 
    mov	di,_data
    mov	cx,_edata
    sub	cx,_data
    mov	si,_etext 
    rep	cs movsb		; <--- magic cs here

; Clear bss
    mov	di,_bstart
    mov	cx,_bend
    sub	cx,_bstart
    mov	al,0
    rep	stosb
    retn

Yes, completely my fault - but completely invisible for me :-(
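
For anyone following along, here is a rough host-side C model of what initmem does. The function and parameter names are hypothetical, and plain memcpy cannot express the segment issue itself: on the real target the source of the copy sits in the code segment, which is exactly why the assembly needs the cs override.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of initmem: copy the initialised data from its load
   address (in ROM, right after .text) to its run address (in RAM), then
   clear .bss. On the target, data_lma would be reached through cs. */
static void init_mem(unsigned char *data_vma, const unsigned char *data_lma,
                     size_t data_len, unsigned char *bss, size_t bss_len)
{
    memcpy(data_vma, data_lma, data_len);   /* .data: ROM image -> RAM */
    memset(bss, 0, bss_len);                /* .bss: zero-fill         */
}
```

If the copy reads from the wrong segment, the RAM copy of digits[] holds garbage, which matches the symptom in dbg_get_digit.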

Thank you for your patience.

ladmanj avatar Nov 26 '20 23:11 ladmanj

Hi all,

For your interest, I'm now able to single-step, continue, and examine memory in GDB, connected via a serial port to the gdb-server running on my hardware.

I still have problems continuing from an address other than the one where execution was interrupted, and with register updates, but maybe I will solve that soon.

Regards

ladmanj avatar Nov 28 '20 00:11 ladmanj

@ladmanj : looks like you are making progress. :-)

tkchia avatar Nov 28 '20 12:11 tkchia

Hi all

Please, what's the best way to read a byte from an arbitrary memory address, given as a 32-bit integer, from a C function (using inline asm)?

I tried to split the 32-bit integer (with 19 or 20 valid bits) into a segment and an offset, then use inline asm to push ss and load it with the segment, then use an ordinary 16-bit pointer access in C, and finally use inline asm to pop the original ss, but I failed to do so.

I don't understand the inline asm syntax, even looking to gcc manual :-( https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands

Thanks for help

Jakub

ladmanj avatar Dec 29 '20 00:12 ladmanj

Hi

Maybe I should use far pointers and leave it in C only, but how? Again, I can't find any documentation on how such pointers are used :-(

Thanks J.

ladmanj avatar Dec 29 '20 08:12 ladmanj

Hi

It seems that *val = *(__far volatile char *)addr; is working, but I don't know if it's correct or not.

J.

ladmanj avatar Dec 29 '20 08:12 ladmanj

Hello @ladmanj ,

It seems that *val = *(__far volatile char *)addr; is working, but I don't know if it's correct or not.

That should be fine. Note that addr here needs to be a segment:offset pair expressed as a 32-bit integer --- high 16 bits are the segment, low 16 bits are the offset.

addr should not be a flat absolute address. If you want to read from a flat absolute address, you must transform it into a segment:offset form first.
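
As a minimal sketch of that transformation (the helper name flat_to_segoff is made up, and this is only one of many valid segment:offset choices for a given flat address), the packing can be done in plain C like this:

```c
#include <stdint.h>

/* Hypothetical helper: pack a flat real-mode address into the 32-bit
   segment:offset form described above (high 16 bits = segment, low
   16 bits = offset). This picks the pair whose offset is 0..15. */
static uint32_t flat_to_segoff(uint32_t flat)
{
    uint16_t seg = (uint16_t)((flat & 0xFFFFFu) >> 4);
    uint16_t off = (uint16_t)(flat & 0xFu);
    return ((uint32_t)seg << 16) | off;
}

/* With gcc-ia16, the read quoted above would then presumably become
       *val = *(__far volatile char *)flat_to_segoff(flat);
   This line is left as a comment because __far is specific to that
   toolchain and will not compile elsewhere. */
```

For example, the flat address 0xfa700 becomes segment 0xfa70 and offset 0, packed as 0xfa700000.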

I don't understand the inline asm syntax, even looking to gcc manual :-( https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands

Basically you specify these things:

  • the form of the instruction(s)
  • the output operands --- the C variables, and how they should map to assembly operands
  • the input operands and their mappings
  • any registers or memory that may be destroyed.

This is a very powerful inline assembly facility --- it allows inline assembly to jibe with GCC's optimizer. But admittedly yes, it is a bit hard to use: you need to specify the inputs, outputs, and clobbers very precisely so that GCC does not produce incorrect code.
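
A small sketch may make the four parts easier to see. The function add16 is purely illustrative, and the preprocessor guard is an assumption on my part: it enables the assembly on x86 GCC targets (including __ia16__, assuming the toolchain defines that macro) and falls back to plain C elsewhere.

```c
#include <stdint.h>

/* Illustrative only: add two 16-bit values with one inline instruction. */
static uint16_t add16(uint16_t a, uint16_t b)
{
#if defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__) || defined(__ia16__))
    __asm__ ("addw %2, %0"  /* instruction template (AT&T syntax)     */
             : "=r" (a)     /* output: a, in any general register     */
             : "0" (a),     /* input: a, in the same register as %0   */
               "r" (b)      /* input: b, in any general register      */
             : "cc");       /* clobber: the flags are modified        */
    return a;
#else
    return (uint16_t)(a + b);   /* plain C fallback for other targets */
#endif
}
```

The "0" constraint ties the first input to the output register, which is how a two-operand instruction like add is usually described to GCC.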

If you are not sure about using inline assembly, just write a separate assembly language module.

push ss and load it with the segment,

Erm, you probably do not want to mess with %ss.

Thank you!

tkchia avatar Dec 29 '20 14:12 tkchia