elks
panic: No init or sh found (again)
Description
Hi, I've tried ELKS 0.5 and 0.6 on my 8088 with 256KB of RAM. It is a Soviet clone of the actual 8088 CPU named К1810ВМ88. I observed a problem similar to the one mentioned in https://github.com/jbruchon/elks/issues/288.
Computer description
- The computer is called МК-88.01 where .01 denotes a variant with 256K RAM and FDD
- CPU: КР1810ВМ88 is a clone of Intel 8088 at frequency 4.77MHz
- Video RAM 128K
- Programmable interval timer: КР580ВИ53 equivalent to Intel 8253. Max frequency 2MHz
- Programmable interrupt controller: КР1810ВН59А equivalent to Intel 8259
- Floppy disk controller: UMC UM8272A analog of Intel 8272
- Controller for cassette recorder: КР580ВВ55А analog of Intel 8255
- There is one floppy drive installed (720K), which is used as 360K.
- LPT and joystick are present.
- No HDD, no COM port, no Soundcard.
Configuration
- I tried precompiled ELKS 0.5 with xms fix
- Also I compiled the latest master version (75a7cb7) myself (precompiled 0.6 didn't even start booting)
Raw data
In both cases I get the same result shown in screenshots:
- 0.6 master:
- 0.5 with xms fix
P.S. 0.6 master boots nicely in qemu. Best
Hello @Vutshi,
Thanks for the problem report. I think the problem is your machine doesn't have enough memory for the default distribution, with only 256K RAM. As can be seen from the boot log, the system only has 123K free RAM for application programs after the kernel is loaded.
It is a Soviet clone of the actual 8088 CPU named К1810ВМ88.
Interesting. Was this designed in Russia from just the Intel specs?
Looking at the screenshots, and the differences between the v0.5.0 and v0.6.0 boots, I'm not sure why the BIOS track read retry is occurring. We might want to set CONFIG_TRACK_CACHE off, although that will make the system slower. I also can't understand why the /dev/console open is failing, which occurs prior to trying to exec /bin/init. This could happen if the FAT disk has a physical /dev directory, as /dev/ is emulated on FAT.
To allow the system to run with 256K RAM, we need to lower the max kernel heap, which defaults to 64K. That will allow more RAM for a shell to run. We will also want to disable /bin/init, and force a small shell (/bin/sash) for the time being.
To do this, add the following line in elks/include/linuxmt/config.h (at the top will do):
#define SETUP_HEAPSIZE 4096 /* force kernel heap size if specified*/
Also, change the number of external buffers from 64 to 32 in .config using "CONFIG_FS_NR_BUFFERS=32".
Then, after recompiling the kernel using 'make kclean' and producing another 360K boot floppy, remove /bin/init and /bin/sh, and copy elkscmd/sash/sash to /bin/sh on the floppy. (The standalone shell requires less space, however sash must be manually copied to /bin/sh as the 360K floppy does not contain it by the default build).
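As a rough sanity check on the steps above, the following sketch adds up how much RAM the two changes could return at most. The 64K default heap and the 4096-byte override come from the instructions; the 1 KiB size of one external buffer is an assumption made for the estimate, and the function names are purely illustrative. The heap figure is an upper bound, since the real heap may be smaller than its 64K maximum.

```c
/* All sizes in bytes. DEFAULT_HEAP_BYTES and FORCED_HEAP_BYTES follow the
 * instructions above; ASSUMED_BUFFER_BYTES is an assumption (one external
 * buffer taken to be 1 KiB), not a value confirmed in this thread. */
#define DEFAULT_HEAP_BYTES   (64L * 1024)
#define FORCED_HEAP_BYTES    4096L
#define ASSUMED_BUFFER_BYTES 1024L

/* Upper bound on the RAM returned by forcing a smaller kernel heap. */
long heap_savings(void)
{
    return DEFAULT_HEAP_BYTES - FORCED_HEAP_BYTES;
}

/* RAM returned by lowering the external buffer count. */
long buffer_savings(int old_count, int new_count)
{
    return (long)(old_count - new_count) * ASSUMED_BUFFER_BYTES;
}
```

Under these assumptions, heap_savings() gives 61440 bytes (60K) and buffer_savings(64, 32) gives 32768 bytes (32K). The free RAM actually reported at boot is the number to trust.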
[EDIT: I have simulated 256K RAM on ELKS on QEMU by modifying setup.S in arch_get_mem and have ELKS booting, using the above modifications. It looks like we might need a better way to build a floppy that does not use /bin/init and uses sash for /bin/sh. Trying to run with /bin/sh uses too much memory on a 256K system. The system remains very tight on RAM and can't run other applications though. We have a ROM version that works, but more modifications will be required in order to get the system RAM usage to work with 256K.]
Thank you!
Thank you @ghaerr for the detailed instructions. Now there is more free RAM but something is still not working.
Here is my .config
Is there anything else to be safely removed from the image?
Is there anything else to be safely removed from the image?
It appears that you may not have /bin/sash copied as /bin/sh on the floppy. I also can't duplicate the Can't open /dev/console error. Can you post a DIR (or ls -l) listing of your root directory and /bin of the floppy?
Your config also has the EXT buffers set to 64, against the recommendation above:
CONFIG_FS_NR_EXT_BUFFERS=64
Please re-read and check the above instructions carefully.
I fixed the external buffer part, now it reads
CONFIG_FS_NR_EXT_BUFFERS=32
Initially you were talking about "CONFIG_FS_NR_BUFFERS=32". Is it a typo, or are these some other buffers?
It appears that you may not have /bin/sash copied as /bin/sh on the floppy. I also can't duplicate the Can't open /dev/console error. Can you post a DIR (or ls -l) listing of your root directory and /bin of the floppy?
I definitely use sash. It is clearly working in qemu.
Just in case here are my latest config and the corresponding image with sash config_6.txt fd360-6_small.img.zip
The outcome on the hardware is still the same. dev/console doesn't open
I definitely use sash. It is clearly working in qemu.
Thanks for the image. I tried it on QEMU and it works, but the likely reason is that QEMU has 1M memory, and ELKS is reporting 520K RAM free. Here is your image running on QEMU:
I am working on a PR that will allow us to artificially limit the amount of RAM available to ELKS. I have that running, but still can't duplicate the Can't open console issue. I am hoping this has nothing to do with the clone CPU.
I have to say, this is quite strange. Is there a way you could build a MINIX image, instead of FAT (CONFIG_IMG_MINIX=y CONFIG_IMG_FAT not defined), so that we can see whether the failure of opening /dev/console has anything to do with the emulated FAT filesystem?
We are dealing with multiple problems and I'm trying to get my arms around a good debug scenario. We can't do much printk kernel debugging, since the output scrolls off after 24 lines. I am still guessing this has to do with limited RAM, but I now have all this running on QEMU with 256K limit and I still can't duplicate the real hardware open problem (which still should not affect the exec of /bin/sh, frankly).
Since your system may not be IBM compatible, we might want to eliminate any other kernel dependencies by setting the following in the CONFIG_ARCH_IBMPC section of include/linuxmt/config.h:
#define SYS_CAPS 0 /* no XT/AT capabilities */
Are there other ways the hardware you are running on may be different from IBM PC?
MY 2cents: Seems to me that after boot, elks is requesting 2 blks per read but getting only one. Bios problem? M
Interesting. Was this designed in Russia from just the Intel specs?
I am not sure how exactly it was designed. I suspect it was reverse engineered from the Intel CPU. As far as I know, the USSR pursued both strategies: cloning western devices (which came to an end in the 90s) and building a homegrown VLIW architecture (which exists now as Elbrus 8S, 8SV, 16S).
Regarding the 8088 clone I am using I should add that it is quite shabby. One memory module had to be replaced, floppy connector replaced, there are some glitches in video ram (will be replaced too). Nevertheless, it boots the native OS (analog of MS-DOS called Альфа ДОС) and runs games like pacman (with small graphics glitches).
So one should not exclude a hardware problem behind the ELKS booting issue.
Looking further @Vutshi, I am finding a pretty stupid problem in ELKS .config files. It seems that merely commenting out options with '#', but leaving the value =y, causes the include/autoconf.h to be produced incorrectly. I noticed this because in your boot screen, it is saying "track read retry", yet you have CONFIG_TRACK_CACHE commented out (which isn't working):
# CONFIG_TRACK_CACHE=y
# CONFIG_BLK_DEV_BHD=y
# CONFIG_IDE_PROBE=y
The above doesn't work, look at what include/autoconf.h has to say:
#define CONFIG_TRACK_CACHE 1
#define CONFIG_BLK_DEV_BHD 1
#define CONFIG_IDE_PROBE 1
In order to correct you MUST set the above .config values to "# CONFIG_TRACK_CACHE is not used". I have just confirmed this by looking at the ELKS menuconfig/config and it actually string compares "is not used". UGH!! So this may be a contributing issue, we are not actually creating what is configured!!!
A potential fix is to edit the values back manually, if "make menuconfig" doesn't do it automatically. In short, we need to use a .config file with no commented-out "=y" values!
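To catch this kind of mistake before building, a small check along the following lines could scan each line of .config. This is only a sketch: the function name is hypothetical and it is not part of the ELKS config scripts. It flags lines where an assignment was merely prefixed with '#', without relying on the exact wording of the "disabled" marker.

```c
#include <string.h>

/* Returns 1 if `line` looks like an option that was "disabled" only by
 * putting '#' in front of a NAME=value assignment, for example
 * "# CONFIG_TRACK_CACHE=y". Returns 0 for every other line.
 * Hypothetical helper, not taken from the ELKS sources. */
int is_commented_assignment(const char *line)
{
    if (*line != '#')
        return 0;
    line++;
    while (*line == ' ' || *line == '\t')
        line++;
    if (strncmp(line, "CONFIG_", 7) != 0)
        return 0;
    /* Walk to the end of the option name. */
    while (*line && *line != '=' && *line != ' ' &&
           *line != '\t' && *line != '\n')
        line++;
    /* An '=' directly after the name means a commented-out assignment. */
    return *line == '=';
}
```

Any line for which this returns 1 is one that include/autoconf.h may still treat as enabled, so it is worth fixing by hand.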
@ghaerr
Are there other ways the hardware you are running on may be different from IBM PC?
I don't really know. It is supposed to be identical. It definitely runs MS-DOS and games made for MS-DOS.
Here's a fixed .config file, who knows whether this might change something on your real hardware! config.txt.zip
Is there a way you could build a MINIX image, instead of FAT
This is what I wanted to try as well. The only reason I didn't do it so far is some Windows related problem with writing this image. I guess I need to install Ubuntu for writing images.
The only reason I didn't do it so far is some Windows related problem with writing this image.
That could also be an issue. We're finding out all sorts of strange things on your issue...
I guess I need to install Ubuntu for writing images.
That'd be great, since we need to eliminate variables. Given @Mellvik's comment about BIOS, and my finding that track read was turned on even though configured off, a BIOS issue with multi-sector reads could also be an issue.
I'm still working on better debug for very limited RAM systems, as that happens to be an interest of mine. I'd like to see us getting this working :) I hope to push a PR to help emulate 256K better on QEMU.
Here's a fixed .config file, who knows whether this might change something on your real hardware!
Thanks. I'll try it.
Is there a difference between
Yes (although there definitely should not be!!) - that's what my last post was saying. Unfortunately, the make config/menuconfig scripts are pretty dumb. I showed you what include/autoconf.h looked like, which incorrectly set the settings for the C code.
Yes. Sorry, I missed your explanation above. Messages appear faster than I write and read :)
@ghaerr Is there a way to put sash into minix image if my system doesn't understand minix filesystem?
I will be back tomorrow with new tests on hardware.
Is there a way to put sash into minix image if my system doesn't understand minix filesystem?
Currently, not an easy way. I'm working on a new option to copy sash to /bin/sh on build, as well as turn off automatic execution of /bin/init. There will also be options that allow us to emulate 256K in QEMU, so that we can debug all this lots easier. I'll post a PR shortly.
Actually, there is a way to do this - however clumsy.
What I've done in such cases is to boot the image in QEMU and have whatever files I need to add available on a second floppy image, mounted after boot. Then manipulate them in QEMU (and remember to sync before exit).
--M
Hi @ghaerr and everybody,
I've got new results with the latest PR #1330 and the corresponding config file.
First of all I checked compatibility of my machine with x86 by booting MS-DOS 3.10. Works. DOS games work as well.
I wrote ELKS image in Ubuntu as follows
dd if=fd360-minix.img of=/dev/fd0 bs=2048
minix version:
fat version:
Hello @Vutshi,
Thanks for the screenshots and continued testing with the latest changes.
Looking at both screens, and seeing the results on MINIX (with a different errno), I think I see the problem: the floppy disk probe seems to determine that your floppy has a format of 40 cylinders, 2 heads and 8 sectors. The 8 sectors is incorrect for a 360K floppy! So what's happening is that ELKS is reading the disk while skipping a sector every 8 sectors, which is why nothing is running. The kernel is loaded using different code, which appears to be working.
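The skipped-sector effect can be reproduced with plain arithmetic. The sketch below (illustrative names, not ELKS code) maps a logical sector to cylinder/head/sector using the probed geometry, then works out which logical sector sits at that address on a disk that was really written with 9 sectors per track.

```c
/* Cylinder/head/sector address; sectors are numbered from 1. */
struct chs { int cyl, head, sector; };

/* Map a 0-based logical sector number to CHS for a given geometry. */
struct chs lba_to_chs(int lba, int heads, int spt)
{
    struct chs r;
    r.cyl = lba / (heads * spt);
    r.head = (lba / spt) % heads;
    r.sector = lba % spt + 1;
    return r;
}

/* Inverse mapping: which logical sector lives at a CHS address. */
int chs_to_lba(struct chs a, int heads, int spt)
{
    return (a.cyl * heads + a.head) * spt + (a.sector - 1);
}

/* Logical sector actually fetched when the driver assumes `probed_spt`
 * sectors per track but the disk was written with `real_spt`. */
int sector_actually_read(int lba, int heads, int probed_spt, int real_spt)
{
    return chs_to_lba(lba_to_chs(lba, heads, probed_spt), heads, real_spt);
}

/* Formatted capacity in KiB, assuming 512-byte sectors. */
int capacity_kib(int cyls, int heads, int spt)
{
    return cyls * heads * spt * 512 / 1024;
}
```

With 2 heads, a probed value of 8 and a real value of 9, logical sectors 0 to 7 are read correctly, but a request for sector 8 returns sector 9 and a request for sector 16 returns sector 18. The same arithmetic gives 320K for CHS 40/2/8 and 360K for CHS 40/2/9.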
Let me look at the probe code to determine why this might be happening.
Here is the code after probing in elks/arch/i86/drivers/block/bioshd.c:
got_geom:
    if (drivep->cylinders == 0 || drivep->sectors == 0) {
        *drivep = fd_types[drivep->fdtype];
        printk("fd: Floppy drive autoprobe failed!\n");
    } else {
        drivep->sectors = 9;    // <--- INSERT THIS LINE
        printk("fd: /dev/fd%d %s has %d cylinders, %d heads, and %d sectors\n",
            target,
            (found_PB == 2)? "DOS format," :
            (found_PB == 1)? "ELKS bootable,": "probed, probably",
            drivep->cylinders, drivep->heads, drivep->sectors);
    }
If you'd like to see if my theory is correct, insert the above line in the driver, and recompile. That should force the floppy sector count to be 9, and everything should work.
Thank you!
I looked at the probe code, and can't see much wrong with it (yet). Is your floppy a 320K floppy? (CHS 40,2,8)? That isn't supported by ELKS, although we could. It seems that perhaps your floppy might be 320k but configured for 360k ELKS?
@ghaerr it works! After hardcoding number of sectors I got it booted:
Super progress, thank you!
Now the problem is that system doesn't respond to pressing keys on my keyboard :)
I looked at the probe code, and can't see much wrong with it (yet). Is your floppy a 320K floppy? (CHS 40,2,8)? That isn't supported by ELKS, although we could. It seems that perhaps your floppy might be 320k but configured for 360k ELKS?
The floppy drive is physically 720K with 80 cylinders. However, the computer reads only every second cylinder thus effectively using it as 360K. Plus it is not in a very good shape, I have to press the drive head with my finger to help it read thoroughly:))
Now the problem is that system doesn't respond to pressing keys on my keyboard :)
Try setting "BIOS" under "Select Console Driver", that will use polled BIOS rather than IRQ 1 for keyboard.
I have to press the drive head with my finger to help it read thoroughly:))
Well, that's a new take on "having the system at your fingertips..." :)
I'm not sure whether the kernel probe routine may not be working (I did modify it a bit several months ago), or whether this has something to do with your drive. I will continue looking at this. We probably ought to add some printk code in the actual probe routine, to see what it's doing, perhaps something like the following:
    /* Next, probe for sector number. We probe on track 0, which is
     * safe for all formats, and if we get a seek error, we assume that
     * the previous successfully probed format is the correct one.
     */
    drivep->sectors = 0;
    count = 0;
    do {
        /* skip reading first entry */
        printk("probe %d\n", count);    // <--- insert this line
        if (count && read_sector(target, 0, sector_probe[count]))
            { printk("read sector failed on %d count %d\n", sector_probe[count], count); break; }   // <--- change this line
        drivep->sectors = sector_probe[count];
    } while (++count < sizeof(sector_probe)/sizeof(sector_probe[0]));
However, the computer reads only every second cylinder thus effectively using it as 360K.
Hmmm, the probe routine tries on track 0. Are you saying that perhaps your drive only starts working on track 1? If probing on track 0 sector 8 failed, that would cause this problem.
How does the BIOS handle the every other track, does it automatically map track 0 -> 1, 1 -> 3, 2 -> 5 etc?
@ghaerr
How does the BIOS handle the every other track, does it automatically map track 0 -> 1, 1 -> 3, 2 -> 5 etc?
I don't really know how it works. Documentation I have doesn't say anything about this subtlety. Although, I do have ROMs data available if it can help.
The additional printk code gives the following output now:
Somehow it doesn't like sector 9.
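To illustrate how a single failed read of sector 9 leads to the 8-sector result, here is a small simulation of the probe loop. The candidate table and the mock read function are assumptions made for this sketch: the real sector_probe[] contents are not shown in this thread, and the real read_sector() goes through the BIOS.

```c
#include <stddef.h>

/* ASSUMED candidate list; the actual sector_probe[] table in bioshd.c
 * is not quoted in this thread. */
static const int sector_probe[] = { 8, 9, 15, 18, 36 };

/* Hypothetical stand-in for the BIOS read: nonzero means failure.
 * Every sector number above `highest_readable` fails. */
static int mock_read_sector(int sector, int highest_readable)
{
    return sector > highest_readable;
}

/* Same control flow as the probe loop quoted above: keep the last
 * candidate whose test read succeeded. */
int probe_sectors(int highest_readable)
{
    int sectors = 0;
    size_t count = 0;

    do {
        /* the first candidate is accepted without a test read */
        if (count && mock_read_sector(sector_probe[count], highest_readable))
            break;
        sectors = sector_probe[count];
    } while (++count < sizeof(sector_probe) / sizeof(sector_probe[0]));
    return sectors;
}
```

With these assumptions, probe_sectors(9) returns 9 as expected for a 360K disk, while probe_sectors(8) returns 8, which matches the geometry printed on the real machine.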
Regarding keyboard:
Try setting "BIOS" under "Select Console Driver", that will use polled BIOS rather than IRQ 1 for keyboard.
This setting broke compilation for me:
kbd-scancode.c:136:5: error: ‘xtkb_scan’ undeclared here (not in a function)
xtkb_scan, /*mode = 0*/
^~~~~~~~~
kbd-scancode.c:137:5: error: ‘xtkb_scan_shifted’ undeclared here (not in a function)
xtkb_scan_shifted, /*mode = 1*/
^~~~~~~~~~~~~~~~~
kbd-scancode.c:138:5: error: ‘xtkb_scan_caps’ undeclared here (not in a function)
xtkb_scan_caps, /*mode = 2*/
^~~~~~~~~~~~~~
kbd-scancode.c:139:5: error: ‘xtkb_scan_ctrl_alt’ undeclared here (not in a function)
xtkb_scan_ctrl_alt, /*mode = 3*/
^~~~~~~~~~~~~~~~~~
make[3]: *** [../../../../Makefile-rules:243: kbd-scancode.o] Error 1
make[3]: Leaving directory '/home/denis/8088/elks/elks/arch/i86/drivers/char'
make[2]: *** [Makefile:197: drivers/char/chr_drv.a] Error 2
make[2]: Leaving directory '/home/denis/8088/elks/elks/arch/i86'
make[1]: *** [Makefile:75: Image] Error 2
make[1]: Leaving directory '/home/denis/8088/elks/elks'
make: *** [Makefile:13: all] Error 2
Hello @Vutshi,
This setting broke compilation for me: kbd-scancode.c:136:5: error: ‘xtkb_scan’ undeclared here (not in a function) xtkb_scan, /*mode = 0*/
This is a result of having "Scancode keyboard driver" (CONFIG_KEYBOARD_SCANCODE=y) set for BIOS keyboard driver, which won't work. Turn that off using make menuconfig and you should get a compiled system.
I'll submit a fix to remove the scancode keyboard driver when BIOS console is selected.
The additional printk code gives the following output now: Somehow it doesn't like sector 9.
Thanks for testing that. The printk display seems to show that the probe routine is in fact correct, while somehow the BIOS isn't reading track 0 sector 9 (in the probe routine only?). Very strange. I'm not quite certain that it isn't our probe routine, but I suspect this has something to do with your non-standard every-other-track floppy drive somehow. We can just leave that alone for now, until we get ELKS into a fully operational state on your system.
Thank you!
Hi @ghaerr,
This is a result of having "Scancode keyboard driver" (CONFIG_KEYBOARD_SCANCODE=y) set for BIOS keyboard driver, which won't work. Turn that off using make menuconfig and you should get a compiled system.
I'll try it in a couple of hours.
Meanwhile I have a suspicion that disabling the new options
CONFIG_SYS_DEFSHELL_SASH
CONFIG_SYS_NO_BININIT
does not restore the original sh for me. I think it is still sash, it has the same size and behaves accordingly.
Meanwhile I have a suspicion that disabling the new options CONFIG_SYS_DEFSHELL_SASH does not restore the original sh
You're right - fixed in #1337.
@ghaerr bios keyboard doesn't bring luck for me: It doesn't respond to a single key in English or Russian register
bios keyboard doesn't bring luck for me: It doesn't respond to a single key in English or Russian register
It seems we need to discuss in more detail exactly how your system BIOS and hardware differ from a standard PC. The 360k/720k floppy drive appears to function differently than on a PC in some respects, requiring the forced sector = 9 workaround; I suspect something is also amiss with the keyboard.
The BIOS keyboard driver uses the standard IBM PC INT 16h function AH=0 and AH=1 to read keystrokes, which aren't working. The Direct console uses IRQ 1, which didn't work either. I don't know what to do, without looking further at BIOS source or documentation. You can look at elks/arch/i86/drivers/char/conio-bios.S, it will likely have to be changed to support whatever method your BIOS requires, it seems?
// int conio_poll
// INT 16h AH=00h (read kbd)
// INT 16h AH=01h (get kbd status)
// returns scan code in AH, ASCII char in AL
conio_poll:
        mov     $1,%ah          // get kbd status
        int     $0x16
        jnz     1f              // key pressed
        xor     %ax,%ax
1:      or      %ax,%ax
        jz      9f
        xor     %ah,%ah         // read kbd scan/char
        int     $0x16
9:      ret
Obviously, console output is working, this uses INT 10h function 0x0E.
I see. I'll try to dig available documentation.
P.S. PC-DOS 3.30 is somehow familiar with the keyboard:
Hi @ghaerr,
I have some new data regarding the keyboard problem. A friend of mine wrote me a simple test in asm (see below) to check INT 16h. The program interrogates the keyboard in a loop and prints what was typed in. It stops after pressing q.
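        .text
        .code16
0:      mov     $1,%ah          /* INT 16h AH=01h: get keyboard status */
        int     $0x16
        jnz     1f              /* ZF clear: a key is waiting */
        xor     %ax,%ax
1:      or      %ax,%ax
        jz      0b              /* nothing typed, poll again */
        xor     %ah,%ah         /* INT 16h AH=00h: read the key */
        int     $0x16
        cmp     $'q',%al
        je      2f              /* 'q' stops the test */
        mov     $0x0e,%ah       /* INT 10h AH=0Eh: print the character */
        int     $0x10
        jmp     0b
2:
        cli
        hlt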
The test is written in the floppy disk boot sector. echo_v2.img.zip
At the end of the day, it works on my computer (this is the v1 version of the program, without the stop key q):
I wonder what can go different in ELKS BIOS keyboard driver?
The program interrogates the keyboard in a loop and prints what was typed in.
As you have probably seen, ELKS uses an identical method to poll for and read characters. So, it seems, the BIOS call itself is not the problem.
It just occurred to me that perhaps the problem is your system real time clock (RTC) isn't firing interrupts, or that interrupts in general aren't working. The BIOS keyboard driver uses the RTC to poll the keyboard using the code above, but only polls every 8ms rather than continually. If the RTC isn't working, the polling won't happen.
Can you tell us a bit more about your RTC hardware? It should normally be set up by the BIOS, but ELKS takes over and programs the device itself. The supported RTC is an 8254 chip, addressed at the following ports in elks/include/arch/ports.h:
/* timer, timer-8254.c*/
#define TIMER_CMDS_PORT 0x43 /* command port */
#define TIMER_DATA_PORT 0x40 /* data port */
#define TIMER_IRQ 0 /* can't change*/
The timer code is in elks/arch/i86/kernel/timer-8254.c:
#define TIMER_MODE0 0x30 /* timer 0, binary count, mode 0, lsb/msb */
#define TIMER_MODE2 0x34 /* timer 0, binary count, mode 2, lsb/msb */
#define TIMER_LO_BYTE (__u8)(((5+(11931818L/(HZ)))/10)%256)
#define TIMER_HI_BYTE (__u8)(((5+(11931818L/(HZ)))/10)/256)
void enable_timer_tick(void)
{
    /* set the clock frequency */
    outb (TIMER_MODE2, TIMER_CMDS_PORT);
    outb (TIMER_LO_BYTE, TIMER_DATA_PORT);  /* LSB */
    outb (TIMER_HI_BYTE, TIMER_DATA_PORT);  /* MSB */
}
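For reference, the divisor that the two macros produce can be checked off-target with ordinary C. This is a sketch that reuses the arithmetic shown above; the function names are illustrative, and the tick rate passed in (100 is used in the note below) is an assumption, since the value of HZ is not quoted in this thread.

```c
#include <stdint.h>

/* Same arithmetic as TIMER_LO_BYTE / TIMER_HI_BYTE above. 11931818 is ten
 * times the standard PC timer input clock of about 1.1931818 MHz, so
 * dividing by hz, adding 5 and dividing by 10 rounds the divisor to the
 * nearest integer. */
uint16_t timer_divisor(long hz)
{
    return (uint16_t)((5 + 11931818L / hz) / 10);
}

/* Low and high bytes as they would be written to the data port. */
uint8_t timer_lo_byte(long hz)
{
    return (uint8_t)(timer_divisor(hz) % 256);
}

uint8_t timer_hi_byte(long hz)
{
    return (uint8_t)(timer_divisor(hz) / 256);
}
```

For a 100 Hz tick this gives a divisor of 11932, written as low byte 156 (0x9C) followed by high byte 46 (0x2E). A machine whose timer input clock differs from the standard PC value would need a different constant.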
The keyboard polling code is in elks/arch/i86/drivers/char/kbd-poll.c:
static void kbd_timer(int data)
{
    int dav, extra = 0;
    printk("kbd poll\n");   // <- add this line
    if ((dav = conio_poll())) {
        printk("kbd_poll got %x\n", dav);   // <-- add this line
        if (dav & 0xFF)
            Console_conin(dav & 0x7F);
        else {
            ...
Add the above two lines and see whether "kbd_poll" loops on your keyboard. It should then say "kbd_poll got xxx" when a character is typed.
According to documentation МК-88 (my computer) has КР580ВИ53 chip which is equivalent to 8253 chip. Does it make a difference?
EDIT: Apparently, READ BACK command is missing in 8253, whatever it means.
We were using our own clones called Pravetz 16 or IZOT. They were first reverse engineered and then produced with some improvements.
Apparently, READ BACK command is missing in 8253, whatever it means.
I'll have to check. Could you try running the test with the two lines inserted as described above? That'll tell us more about whether the RTC is the culprit here.
I'll do it in a few hours.
@ghaerr,
Could you try running the test with the two lines inserted as described above?
It gives me something new — computer freezes:
Maybe it didn't respond to the keyboard in the first place because it was stuck. Anyway, now it happens before booting is complete.
PC-98 also uses 8253.
Maybe it didn't respond to the keyboard in the first place because it was stuck. Anyway, now it happens before booting is complete.
Definitely strange. It seems we are getting a timer tick though. Perhaps comment out the first "kbd poll" printk, and leave 2nd one in, to see whether the kernel completes booting. I can't see why the boot would not complete with this in there. I'm wondering if we are having other issues with the amount of usable RAM?
PC-98 also uses 8253.
The PC-98 uses a different clock frequency, which @tyama501 is using to set the countdown register. However, we're not entirely sure that this is the reason for the problem yet. What is the clock frequency of your PC, do you know?
Something else which seems strange... is it just me or does the computer always seem to stop working when the ELKS cursor gets to the bottom line?
Perhaps change the first printk to printk(".");
Also notice in first printk, and also the second-to-last line in screenshot above: the "kbd poll" is missing the first letter: it says "bd poll". The TTY output drops the first letter, twice. This indicates something is quite amiss, it seems. I am beginning to wonder if the BIOS is trashing something or is incompatible with ELKS, for some reason.
Can you tell us more about your system? What is the programmable interrupt controller (PIC)? Is it an 8259? Are there other devices attached to it?
What is the BIOS, do you have a listing?
Perhaps change the first printk to
printk(".");
Did this and it helped a little bit. Now I can boot but only sometimes. The system is very unstable, it sends me bioshd(0) messages seemingly randomly and the floppy drive seems to be very busy.
Here is an example of a "successful" boot, meaning we can see # but pressing keys doesn't provide the expected feedback. In the end it hangs:
Btw, here I switched to minix and completely turned off FAT support which gave me an amazing 166K of free RAM:)
This is another run which didn't reach #:
What are these bioshd(0) messages? I saw them even on my Intel Core 2 Duo running this test build of ELKS:
What is the clock frequency of your PC, do you know?
CPU is 4.77MHz. The analog of i8253 used in this computer has a maximum frequency of 2MHz.
What is the programmable interrupt controller (PIC)? Is it an 8259? Are there other devices attached to it?
I can check it tomorrow.
What is the BIOS, do you have a listing?
What is "listing"? I know that it is 8KB and I have the ROM data.
EDIT:
Something else which seems strange... is it just me or does the computer always seem to stop working when the ELKS cursor gets to the bottom line?
I would say the computer stops working at random times
What are these bioshd(0) messages? I saw them even on my Intel Core 2 Duo running this test build of ELKS:
These are BIOS floppy read retry messages: CHS 12/0/3 count 2 means read of cylinder 12, head 0, sectors 3&4 failed, and ELKS issued a retry. You're probably not holding your finger on the drive hard enough ;)
What is "listing"?
ASM source code.
When did the screen start displaying "kbd_poll got 2267"? This means that a keyboard character was received! Were you typing at the time?
The lower 8 bits of the "got xxxx" message indicate the hex value of the keyboard input received. From the screenshot, I see 79, 67, 68, 6b, 66, 67... this looks like a garbage ASCII sequence. Do the hex values seen mean anything to you, could they be unicode or scan codes?
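The values in the "got xxxx" messages can be split mechanically. The sketch below assumes the convention stated in the conio_poll comments earlier in this thread (scan code in AH, ASCII character in AL); the struct and function names are illustrative.

```c
/* One keystroke as returned by the BIOS keyboard read:
 * scan code in the high byte (AH), ASCII character in the low byte (AL). */
struct key {
    unsigned char scan;
    unsigned char ascii;
};

/* Split the 16-bit value printed by the "kbd_poll got %x" debug line. */
struct key decode_key(unsigned int ax)
{
    struct key k;
    k.scan = (unsigned char)((ax >> 8) & 0xFF);
    k.ascii = (unsigned char)(ax & 0xFF);
    return k;
}
```

For example, decode_key(0x2267) yields scan code 0x22 and ASCII 0x67, which is the letter 'g'.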
When did the screen start displaying "kbd_poll got 2267"? This means that a keyboard character was received! Were you typing at the time?
Yes, I was typing. However, this was my standard desktop on Intel (!!!)
Here is what the screen should look like. Make sure you "make clean; make". Attached is also the config file I'm using. config.small.zip
Yes, I was typing. However, this was my standard desktop on Intel (!!!)
Well, if you were typing garbage like "xyowxy", then that is correct. You can look up the values in an ASCII Chart.
Here is what the screen should look like. Make sure you "make clean; make". Attached is also the config file I'm using.
Yes. This is what it looks like for me as well. The screenshot with "kbd poll got 2267" was taken on Intel with the first version of the test. I included it just because of the bioshd(0) messages. Sorry if I caused confusion.
It seems we're getting multiple issues mixed up here. I suggest testing first on your desktop, and seeing if ELKS runs well or not. We know ELKS works well, so let's see whether your compilation of it works on your desktop (remove printk's discussed above). Your desktop has some sector retry issues with ELKS or the floppy you're using.
Then, you can add back in the printk's, and see what proper desktop looks like. As I mentioned, it is proper to display the hex values I described, depending on what you're typing.
Once both the above work, we can get back to debugging other system.
It seems we're getting multiple issues mixed up here. I suggest testing first on your desktop, and seeing if ELKS runs well or not. We know ELKS works well, so let's see whether your compilation of it works on your desktop (remove printk's discussed above). Your desktop has some sector retry issues with ELKS or the floppy you're using.
This one I checked already. Without printk on keyboard polling desktop works well with this configuration of ELKS. It is just as good as qemu.
Tomorrow I will check that it returns correct hex values.
You're probably not holding your finger on the drive hard enough ;)
Btw, I have got now a new floppy drive. No more fingers required to boot ELKS or DOS :)
ASM source code.
Nope. No ASM source code for МК-88 BIOS
Then, you can add back in the printk's, and see what proper desktop looks like. As I mentioned, it is proper to display the hex values I described, depending on what you're typing.
Here is the Intel desktop reference reaction on typing "qwerty" with my test build of ELKS:
which is identical to the result in qemu:
I wonder why the size of the printk message (printk("."); vs printk("kbd poll\n");) affects the booting behaviour on my 8088 computer. Is some buffer being overfilled?
Can you tell us more about your system? What is the programmable interrupt controller (PIC)? Is it an 8259? Are there other devices attached to it?
Yes, it is an analog of 8259. There are not so many devices in the computer. LPT and Joystick come to mind, no HDD, no COM port, no nothing. I have updated the opening post with all information about the computer I have collected so far.
EDIT: One peculiar thing is that my МК-88 has a controller for cassette recorder which makes it different from IBM PC XT. There was one in IBM PCjr. "BIOS interrupt call 15[h] routines were documented in the technical reference manual that would turn the cassette motor on and off, and read or write data."
Hi @ghaerr and everyone,
I was poking around with some more low-level tests and found one strange thing about my computer. In particular, after seeing random behaviour of ELKS with additional printk() messages we decided to try printing various messages from a test program loaded from the boot sector, thus bypassing any OS.
It appears that the computer (BIOS?) doesn't like the characters \r \n. I can print them only ~20 times and then the computer freezes, seemingly when it has to scroll the screen. Without the 'bad' characters printing goes on forever. I found a potentially similar bug described in IBM PC BIOS versions from 1981. Btw, as I mentioned above, MS-DOS doesn't care about this \r \n problem and works alright.
I did the following change to elks/blob/master/elks/kernel/printk.c#L62-L69
void kputchar(int ch)
{
    if (ch == '\r' || ch == '\n')
        return;
    if (kputc)
        (*kputc)(dev_console, ch);
    else early_putchar(ch);
}
It seems like removal of the \r \n characters helps to restore the stability of ELKS booting. Now it works well, types dots like a champ until it needs to scroll... and it still refuses to read keyboard :)
https://user-images.githubusercontent.com/4971779/176764671-f2964776-a606-40b8-a9d6-cac9c720a59a.mp4
I noticed also that my computer can tolerate the unlucky characters \r \n while working in a graphical mode. Is there a way to boot ELKS in such a mode?
Hello @Vutshi,
It appears then we have two big problems we need to work around for your system - the first is that the PC nearly crashes or becomes unresponsive when having to scroll, and secondly, the ongoing keyboard read problem.
For the first problem, it seems your BIOS has a buggy scroll routine? The ELKS boot block and early kernel setup use the INT 10h AH=0Eh to write to the console. In addition, the BIOS console uses this interrupt for console output.
You will probably have to rewrite this routine to do console output directly yourself (this is the same method that @tyama501 uses for PC-98). This means switching back to using the Direct Console (not BIOS console) for the time being, which will remain problematic because it uses a different method for keyboard I/O. Then, a rewritten console output routine would allow you to bypass using the buggy BIOS routine.
By switching back to Direct Console (CONFIG_CONSOLE_DIRECT=y), this problem may go away. Otherwise, it is possible to rewrite the console output routine and still use the BIOS console, as it may be more suited for fixing the keyboard, when we finally figure that out.
I would advise looking at the PC-98 ASM code for the rewritten console output routine in elks/arch/i86/drivers/char/conio-pc98-asm.S. It is a little complicated explaining exactly how we might get all this working, so I'll defer on that explanation until seeing which way you'd like to go.
This problem did not exist using Direct Console, correct?
Btw, as I mentioned above, MS-DOS doesn't care about this \r \n problem and works alright.
This is likely because MSDOS doesn't use the INT 10h function to display output. ELKS Direct Console doesn't either.
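To make the idea of bypassing the BIOS scroll routine more concrete, here is a minimal sketch of direct text output with software scrolling. It writes into an ordinary array that stands in for text-mode video memory, so it can be tried on any host. The 80x25 layout, the attribute value and all names are assumptions for illustration, and it is not the ELKS Direct Console or PC-98 code. Where the real video memory of the МК-88 lives is not established in this thread.

```c
#include <stdint.h>
#include <string.h>

#define COLS 80
#define ROWS 25
#define ATTR 0x07   /* assumed attribute: light grey on black */

/* Simulated text-mode video memory: one 16-bit cell per character,
 * attribute in the high byte, character code in the low byte. */
static uint16_t vram[ROWS * COLS];
static int cur_row, cur_col;

/* Move rows 1..ROWS-1 up by one row and blank the last row. */
static void scroll_up(void)
{
    memmove(vram, vram + COLS, (ROWS - 1) * COLS * sizeof(vram[0]));
    for (int c = 0; c < COLS; c++)
        vram[(ROWS - 1) * COLS + c] = (uint16_t)((ATTR << 8) | ' ');
}

/* Write one character, handling \r, \n, line wrap and scrolling
 * entirely in software, without any BIOS call. */
void con_putc(char ch)
{
    if (ch == '\r') {
        cur_col = 0;
        return;
    }
    if (ch != '\n') {
        vram[cur_row * COLS + cur_col] =
            (uint16_t)((ATTR << 8) | (unsigned char)ch);
        if (++cur_col < COLS)
            return;
        cur_col = 0;            /* wrapped past the last column */
    }
    if (++cur_row == ROWS) {    /* line feed or wrap past the last row */
        scroll_up();
        cur_row = ROWS - 1;
    }
}

/* Accessors so the simulated screen can be inspected. */
uint16_t con_cell(int row, int col) { return vram[row * COLS + col]; }
int con_row(void) { return cur_row; }
```

Printing more than 25 lines with this routine keeps the cursor on the last row and shifts older lines up. A port to real hardware would replace the array with the machine's actual video memory and would also need to move the hardware cursor.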
Thank you!