Forth at the Vintage Computer Festival Europe April 29th and May 1st 2017 in Munich/Germany
The 18th annual European Vintage Computer Festival will take place on April 29th and May 1st 2017.
http://vcfe.org
I will be presenting the "Amitari" (an Amiga 600 running EmuTOS, the open source version of the Atari ST operating system), running some Forth systems (VolksForth; I will also try lbforth, oneForth, and FusionForth).
I also plan to test SoloForth with the help of a ZX Spectrum user.
Also, as each year, the great ultimate Forth Benchmark will continue.
Location: Kulturzentrum Trudering Wasserburger Landstraße 32 81825 Munich Bavaria
If you are in Bavaria next weekend, don't miss this show.
See you there
Carsten
There is one thing missing from lbForth to run it on a real TOS: relocatable executables. Due to limitations in my metacompiler, the executable must run from a fixed address.
I have only tested it in simulation using TOSEMU. TOSEMU always runs the binaries from address 0x900.
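For readers unfamiliar with the issue: a relocatable executable carries a list of the places that hold absolute addresses, and the loader patches them for the actual load address. Here is a minimal Python sketch of that idea. It assumes 32-bit big-endian addresses (as on the 68000) and a plain list of fixup offsets; the `relocate` helper and the tiny image are hypothetical, and this is not the real TOS executable format or anything taken from lbForth.

```python
import struct


def relocate(image: bytes, fixups: list[int], old_base: int, new_base: int) -> bytes:
    """Return a copy of image with every 32-bit big-endian address found at
    the given offsets moved from old_base to new_base."""
    out = bytearray(image)
    delta = new_base - old_base
    for off in fixups:
        (addr,) = struct.unpack_from(">I", out, off)
        struct.pack_into(">I", out, off, (addr + delta) & 0xFFFFFFFF)
    return bytes(out)


# Hypothetical 8-byte image linked for address 0x900:
# one absolute pointer to the data that follows it, then the data itself.
image = struct.pack(">I", 0x900 + 4) + b"DATA"

# Pretend the loader placed the program at 0x20000 instead (made-up address).
moved = relocate(image, fixups=[0], old_base=0x900, new_base=0x20000)
print(hex(struct.unpack_from(">I", moved, 0)[0]))
```

Without such a fixup list, the pointer stays at 0x904 wherever the program is loaded, which is why a fixed-address binary only works when it really runs from that address.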
Hi Lars, thanks for letting me know. I will try (but not benchmark) lbforth in TOSEMU. I need to get used to TOSEMU anyway, I want VolksForth to run there as well. VolksForth still has some direct hardware access issues that I need to find and remove.
En/Je/On 2017-04-25 01:37, Carsten Strotmann escribió / skribis / wrote :
I also plan to test SoloForth with the help of a ZX Spectrum user.
Thank you for your interest, Carsten.
Solo Forth 0.14.0 is under very active development. I think it could be ready in one month. It includes many improvements. Besides, it is the first version that includes a manual and a glossary, in HTML.
The manual includes a clear guide on how to use the disk images to run the system on a ZX Spectrum emulator, so even someone not acquainted with the platform could run it.
I publish the updated disk images only with version releases, and the Makefile that builds them has many requirements, so I think it will be easier for you if I upload to GitHub the disk images for G+DOS and TR-DOS (+3DOS support has a recent bug at the moment), plus the HTML manual, built from the latest sources, as an exceptional pre-release. Ok?
-- Marcos Cruz http://programandala.net
Sounds nice~!
In case you are looking for Forths to benchmark, I would love to see https://github.com/jkotlinski/durexforth in the list :)
Hello Marcos,
On 25.04.17 21:38 PM, Marcos Cruz wrote:
En/Je/On 2017-04-25 01:37, Carsten Strotmann escribió / skribis / wrote :
I also plan to test SoloForth with the help of a ZX Spectrum user.
Thank you for your interest, Carsten.
Solo Forth 0.14.0 is under very active development. I think it could be ready in one month. It includes many improvements. Besides, it is the first version that includes a manual and a glossary, in HTML.
OK, so I'll postpone my tests for now.
I will not have access to a Speccy until next year's VCFe.
The manual includes a clear guide on how to use the disk images to run the system on a ZX Spectrum emulator, so even someone not acquainted with the platform could run it.
I publish the updated disk images only with version releases, and the Makefile that builds them has many requirements, so I think it will be easier for you if I upload to GitHub the disk images for G+DOS and TR-DOS (+3DOS support has a recent bug at the moment), plus the HTML manual, built from the latest sources, as an exceptional pre-release. Ok?
OK! I'll wait for the disk images of the new release.
Thanks for your help and for Solo Forth.
Carsten
Hello Johan,
On 26.04.17 19:56 PM, Johan Kotlinski wrote:
In case you are looking for Forths to benchmark, I would love to see https://github.com/jkotlinski/durexforth in the list :)
Yes, I have DurexForth on my watchlist. I will try to find a C64 at VCFe and do some benchmarking (or better, motivate someone to do some benchmarking) :)
Thanks for reminding me about DurexForth.
Carsten
En/Je/On 2017-04-27 00:25, Carsten Strotmann escribió / skribis / wrote :
I will not have access to a Speccy until next year's VCFe.
I'm not sure if you want to try Solo Forth on the real machine.
If so, you need a ZX Spectrum 128/+2 (not +2A) with either a Plus D interface or a Beta 128 interface... ZX Spectrum +3, which is easier to find than those disk interfaces, cannot be used at the moment because of a recent bug in the +3DOS support.
Besides, you have to convert the disk image files to real disks... The Plus D and Beta 128 interfaces can use either 3.5" or 5.25" floppy disks, so the task is feasible. But the ZX Spectrum +3 used 3" floppy disks, so you can forget about it.
Things are much easier with an emulator!
OK! I'll wait for the disk images of the new release.
I've just released 0.14.0-pre.209 for this occasion, with all disk images and an updated and improved manual. The manual includes a new section on how to run the tests and benchmarks included with the system. Hope this helps.
https://github.com/programandala-net/solo-forth
Any feedback will be appreciated.
I'll keep on working on the final 0.14.0.
-- Marcos Cruz http://programandala.net
Hello Marcos,
On 27.04.17 20:50 PM, Marcos Cruz wrote:
En/Je/On 2017-04-27 00:25, Carsten Strotmann escribió / skribis / wrote :
I will not have access to a Speccy until next year's VCFe.
I'm not sure if you want to try Solo Forth on the real machine.
Well, that is the whole point of the event: getting the old machines out and not only displaying them, but working on them. And my job is to motivate people to do some work. And because Forth runs on almost any machine, we do it with Forth. So far I did not have a good Forth system for the Speccy in my toolbox, which is why I wanted to try out Solo Forth. But running on the old metal is a must-have. I can prepare on an emulator, but at the VCFe event, it must run on the "real thing"(TM) :)
If so, you need a ZX Spectrum 128/+2 (not +2A) with either a Plus D interface or a Beta 128 interface... ZX Spectrum +3, which is easier to find than those disk interfaces, cannot be used at the moment because of a recent bug in the +3DOS support.
Besides, you have to convert the disk image files to real disks... The Plus D and Beta 128 interfaces can use either 3.5" or 5.25" floppy disks, so the task is feasible. But the ZX Spectrum +3 used 3" floppy disks, so you can forget about it.
I'm pretty sure there will be people with 3" floppy disks at the show, and with the means to get data in and out of these :)
Things are much easier with an emulator!
But boring.
OK! I'll wait for the disk images of the new release.
I've just released 0.14.0-pre.209 for this occasion, with all disk images and an updated and improved manual. The manual includes a new section on how to run the tests and benchmarks included with the system. Hope this helps.
https://github.com/programandala-net/solo-forth
Any feedback will be appreciated.
I'll give it a try and let you know! Many thanks
Carsten
I may have my Forth ready to run in CP/M-80 in a few weeks.
On 28.04.17 10:03 PM, Lars Brinkhoff wrote:
I may have my Forth ready to run in CP/M-80 in a few weeks.
That's very cool. I'll give it a try once it is ready.
-- CS
If anyone can pick up an Ensoniq Mirage in time for VCF I can talk you through running Forth on that.
On 28.04.17 11:01 PM, Gordon JC Pearce wrote:
If anyone can pick up an Ensoniq Mirage in time for VCF I can talk you through running Forth on that.
I'm interested. Pickup where? I'm driving today.
-- CS
I haven't got one to pick up, what I mean is if you find someone near you with one :-)
Carsten & List,
Back in 1984 I bought a ROM of David Husband's ZX-Forth for my ZX81 - published as Skywave FORTH
http://www.worldofspectrum.org/infoseekid.cgi?id=1000370
The source is here from contributor "Moggy"
http://www.sinclairzxworld.com/viewtopic.php?t=459
As an electronic engineering student, it was immediately copied and distributed amongst friends.
The source code and hex are now out on the web, and it could possibly be adapted for the Spectrum or any other Z80 machine, such as Spencer Owen's increasingly popular Z80 retro computer, the RC2014
https://www.tindie.com/products/Semachthemonkey/rc2014-mini-single-board-z80-computer-kit/?gclid=Cj0KEQjw0IvIBRDF0Yzq4qGE4IwBEiQATMQlMTbiz5WKsxtL9gD3dK6kn-nOjRLHzhiwBzFJ0b3eMdoaAmXi8P8HAQ
With the introduction of the Zilog eZ80 series of processors - we have a Z80 core, pipelined to do 1 instruction per clock at 50MHz - that's 50 times the speed of the old 4MHz Z80A. It has modern integrated peripherals including SPI ethernet and TCP-IP stack and a 24 bit addressing range.
I'm currently laying out an eZ80 board to be compatible with the RC2014.
regards
Ken
En/Je/On 2017-04-28 00:42, Carsten Strotmann escribió / skribis / wrote :
Well, that is the whole point of the event: getting the old machines out and not only displaying them, but working on them. And my job is to motivate people to do some work. And because Forth runs on almost any machine, we do it with Forth.
Good idea!
So far I did not have a good Forth system for the Speccy in my toolbox, which is why I wanted to try out Solo Forth.
That's exactly the reason I started developing Solo Forth: all the other ZX Spectrum Forths are old, slow, and very limited fig-Forth implementations.
The only remarkable exception is Lennart Benschop's Spectrum Forth-83, written c. 1988. It's very good. I think it deserves being included in the list.
When I discovered it, I wrote an article about it (in Spanish):
http://programandala.net/es.texto.2010.02.03.zx_spectrum_forth-83.html
It can run on ZX Spectrum 48 or 128, using tape or microdrives. It uses the banked memory of the 128 as RAM disk for Forth blocks.
At the end of the page, at the "Descargas" (=Downloads) section, you can download the system (both the original archive and my own distribution with additional files) and the manual, recently translated from Dutch to English by Lennart Benschop.
But anyway I wanted a modern and powerful Forth, useful for (cross-)development of complex projects for ZX Spectrum 128.
I'm pretty sure there will be people with 3" floppy disks on the show, and with means to get data in and out from these :)
Great! Unfortunately, the +3DOS version of Solo Forth has a bug at the moment, and the library disks (disks 1, 2 and 3, which contain the Forth blocks on the sectors) can not be used. You can boot the system from disk 0, but that's all: you can try only the words included in the kernel... Not very useful. This will be fixed in the final 0.14.0.
-- Marcos Cruz http://programandala.net
I can contribute some benchmarks taken from DurexForth v1.6.1, running on C64:
Integer Calc    37s
Fib2            1m57s
Nest 1million   17s
Sieve/Prime     10s
GCD1            60s
GCD2            70s
With reservation for measurement errors, it seems to be comfortably faster than other C64 Forths on the list, which should not be a surprise given that it is subroutine threaded. Would be interesting to also have measurements from Blazin' Forth, which is very good and fast.
It does surprise me somewhat that the Z80 Forths are so much faster, but I can see that inbuilt 16-bit arithmetics and extra registers must help a lot.
There is a misconception that subroutine threading is generally faster than other threading techniques. The truth is that it depends on the instruction set architecture and whether or not in-line expansion of primitives is used.
Subroutine threading by itself can be slower than direct threading on some processors. Consider a processor that has a "JMP @IP+" instruction. On that processor, that instruction is "NEXT" and can be placed at the end of each code word. Thread-walking thus consists of one instruction that results in one memory access - reading the location to which the IP register points.
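Conversely, with a subroutine-threaded Forth, each word reference is a "JSR" instruction and each code word ends with "RTS". The "JSR" instruction is likely to be longer than the single thread word of a direct threaded Forth, because it needs space for both the JSR opcode and the reference, so it already causes more memory traffic. Then, when the JSR is executed, it pushes the return address on the stack, causing an additional memory access, and when the RTS is executed, there is yet another memory access to pop the return address from the stack.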
However, subroutine threading does open the door to peephole optimization, wherein the bodies of code words can be expanded in-line, avoiding that JSR/push/pop overhead. If you do that, the system can be faster than direct threading. The tradeoff, though, is that such optimization makes decompilation and debugging very difficult. In my Forth work over the years, the convenience of having a great debugger has been more important than the speed advantages of deep optimization. In the few instances where I needed more speed, I have always been able to optimize a very small number of critical code paths by hand.
Mitch Bradley Author of Open Firmware, Sun Forth, CForth, Forthmacs, SPARCMon, ...
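A rough way to see the memory-traffic argument is to count bytes in a toy model. The Python sketch below is only that: it assumes a 2-byte cell and a 1-byte JSR opcode (plausible for an 8-bit CPU, but chosen here for illustration), and it ignores the fetch of the single closing instruction of each code word ("JMP @IP+" or "RTS"), since both schemes have exactly one. The `Machine` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    cell_bytes: int = 2    # assumed size of an address / thread cell
    opcode_bytes: int = 1  # assumed size of the JSR opcode


def direct_threaded_traffic(m: Machine, calls: int) -> int:
    """Bytes moved when each word reference is one thread cell read by NEXT."""
    return calls * m.cell_bytes


def subroutine_threaded_traffic(m: Machine, calls: int) -> int:
    """Bytes moved when each word reference is a JSR to a word ending in RTS."""
    jsr_fetch = m.opcode_bytes + m.cell_bytes  # JSR opcode plus its target
    push_return = m.cell_bytes                 # return address written by JSR
    pop_return = m.cell_bytes                  # return address read back by RTS
    return calls * (jsr_fetch + push_return + pop_return)


m = Machine()
dtc = direct_threaded_traffic(m, 1000)
stc = subroutine_threaded_traffic(m, 1000)
print(f"direct threaded: {dtc} bytes, subroutine threaded: {stc} bytes")
```

The model says nothing about cycle counts or about CPUs that lack a cheap "JMP @IP+", which is exactly why the answer depends on the instruction set.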
Hi Mitch, I doubly agree :) The advantage of STC is specifically for the 6510 which is pretty inefficient for other threading models. I also agree it can be nicer with a simple Forth that is easy to debug, rather than a hyperoptimized one.
On Thu, 4 May 2017 at 21:11, Mitch Bradley [email protected] wrote:
There is a misconception that subroutine threading is generally faster than other threading techniques. The truth is that it depends on the instruction set architecture and whether or not in-line expansion of primitives is used.
Subroutine threading by itself can be slower than direct threading on some processors. Consider a processor that has a "JMP @IP+" instruction. On that processor, that instruction is "NEXT" and can be placed at the end of each code word. Thread-walking thus consists of one instruction that results in one memory access - reading the location to which the IP register points.
Conversely, with a subroutine-threaded Forth, each word reference is a "JSR" instruction and each code word ends with "RTS". The "JSR" instruction is likely to be longer than the single thread word of a direct threaded Forth, because it needs space for both the JSR opcode and the reference, so it already causes more memory traffic. Then, when the JSR is executed, it pushes the return address on the stack, causing an additional memory access, and when the RTS is executed, there is yet another memory access to pop the return address from the stack.
However, subroutine threading does open the door to peephole optimization, wherein the bodies of code words can be expanded in-line, avoiding that JSR/push/pop overhead. If you do that, the system can be faster than direct threading. The tradeoff, though, is that such optimization makes decompilation and debugging very difficult. In my Forth work over the years, the convenience of having a great debugger has been more important than the speed advantages of deep optimization. In the few instances where I needed more speed, I have always been able to optimize a very small number of critical code paths by hand.
Mitch Bradley Author of Open Firmware, Sun Forth, CForth, Forthmacs, SPARCMon, ...
--
http://www.littlesounddj.com
Hi Johan,
thanks for the DurexForth Benchmarks.
I've added them to the benchmark page https://theultimatebenchmark.org/#sec-6
Impressive results.
I have Blazin Forth somewhere, I need to look.
Subroutine threading by itself can be slower than direct threading on some processors.
I'm assuming that on modern high-end processors, subroutine threading with paired JSR/RTS instructions will be much faster than hard-to-predict JMP @IP+ instructions.
It would be interesting to compare the effect of different threading methods on different processors, in particular if you could use one single Forth implementation and just reconfigure the threading method. You would also need access to lots of hardware, or cycle-accurate simulators.
Hi,
On 05.05.17 08:13 PM, Lars Brinkhoff wrote:
Subroutine threading by itself can be slower than direct threading on some processors.
I'm /assuming/ that on modern high-end processors, subroutine threading with paired JSR/RTS instructions will be much faster than hard-to-predict JSR @IP+ instructions.
It would be interesting to compare the effect of different threading methods on different processors. In particular if you could use one single Forth implementation and just reconfigure the threading method. You would also have to have access to lots of hardware, or cycle-accurate simulators.
As far as I know, Gforth (GNU Forth) does just that: it compiles different threading implementations, runs a benchmark, and uses the fastest.
Also, have a look at some research from Anton Ertl:
The Structure and Performance of efficient Interpreters https://www.jilp.org/vol5/v5paper12.pdf
Threaded Code Variations and Optimizations http://www.complang.tuwien.ac.at/anton/euroforth/ef01/ertl01.pdf
A Look at Gforth Performance http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.563.1134&rep=rep1&type=pdf
-- Carsten
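The "build several dispatch variants, benchmark them, keep the fastest" idea can be sketched in a few lines of Python. This is an illustration of the general approach only, not Gforth's build process: the two engines (`run_switch`, `run_table`), the tiny program and `pick_fastest` are all made up for the example, and Python dispatch costs say nothing about real CPUs.

```python
import timeit

# A tiny stack program: push 2, push 3, add, duplicate, multiply.
PROGRAM = [("lit", 2), ("lit", 3), ("add", None), ("dup", None), ("mul", None)]


def run_switch(program):
    """Engine 1: dispatch with an if/elif chain on the opcode name."""
    stack = []
    for op, arg in program:
        if op == "lit":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "mul":
            stack.append(stack.pop() * stack.pop())
    return stack


# Engine 2: dispatch through a table of handler functions.
TABLE = {
    "lit": lambda stack, arg: stack.append(arg),
    "add": lambda stack, arg: stack.append(stack.pop() + stack.pop()),
    "dup": lambda stack, arg: stack.append(stack[-1]),
    "mul": lambda stack, arg: stack.append(stack.pop() * stack.pop()),
}


def run_table(program):
    stack = []
    for op, arg in program:
        TABLE[op](stack, arg)
    return stack


ENGINES = {"switch": run_switch, "table": run_table}


def pick_fastest(program, number=2000):
    """Time every engine on the same program and return the quickest name."""
    timings = {
        name: timeit.timeit(lambda engine=engine: engine(program), number=number)
        for name, engine in ENGINES.items()
    }
    return min(timings, key=timings.get), timings


best, timings = pick_fastest(PROGRAM)
print("fastest engine on this machine:", best)
```

Which engine wins depends on the machine and interpreter version, which is the whole point of measuring instead of guessing.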
I could not keep myself from benchmarking Blazin' Forth for C64:
Integer         36s
Fib2            3m15s
Nest 1million   5m1s
Sieve           20s
Gcd2            2m27s
Good job making DurexForth fast!
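To put the two C64 result sets side by side, the times can be converted to seconds and divided. The Python sketch below uses only the figures posted in this thread; it assumes that "Integer" and "Integer Calc", "Sieve" and "Sieve/Prime", and "Gcd2" and "GCD2" name the same benchmarks, and the `seconds` helper is hypothetical.

```python
import re


def seconds(text: str) -> int:
    """Convert strings such as '1m57s' or '36s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    return int(match.group(1) or 0) * 60 + int(match.group(2))


# Times as posted in the thread (C64); benchmark names normalised by assumption.
durexforth = {"Integer": "37s", "Fib2": "1m57s", "Nest 1million": "17s",
              "Sieve": "10s", "GCD2": "70s"}
blazin = {"Integer": "36s", "Fib2": "3m15s", "Nest 1million": "5m1s",
          "Sieve": "20s", "GCD2": "2m27s"}

ratios = {name: seconds(blazin[name]) / seconds(durexforth[name])
          for name in durexforth}

for name, ratio in ratios.items():
    print(f"{name:14} Blazin' / DurexForth = {ratio:.2f}")
```

A ratio above 1 means DurexForth finished sooner; the measurement caveats mentioned earlier in the thread apply to these numbers too.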
Thank you Lars! I hope it somehow makes some useful addition to the wealth of Forth implementations already out there. Or maybe useful is the wrong word :)
En/Je/On 2017-05-04 11:12, Johan Kotlinski escribió / skribis / wrote :
I can contribute some benchmarks taken from DurexForth v1.6.1, running on C64:
Integer Calc    37s
Fib2            1m57s
Nest 1million   17s
Sieve/Prime     10s
GCD1            60s
GCD2            70s
I adapted and included the benchmarks into the library of Solo Forth (https://github.com/programandala-net/solo-forth/blob/master/src/lib/meta.benchmark.ultimate.fs).
Results taken from Solo Forth 0.14.0-pre.217, running on ZX Spectrum 128 (emulated by Fuse 1.3.5):
Integer Calc       25.90 s
Fib1               N/A
Fib2               88.24 s
Nest 1million      163.64 s
Nest 32million     5364.64 s
MemMove            10.44 s
CountBits          177.64 s (1)
Sieve/Prime        8.40 s
GCD1               43.32 s
GCD2               55.22 s
6502emu            1887.68 s (2)
Notes/Doubts:
(1) It seems "8192 do" is a typo. I used "8192 0 do".
(2) I changed "&6502" to "#6502", i.e. decimal 6502. But maybe it's octal.
Can someone confirm those issues?
Besides, the difference between DurexForth (STC) and Solo Forth (DTC) in the Nest 1million is surprising.
It does surprise me somewhat that the Z80 Forths are so much faster, but I can see that inbuilt 16-bit arithmetics and extra registers must help a lot for Forth.
Indeed. A double set of registers, most of them usable as 8-bit or 16-bit, is very useful. I keep the address of the `next` entry point in the IX register. This makes the nesting faster. AFAIK, keeping TOS in a Z80 register speeds things up too, but that code change is not trivial and I haven't tried it yet.
-- Marcos Cruz http://programandala.net
On 5/8/2017 8:20 AM, Marcos Cruz wrote:
En/Je/On 2017-05-04 11:12, Johan Kotlinski escribió / skribis / wrote :
Indeed. A double set of registers, most of them usable as 8-bit or 16-bit, is very useful. I keep the address of the `next` entry point in the IX register. This makes the nesting faster. AFAIK, keeping TOS in a Z80 register speeds things up too, but that code change is not trivial and I haven't tried it yet.
I don't remember for certain because it has been so long, but I seem to recall that putting TOS in a register on Z80 resulted in a 10% speedup on an ITC Forth, probably a variant of F83. After that you have pretty much used up the registers. I think I tried using the shadow registers for e.g. UP, but my memory is vague.
Mitch said: There is a misconception that subroutine threading is generally faster than other threading techniques. The truth is that it depends on the instruction set architecture and whether or not in-line expansion of primitives is used.
It's true that subroutine threading may be slower than other techniques for the threading only. However, that ignores the effect of nesting, i.e. colon definitions. Once you include the nest/unnest overhead and try to keep words small, the picture changes drastically. In our (MPE) tests over 20 years ago on several architectures, mostly 32 bit, straight STC code averaged 2.2 times faster than DTC code.
Simple inlining and peepholing will get you another factor of two. Heavy duty code generation gets you another factor of two or so. Using the MPE benchmark code showed that VFX is about ten times faster than threaded code. It was certainly 12 times faster than Win32Forth when we last tested it.
The code size expansion expected by many just does not happen. Yes, calls are bigger, but phrases such as LIT + @ (which occurs a lot) reduce from three tokens to one instruction on most competent CPUs. Our reference was a 68000 embedded application converted from DTC to NCC (Native Code Compiled). The 256kb application was 1% smaller under NCC.
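The LIT + @ case can be illustrated with a toy peephole pass. The Python sketch below is an assumption-laden illustration, not MPE's or VFX's code generator: tokens are plain Python values, literals are `("LIT", n)` tuples, and `("FETCH+", n)` is a made-up stand-in for a single "load from base plus offset" machine instruction.

```python
def fuse_lit_add_fetch(tokens):
    """Replace every  LIT n  +  @  run with one ("FETCH+", n) pseudo-instruction."""
    out = []
    i = 0
    while i < len(tokens):
        window = tokens[i:i + 3]
        if (len(window) == 3
                and isinstance(window[0], tuple) and window[0][0] == "LIT"
                and window[1] == "+" and window[2] == "@"):
            out.append(("FETCH+", window[0][1]))  # three tokens become one
            i += 3
        else:
            out.append(tokens[i])                 # anything else is copied as-is
            i += 1
    return out


source = ["DUP", ("LIT", 8), "+", "@", "SWAP"]
fused = fuse_lit_add_fetch(source)
print(len(source), "tokens ->", len(fused), "tokens:", fused)
```

Each fused phrase removes two tokens, which is one way the larger call instructions can be paid for without the image growing.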
Compilation is always surprising. You have to measure everything.
Yes, a code generator can be a complex piece of code. Yes, you have to use different techniques to implement a traditional Forth debugger. The question is whether a performance gain of 10:1 or more is worth having. For us, being able to write a full TCP/IP stack or USB stack in high-level Forth makes it worthwhile. I haven't written an ARM interrupt handler in assembler for over a decade, except to prove that I can still do it.
@spelc, how well does the VFX debugger work? I have never had the luxury of a proper step debugger in Forth before. I wonder if it could help improve my productivity.
Roger: That's a question that gets discussed here and at EuroForth every year or two. The VFX for Windows debugger is a 15+ year old monster that was designed to compete with Visual C of the period, not least in the amount of screen space that you can waste with it. Because of the optimisation level in VFX, there are occasions when the source level (bouncing ball) display goes backwards. The debugger is non-invasive and I have used it a few times for heavy-duty debugging when I have completely lost the plot.
Our users hold on to it tightly for a few weeks until they have learned to debug in Forth. Then they put it aside.
If I had my time again, I would write an old-style invasive debugger that inserts a word DEBUG-STEP between all words. This has several advantages:
- The code generator is almost totally bypassed,
- The PC never goes backwards,
- It's relatively easy to extract information about the word from it,
- If DEFERred it's easy for the user to extend.
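The invasive scheme can be mocked up in Python to show its shape. Everything here is a sketch under stated assumptions: only the name DEBUG-STEP comes from the post, while the `Debugger` class, `compile_with_steps`, `run` and the three primitives are hypothetical. The hook is placed before every word (so it also fires before the first one), and a replaceable attribute stands in for a DEFERred word.

```python
class Debugger:
    def __init__(self):
        # Stand-in for a DEFERred word: users may assign their own hook here.
        self.debug_step = self.default_step
        self.trace = []

    def default_step(self, next_word, stack):
        """Record which word is about to run and a copy of the data stack."""
        self.trace.append((next_word, list(stack)))


def compile_with_steps(words):
    """Compile a definition with a DEBUG-STEP marker in front of each word."""
    code = []
    for word in words:
        code.append(("DEBUG-STEP", word))  # the marker knows the word it precedes
        code.append(word)
    return code


# Three made-up primitives acting on a Python list used as the data stack.
PRIMITIVES = {
    "DUP": lambda s: s.append(s[-1]),
    "*": lambda s: s.append(s.pop() * s.pop()),
    "1+": lambda s: s.append(s.pop() + 1),
}


def run(code, stack, debugger):
    for item in code:
        if isinstance(item, tuple) and item[0] == "DEBUG-STEP":
            debugger.debug_step(item[1], stack)
        else:
            PRIMITIVES[item](stack)
    return stack


dbg = Debugger()
code = compile_with_steps(["DUP", "*", "1+"])
result = run(code, [3], dbg)
for word, snapshot in dbg.trace:
    print(f"about to run {word:4} with stack {snapshot}")
```

Because the steps are ordinary entries in the compiled sequence, execution only ever moves forward through them, and swapping `dbg.debug_step` for another callable changes the debugger's behaviour without recompiling.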