forth2012-test-suite blocktest.fth: TUF2-1 Needs to use BUFFER instead of BLOCK

Feb 27 '21 13:02 frenchie68

Hi frenchie68, My quick response is that I think you are correct. This would also make TUF2-1 consistent with the following TUF2-2 test. However, before committing to this change, I want to set up a test system on my machine and try it out. Thanks, Steve

Feb 27 '21 15:02 steverpalmer

Hmm. Now I've changed my mind and think that either BLOCK or BUFFER will work...

When I wrote my comment above, I was thinking that BLOCK always read the data from the mass-storage device - in which case the test should fail with BLOCK and pass with BUFFER. However, rerunning the tests with both BLOCK and BUFFER gives me exactly the same result - both pass. So I went back to the standard I am working from (Forth 2012 RC3 28th September 2014). The definition of BLOCK (7.6.1.0800) states:

If block u is already in a block buffer, a-addr is the address of that block buffer.

It goes on to indicate that the block is only transferred from the mass-storage device if block u is not already in memory. The definition of BUFFER (7.6.1.0820) has the same statement as BLOCK in the case when block u is already in the block buffer.

Test Description related to TUF2-1:

1. Randomly picks 2 distinct block numbers in the test range;
2. Fills both blocks with random data, and writes this out to the mass storage device;
3. Regenerates both blocks with more random data (but no UPDATE);
4. (TUF2-1) UPDATEs only the first of the two blocks;
5. FLUSHes any updated blocks to the mass-storage device;
6. Rereads both blocks from the mass storage device;
7. Finally checks the block hashes to ensure that the first block contains the regenerated random data while the second block contains the initial random data (so not updated).

Since both blocks are in the block buffer at step 4, BLOCK and BUFFER should do exactly the same thing. Therefore, I disagree that "TUF2-1 Needs to use BUFFER instead of BLOCK". (Sorry)
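As a rough illustration of the seven steps above, the sequence can be condensed into a single word. This is only a sketch and not the code in blocktest.fth: FILL-BLOCK and UPDATE-FIRST-ONLY are made-up names, fixed fill characters stand in for the suite's random data and hashes, and it assumes at least two block buffers plus two scratch blocks that may be overwritten.

```forth
\ Sketch only: hypothetical names, fixed fill characters instead of random data.

\ Assign a buffer to blk (no read required) and fill it with char.
: FILL-BLOCK ( blk char -- )  SWAP BUFFER 1024 ROT FILL ;

: UPDATE-FIRST-ONLY ( blk1 blk2 -- flag )
   OVER [CHAR] A FILL-BLOCK UPDATE      \ step 2: initial data for blk1 ...
   DUP  [CHAR] A FILL-BLOCK UPDATE      \         ... and for blk2
   FLUSH                                \         write both to mass storage
   OVER [CHAR] B FILL-BLOCK             \ step 3: regenerate blk1, no UPDATE
   DUP  [CHAR] B FILL-BLOCK             \         regenerate blk2, no UPDATE
   OVER BLOCK DROP UPDATE               \ step 4: flag only blk1 as modified
   FLUSH                                \ step 5: write updated buffers
   BLOCK C@ [CHAR] A =                  \ steps 6-7: blk2 still has initial data?
   SWAP BLOCK C@ [CHAR] B = AND ;       \            blk1 has regenerated data?
```

With two or more buffers, `20 21 UPDATE-FIRST-ONLY` should leave TRUE on a system where BLOCK does not re-read a block that is already assigned, and FALSE on one where the BLOCK in step 4 overwrites the regenerated data.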

Unfortunately, I am not working with the final version of the Forth 2012 standard, so this definition could have changed and I could be wrong. Please let me know if this is the case.

All that being said, there may be a reason why BUFFER might be preferred to BLOCK, but I don't know why. Despite what I said in my first response, I think it is better to test BLOCK in TUF2-1 and BUFFER in TUF2-2.

If you are still having problems with this test, please let me know.

Cheers, Steve

Feb 28 '21 10:02 steverpalmer

Thanks for handling this issue, Steve. It's a long time since we've communicated; I hope all is well with you.

Regarding the specification of BLOCK and Forth 2012. The Forth 2012 standard hasn't changed and is the same as the original ANS spec. I don't think it has changed in subsequent Forth200X changes. Eventually Forth2012 will be superseded by Forth 20YY where YY is a future year (probably 99 at the rate they are going!).

I'll leave you to close the issue when you're happy it is resolved.

Gerry

Feb 28 '21 15:02 gerryjackson

Hi Gerry, Thanks for that extra information about the standards. To progress this issue I will need some further information from frenchie68 (or anyone) to indicate either why "TUF2-1 Needs to use BUFFER instead of BLOCK" or to explain why I am wrong (entirely possible), or maybe even for frenchie68 to accept that there is no need to change. I propose to leave the issue open for about a month for further feedback, after which I will close it on the basis of "no change required". Steve

Mar 01 '21 10:03 steverpalmer

BLOCK and BUFFER are not straight equivalent beasts, despite what some implementations would make you believe. BLOCK will force a read from mass storage, whereas BUFFER will not.

Mar 01 '21 19:03 frenchie68

Hi Francois,

Thank you for your quick response.

Before any disagreements, there are many things that (I believe) we do agree on:

If BUFFER replaces BLOCK in TUF2-1, the test should pass;
BLOCK and BUFFER are not equivalent;
If, as you say, "BLOCK will force a read from mass storage", then BLOCK in TUF2-1 is wrong and needs to be replaced by BUFFER;
The tests should not be based on any specific implementation, but on the FORTH 2012 language standard.

The first thing then is to make sure we are working from the same definition from the language standard for BLOCK.

I quote here from Forth 2012 RC3 28th September 2014, but believe that the formal standard has not changed.

7.6.1.0800 BLOCK ( u -- a-addr )
[1.1] a-addr is the address of the first character of the block buffer assigned to mass-storage block u. An ambiguous condition exists if u is not an available block number.
[1.2] If block u is already in a block buffer, a-addr is the address of that block buffer.
[1.3] If block u is not already in memory and there is an unassigned block buffer, transfer block u from mass storage to an unassigned block buffer. a-addr is the address of that block buffer.
[1.4] If block u is not already in memory and there are no unassigned block buffers, unassign a block buffer. If the block in that buffer has been UPDATEd, transfer the block to mass storage and transfer block u from mass storage into that buffer. a-addr is the address of that block buffer.
[1.5] At the conclusion of the operation, the block buffer pointed to by a-addr is the current block buffer and is assigned to u.

(I've added the paragraph numbers in '[' ']' in the hope that it will help our discussion.)

While we agree that BUFFER is not the same as BLOCK, I think it may be useful to include its definition also:

7.6.1.0820 BUFFER ( u -- a-addr )
[2.1] a-addr is the address of the first character of the block buffer assigned to block u. The contents of the block are unspecified. An ambiguous condition exists if u is not an available block number.
[2.2] If block u is already in a block buffer, a-addr is the address of that block buffer.
[2.3] If block u is not already in memory and there is an unassigned buffer, a-addr is the address of that block buffer.
[2.4] If block u is not already in memory and there are no unassigned block buffers, unassign a block buffer. If the block in that buffer has been UPDATEd, transfer the block to mass storage. a-addr is the address of that block buffer.
[2.5] At the conclusion of the operation, the block buffer pointed to by a-addr is the current block buffer and is assigned to u.

Both definitions include 3 cases covered in paragraphs [1.2], [1.3] and [1.4] (and correspondingly in [2.2], [2.3] and [2.4]). Test TUF2-1 is an example of the case in paragraph [1.2] since the block is already in a block buffer.

Cases [1.3] and [1.4] explicitly cover "transfer of block u from mass storage", but no transfer is mentioned in case [1.2]. Perhaps it is my mistake to interpret the absence of such an explicit statement as indicating that no transfer should take place. Case [1.2] does not explicitly exclude a transfer taking place, but then neither does anything in BUFFER.

Another way to look at this is from the user's perspective: what would someone using BLOCK, BUFFER and others expect to happen? Consider the following:

20 BLOCK DUP C@ 1+ SWAP C! UPDATE 20 BLOCK

Does the final BLOCK force the read, so removing the update in the block buffer? Perhaps the UPDATE should force the block buffer to be written out before being reread? I am not sure I understand what should happen when "BLOCK will force a read from mass storage".

Therefore, the way I read both BLOCK and BUFFER definitions says that no transfer should take place if the block is already in the block buffer. Indeed, reading the whole of section 7 "The optional Block word set" in the standard, I can not find anything requiring that "BLOCK will force a read from mass storage".
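The one-line example above can be turned into a small self-checking word. This is a sketch under assumptions: BUMP-THEN-REFETCH is a made-up name, the block number passed in must be one that may safely be modified, and characters are assumed to be at least 8 bits wide.

```forth
\ Hypothetical check: does a modified, UPDATEd byte survive a second BLOCK?
: BUMP-THEN-REFETCH ( blk -- flag )
   DUP BLOCK              \ blk addr      fetch the block (may read mass storage)
   DUP C@ 1+ 255 AND      \ blk addr new  new value for the first byte
   DUP >R SWAP C!         \ blk           store it, keep a copy on the return stack
   UPDATE                 \ blk           flag the current buffer as modified
   BLOCK C@               \ char          ask for the same block again
   R> = ;                 \ flag          TRUE if the change is still visible
```

Under the reading argued here, `20 BUMP-THEN-REFETCH` leaves TRUE, because the second BLOCK finds block 20 already in a buffer. It would leave FALSE only on a system whose BLOCK re-reads the block without first writing the updated buffer out.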

On the other hand, it could be argued that if there is any ambiguity, then it may be better to use BUFFER in preference to BLOCK to avoid any problem. My (not very strong) argument in favour of using BLOCK in TUF2-1 is only to maximise test coverage since BUFFER is used in a similar test TUF2-2. Also, if there is some aspect of the behaviours of these words that I have missed in the language standard, we may want to extend the tests to cover the behaviour; that is, to add a test that "BLOCK will force a read from mass storage".

Francois: I'd be grateful if you could provide evidence to support that "BLOCK will force a read from mass storage".

Gerry: Do you have any suggestions on how to resolve differences in interpretation of the language standard? Is there anyone who we can ask for a definitive answer? Perhaps BUFFER should be preferred in TUF2-1 as a consensus position?

Mar 02 '21 09:03 steverpalmer

A web search revealed this discussion: http://forum.6502.org/viewtopic.php?f=9&t=3135. Although this is not exactly about the same thing, it does show that we are not the only folk trying to understand BUFFER & BLOCK behaviour.

Mar 02 '21 10:03 steverpalmer

Hi again Francois, I thought I might be better able to explain my position if I invented a new forth word with a trivial meaning:

DEVICE_READ_COUNT  ( -- n )
\ a rolling count of blocks read from the mass storage device

Given this, I could add the following tests:

\ 1. When the block buffer is all unassigned, BLOCK must read from mass storage
T{ FLUSH DEVICE_READ_COUNT 20 BLOCK DROP DEVICE_READ_COUNT - 0< -> TRUE }T

\ 2 ... but if the block is already in memory, it will not be re-read
T{ 20 BLOCK DROP DEVICE_READ_COUNT 20 BLOCK DROP DEVICE_READ_COUNT = -> TRUE }T

\ 3 ... even if the block in memory has been changed
T{ 20 BLOCK DUP C@ 1+ SWAP C! DEVICE_READ_COUNT 20 BLOCK DROP DEVICE_READ_COUNT = -> TRUE }T

\ 4 ... even if the block in memory is flagged as updated
T{ 20 BLOCK DROP UPDATE DEVICE_READ_COUNT 20 BLOCK DROP DEVICE_READ_COUNT = -> TRUE }T

\ 5 ... and even if the block is only assigned through a BUFFER command
T{ 20 BUFFER DROP DEVICE_READ_COUNT 20 BLOCK DROP DEVICE_READ_COUNT = -> TRUE }T

BLOCK does not force a read from the mass storage device if the block is already assigned in memory. This is what is happening in TUF2-1, and why either BLOCK or BUFFER can be used in this test.

Mar 02 '21 12:03 steverpalmer

Hi Steve,

I do apologize for having poorly stated my position. I totally agree with your 4 point statement (points of agreements) and reference to the ANSI specification for BLOCK and BUFFER. I think I need to go back to the drawing board and collect experimental evidence.

As you probably know, the block subsystem specs are as old as Forth (40 years+). I do not think they have changed substantially since they were introduced. I based my own implementation on the description I got from the "Forth Programmer's Handbook" by Rather (chairwoman of the ANSI committee) and Conklin. So, that is my gospel and I am sure that it matches what the ANSI standard says.

I must add that I have only two resident buffers and that I am running your code from blocks as well. So I had to introduce compiled implementations of a number of your tests. I will get back to you when I have more facts supporting my original statement.

Mar 02 '21 17:03 frenchie68

Yes - I think the BLOCK words are set in stone, perhaps due to erosion rather than merit.

I've just tweaked my system to use two resident buffers and the tests passed. This is not to say that my system is a gold standard, but only that I've not been able to reproduce a problem. An early version of the test did fail on systems with only one resident buffer (see issue #7). I added a slightly nasty check that there were two buffers - see comment on line 321 of blocktest.fth.

However, I am not running the tests from blocks. I don't know of anything that would prevent the test running from the block store, but this is not something I've tried. It would not surprise me to find that more than 2 buffers were needed in such a situation. I'm very interested in your experience.

I'll wait for more information from you.

Mar 03 '21 09:03 steverpalmer

Gerry: Do you have any suggestions on how to resolve differences in interpretation of the language standard? Is there anyone who we can ask for a definitive answer? Perhaps BUFFER should be preferred in TUF2-1 as a consensus position?

Hi Steve. You can ask a question on either

comp.lang.forth (Google Groups which is awful, I use Thunderbird to access it) or
on the Forth 2012 website that contains the latest specification. Go to the spec for BLOCK or BUFFER and select 'Contribute' to enter the questions.

With comp.lang.forth you may open yourself to the abusive troll who resides there but just ignore him like most do. I believe members of the committee monitor both places for questions and you should get a definitive ruling (if there is one of course!). I believe comp.lang.forth has more readers, the other has people more interested in the standard.

Incidentally I'm sorry but I can be no help on this issue as I never use the Block word set, tending to regard blocks as a relic of a bygone age. Thanks for continuing to deal with it.

Gerry

Mar 03 '21 10:03 gerryjackson

Hi Steve/Gerry,

Blocks are not from a bygone age for folks like me doing retro-computing. As a matter of fact, the ANSI standard specifies the file access word set as optional--still nowadays.

Anyhow, I think I have a better understanding of why that test fails in my implementation. Buffers are managed using a 'flags' field. 3 bits are used:

BINUSE: the buffer has been assigned a corresponding block number.
BMAPPED: the buffer block has been read from mass storage.
BDIRTY: the buffer has been marked FLUSHable.

In what you can see below, the flags field is summarized using U -> USED, M -> MAPPED and D -> DIRTY.

empty-buffers OK
bsa --- #457 --- #139 OK
FIND TUF2-1 2RND-TEST-BLOCKS TUF2 .s 325 1793 3991 3573 325 1793 OK
bsa UM- #132 UM- #137 OK
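For readers following the dump above, the three flag bits and their three-character summary could be modelled like this. The bit values and the words .BIT, .FLAGS and NEEDS-READ? are assumptions made for illustration; the actual values and code in Francois's system are not shown in this thread.

```forth
\ Hypothetical bit assignments for the buffer 'flags' field.
1 CONSTANT BINUSE     \ buffer is assigned to a block number
2 CONSTANT BMAPPED    \ buffer contents were read from mass storage
4 CONSTANT BDIRTY     \ buffer is marked for writing back

\ Emit ch if any bit of mask is set in flags, otherwise emit '-'.
: .BIT ( flags mask ch -- )
   >R AND IF R> ELSE R> DROP [CHAR] - THEN EMIT ;

\ Print the three-character summary, e.g. UM- or U-D.
: .FLAGS ( flags -- )
   DUP BINUSE  [CHAR] U .BIT
   DUP BMAPPED [CHAR] M .BIT
       BDIRTY  [CHAR] D .BIT ;

\ In this model, BLOCK has to read whenever the mapped bit is clear.
: NEEDS-READ? ( flags -- flag )  BMAPPED AND 0= ;
```

For example, `BINUSE BMAPPED OR .FLAGS` prints UM-, and `BINUSE NEEDS-READ?` leaves TRUE, which is the U-- state the walk-through below turns on.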

Let's run it by the minute, I have two resident buffers and I am running your suite from blocks.

TUF2-1A gets called:

: TUF2-1A T{ ['] TUF2-1 2RND-TEST-BLOCKS \ run test procedure
TUF2 SWAP DROP SWAP DROP 2= -> TRUE }T ;

2RND-TEST-BLOCKS
2 random test blocks are selected from [FIRST-TEST-BLOCK LIMIT-TEST-BLOCK[
TUF2 \ Those two block numbers are different and in the valid test range.

blk1 is randomly initialized (from BUFFER context) and updated. U-D
blk2 is randomly initialized (from BUFFER context) and updated. U-D
Updated buffers are flushed, remaining valid in memory; the dirty bit is cleared, yet the M (mapped) bit is not set.

blk1 is randomly init'd (from BUFFER ctx), U--
blk2 is randomly init'd (from BUFFER ctx). U--

The primitive passed as an argument is executed (TUF2-1 updates block1).
If TUF2-1 calls BLOCK, blk1 is read from the mass storage device,  overwriting
blk1's contents.
blk1 UMD
blk2 U--

Updated buffers are flushed (block1 gets written to mass storage)

 blk1 UM-
 blk2 U--

We return to the original code:

: TUF2-1A T{ ['] TUF2-1 2RND-TEST-BLOCKS \ run test procedure
TUF2 SWAP DROP SWAP DROP 2= -> TRUE }T ;

And the final -> assertion fails because hash1'' == hash1' is false.

There are two ways to get around this (from my standpoint):

mark flushed blocks as mapped (implementation fix).
have TUF2-1 use BUFFER instead of BLOCK (test suite fix).

Just let me know how you prefer to handle this.

Also, there seems to be some junk code at the end of blocktest.fth:

BLOCK-ERRORS SET-ERROR-COUNT

That does not compute!

Thanks, by the way for having written this block test suite!

All the best.

    Francois

Mar 03 '21 17:03 frenchie68

Hi Steve,

I tried Avenue A (implementation change) but that causes

: TUF-D T{ RND-TEST-BLOCK \ blk
0 OVER PREPARE-RND-BLOCK \ blk hash
UPDATE FLUSH \ blk hash
OVER 0 SWAP PREPARE-RND-BLOCK DROP \ blk hash
FLUSH ( with no preliminary UPDATE) \ blk hash
SWAP BLOCK 1024 ELF-HASH
= -> TRUE }T ; TUF-D

to fail. I think, like most physicists do, that there are some virtues to some form of symmetry. I will revert to my original code implementation and the use of BUFFER in TUF2-1.

Mar 03 '21 18:03 frenchie68

Thank you Francois. I now understand why the test fails and what is meant by "BLOCK will force a read from mass storage". (For your information, my implementation sounds similar, but has only the BINUSE and BDIRTY flags. It does not have the BMAPPED flag, so does not force the read from mass storage.) If I've understood correctly, the result of FLUSH 20 DUP BUFFER DROP BLOCK would force a read from mass storage block 20 into the buffer.

However, my question is now: is such a "forced read" implementation compliant with the Forth-2012 specification? In particular, the definition of BLOCK paragraphs [1.3] and [1.4] mentions "transfer block u from mass storage" in the case when "block u is not already in memory", but paragraph [1.2] does not mention any transfer when "block u is already in a block buffer". Therefore, I would not expect FLUSH 20 DUP BUFFER DROP BLOCK to do a transfer (in BLOCK) since the buffer is already assigned (in BUFFER).

Please understand me: my question is not concerned with the merits of the implementation. It may well be more useful to have functionality that forces a read. However, my reading of the standard says that such functionality is not compliant.

Francois: I'd be happy to take this question to comp.lang.forth or ask for guidance from the Forth 2012 website if you disagree.

I agree that Avenue A (marking flushed buffers as mapped) will not work since FLUSH is required to leave all block buffers unassigned.

Regarding BLOCK-ERRORS SET-ERROR-COUNT, the full test-suite includes errorreport.fth which accumulates errors from each word set tested for report at the end of a full run.

Mar 04 '21 11:03 steverpalmer

Hi Steve,

When you initialize a block acquired via BUFFER directly, you can write it to mass storage but, from the point of view of the Forth system, that block is not yet in memory, so BLOCK will definitely read from mass storage.

The interesting thing with BUFFER is that you can do repeated direct block write operations without having to read the blocks beforehand. This is useful when you want to initialize all blocks of a device.

I just grabbed my copy of the "Forth Programmer's Handbook" and it turns out you are right about the ANSI FLUSH. It is supposed to deassign all buffers. In 79-STANDARD (my primary allegiance), FLUSH and SAVE-BUFFERS are synonyms. So the ANSI committee changed some very subtle aspects of the block subsystem, what a shame! The semantics I have for FLUSH are exactly that of the ANSI SAVE-BUFFERS.

Back to implementing numeric literal base prefixes (a VolksForth concept).

All the best.

    Francois

Mar 04 '21 17:03 frenchie68

A side note while I'm at it: if you want authoritative third party confirmation, I recommend (like Gerry said) you do not waste your time on comp.lang.forth. Better go to the Forth2012 web site, there is not so much noise there.

Also you might consider joining the Forth2020 FB group. We need more valuable people there. It was nice discussing fine implementation points with a fellow implementer!

Cheers.

    Francois

Mar 04 '21 17:03 frenchie68

I've found a copy of the FORTH-79 standard (October 1980), and its definition of BLOCK is (I think) different to ANS FORTH. From the FORTH-79 standard:

[3.1] Leave the address of the first byte in block n.
[3.2] If the block is not already in memory, it is transferred from mass storage into whichever memory buffer has been least recently accessed.
[3.3] If the block occupying that buffer has been UPDATEd (i.e. modified), it is rewritten onto mass storage before block n is read into the buffer.
[3.4] n is an unsigned number.
[3.5] If correct mass storage read or write [is] not possible, an error condition exists.
[3.6] Only data within the latest block referenced by BLOCK is valid by byte address, due to sharing of the block buffers.

Again, I've added sentence numbers in "[ ]" to help with this explanation.

Sentence [3.1] maps to paragraph [1.1] in ANS Forth2012 (quoted earlier)
Sentence [3.2] maps to paragraph [1.3]
Sentence [3.3] maps to paragraph [1.4]
Sentence [3.4] is handled by the use of u in the description rather than n
Sentence [3.5] maps to section 7.4.1.2 "Ambiguous conditions"
Sentence [3.6] maps to section 7.3.2 "Block buffer regions"

However, there is nothing in the FORTH-79 standard that covers paragraph [1.2] of ANS Forth - the definition of BLOCK in the case that the block is already in a block buffer. I believe that an implementation of BLOCK that forces a read is compliant with the FORTH-79 standard (and the FORTH-83 standard), but not with the ANS standard.

I've put in a request for clarification to the Forth2012 web site asking whether BLOCK is allowed/required to transfer from the mass storage if the block is already in a block buffer.

Mar 05 '21 13:03 steverpalmer

Hi Steve,

I really appreciate your thoroughness with respect to this matter. However, even if you think the block is in memory, the Forth block subsystem does not, since (I know this is implementation related) the BMAPPD bit is not set. I believe this is the central misunderstanding between the two of us.

Honestly, I do not think you can manage properly a buffer's attribute field with only two bits of information: BINUSE and BDIRTY. You also need to know whether the data was actually read from mass storage. You could argue that writing from an assigned buffer with no IO error would make that buffer current (mapped). I would not disagree with that position. Yet you would need an extra bit of information to record that fact.

In my implementation BLOCK calls BUFFER in order to retrieve a base buffer address. BUFFER will scan the resident buffer set for the block number specified as an input parameter. If that block is not currently assigned, it will select the least recently used buffer and will flush its content to mass storage if it is marked as dirty (write back if dirty).

Then control is returned to the BLOCK code. BLOCK will check the flags attribute field, particularly the BMAPPD bit. If that bit is not set, the block will be read from mass storage. Again, just because you initialized a buffer's contents does not mean that the buffer content is in memory (BMAPPD is asserted). BUFFER and BLOCK operate at two different levels.
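To make the two behaviours under discussion concrete, here is a toy single-buffer model that only counts simulated device reads. Every name in it (ASSIGNED, MAPPED, READS, MODEL-RESET, DEVICE-READ, MODEL-BUFFER, CACHED-BLOCK, FORCED-BLOCK) is invented for this sketch; it is not the code of either implementation, it ignores write-back, and block number 0 is assumed to mean "unassigned".

```forth
\ Toy model, hypothetical names: one buffer, no real mass storage.
VARIABLE ASSIGNED   \ block number held by the model buffer, 0 if none
VARIABLE MAPPED     \ TRUE once the contents came from (simulated) storage
VARIABLE READS      \ number of simulated device reads

: MODEL-RESET ( -- )  0 ASSIGNED !  FALSE MAPPED !  0 READS ! ;
: DEVICE-READ ( -- )  1 READS +!  TRUE MAPPED ! ;

\ BUFFER in both models: assign the buffer, never read.
: MODEL-BUFFER ( u -- )
   DUP ASSIGNED @ <> IF  ASSIGNED !  FALSE MAPPED !  ELSE  DROP  THEN ;

\ BLOCK that reads only when the block is not already assigned.
: CACHED-BLOCK ( u -- )
   DUP ASSIGNED @ <> IF  ASSIGNED !  DEVICE-READ  ELSE  DROP  THEN ;

\ BLOCK layered on BUFFER that reads whenever the mapped flag is clear.
: FORCED-BLOCK ( u -- )
   MODEL-BUFFER  MAPPED @ 0= IF  DEVICE-READ  THEN ;

MODEL-RESET  20 MODEL-BUFFER  20 CACHED-BLOCK  READS @ .   \ prints 0
MODEL-RESET  20 MODEL-BUFFER  20 FORCED-BLOCK  READS @ .   \ prints 1
```

The two final lines are the whole disagreement in miniature: after BUFFER has assigned block 20, one model treats the block as "already in a block buffer" and the other does not.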

In essence, BUFFER is a vehicle for writing blocks without having to read them first and BLOCK is a vehicle for reading from mass storage. If you get a chance to get a copy of Conklin/Rather's "Forth Programmer's Handbook", I highly recommend you read Appendix C: "Blocks for Disk Storage" page 243. This is, by far, the clearest explanation I have come across about the topic at hand--and it is guaranteed to be ANSI compliant material!

All the best.

    Francois

Mar 05 '21 18:03 frenchie68

OK, you have been factual and I ought to be as well. Wrt the ANSI standard: "not being already in memory" is where I think you just cannot maintain that kind of information with just two bits of information. If you look closely at the specification for BUFFER, you will see that nowhere is it specified that the block is to be read from mass storage. I can only speculate that your implementation actually reads from mass storage when BUFFER is called. Otherwise you would not be able to maintain enough state information with only two bits.

Mar 05 '21 20:03 frenchie68

GNU Forth certainly entertains the assumption that a reference to BUFFER will have that block read from mass storage but this is in no way required by the ANSI standard. I think what the standard does not say is almost as important as what it says.

This whole API was originally designed in times when mass storage was--at best--implemented on floppy disks. So performance concerns were paramount. The ability (via a reference to BUFFER) to write a block without having to read from the device was a concern, back in those days. Nowadays, things are a little bit different, obviously. Yet a compliant implementation should not require BUFFER to read from mass storage.

Also the ANSI spec still mentions the file access method as optional and lots of people tend to assume an underlying filesystem. I am into retro-computing (hardcore ROMable embedded software) and I do not provide such support.

Food for thought...

    Francois

Mar 06 '21 11:03 frenchie68

I've got some catching up to do here, so bear with me:

I just grabbed my copy of the "Forth Programmer's Handbook" and it turns out you are right about the ANSI FLUSH. It is supposed to deassign all buffers. In 79-STANDARD (my primary allegiance), FLUSH and SAVE-BUFFERS are synonyms. So the ANSI committee changed some very subtle aspects of the block subsystem, what a shame! The semantics I have for FLUSH are exactly that of the ANSI SAVE-BUFFERS.

In fact, the change was introduced in Forth-83 which states:

FLUSH -- M,83
Performs the function of SAVE-BUFFERS then unassigns all block buffers. (This may be useful for mounting or changing mass storage media).

I think that ANS Forth (and Forth2012) has carried the Forth-83 definition forward unchanged.
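Read literally, the Forth-83 wording quoted above suggests an equivalent written in terms of two other block words. A minimal sketch, assuming the system provides EMPTY-BUFFERS (part of the Block extension word set in Forth 2012); FLUSH-LIKE is a made-up name chosen to avoid redefining the system's own FLUSH.

```forth
\ Hypothetical equivalent of FLUSH as described by Forth-83 and later.
: FLUSH-LIKE ( -- )
   SAVE-BUFFERS      \ write every UPDATEd buffer to mass storage
   EMPTY-BUFFERS ;   \ then unassign all block buffers without writing
```

A system where FLUSH and SAVE-BUFFERS are synonyms, as in FORTH-79, corresponds to leaving out the second line, which is why buffers stay assigned after a flush there.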

(I offer this only for completeness of our discussion, as it is a side issue from the substance of the issue raised.)

Mar 07 '21 09:03 steverpalmer

Now to an area we agree:

GNU Forth certainly entertains the assumption that a reference to BUFFER will have that block read from mass storage but this is in no way required by the ANSI standard. I think what the standard does not say is almost as important as what it says.

This whole API was originally designed in times when mass storage was--at best--implemented on floppy disks. So performance concerns were paramount. The ability (via a reference to BUFFER) to write a block without having to read from the device was a concern, back in those days. Nowadays, things are a little bit different, obviously. Yet a compliant implementation should not require BUFFER to read from mass storage.

I completely agree with what you say.

GForth (0.7.3) contains

: buffer ( u -- a-addr ) \ block
    \G If a block buffer is assigned for block @i{u}, return its
    \G start address, @i{a-addr}. Otherwise, assign a block buffer
    \G for block @i{u} (if the assigned block buffer has been
    \G @code{update}d, transfer the contents to mass storage) and
    \G return its start address, @i{a-addr}.  The subtle difference
    \G between @code{buffer} and @code{block} mean that you should
    \G only use @code{buffer} if you don't care about the previous
    \G contents of block @i{u}. In Gforth, this simply calls
    \G @code{block}.
    \ reading in the block is unnecessary, but simpler
    block ;

This is compliant with the Forth2012 standard, though it is not compliant with the original Forth-79 which stated:

The block is not read from mass storage

Again, for completeness, it was Forth-83 which changed the definition of BUFFER, not ANS FORTH or FORTH2012. From Forth-83:

BUFFER u -- addr M,83
Assigns a block buffer to block u. addr is the address of the first byte of the block within its buffer. This function is fully specified by the definition for BLOCK except that if the block is not already in memory it might not be transferred from mass storage. The contents of the block buffer assigned to block u by BUFFER are unspecified.

However, there is a corollary to this point: If (as allowed in Forth-83, ANS Forth and Forth2012), BUFFER may simply call BLOCK, then (paraphrasing the title of this issue) "TUF2-1 [using] BUFFER instead of BLOCK" may not solve the original problem. If you imagine a Forth2012 implementation in which BUFFER merely calls BLOCK then replacing BLOCK by BUFFER will change nothing.

Mar 07 '21 09:03 steverpalmer

Francois states:

Wrt the ANSI standard: "not being already in memory" is where I think you just cannot maintain that kind of information with just two bits of information.

Here I offer a possible implementation based on just 1 flag: BDIRTY used to indicate that the buffer needs to be written back to storage (set by UPDATE). In addition, I will choose 0 to be an Invalid block number.

Let block buffers be held in memory, with each block buffer being:

1 cell (assume 32 bits) holding the block number the buffer is assigned to, or 0 if it is unassigned. Also, the BDIRTY flag is the highest bit of this cell. (This implementation allows for up to 2^31 - 1 blocks, or ~2 TB of mass storage at 1024 bytes per block.)
followed by 1024 bytes holding the contents of the buffer.
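The header-cell encoding described above can be sketched with a pair of masks. The accessor names (BLK-MASK, BUF-BLK#, BUF-DIRTY?, MARK-DIRTY, MARK-CLEAN) are invented for illustration, and the sketch assumes 8-bit address units so that a cell has 1 CELLS 8 * bits.

```forth
\ Hypothetical accessors for a buffer header cell: block number + dirty bit.
\ Assumes 8-bit address units.
1 1 CELLS 8 * 1- LSHIFT CONSTANT BDIRTY   \ highest bit of the header cell
BDIRTY INVERT CONSTANT BLK-MASK           \ remaining bits: the block number

: BUF-BLK#   ( a-addr -- u )     @ BLK-MASK AND ;        \ 0 means unassigned
: BUF-DIRTY? ( a-addr -- flag )  @ BDIRTY AND 0<> ;
: MARK-DIRTY ( a-addr -- )       DUP @ BDIRTY OR SWAP ! ;
: MARK-CLEAN ( a-addr -- )       DUP @ BLK-MASK AND SWAP ! ;

CREATE HDR 20 ,                  \ a stand-alone header cell assigned to block 20
HDR MARK-DIRTY
HDR BUF-BLK# .  HDR BUF-DIRTY? . \ prints 20 -1
```

One consequence of packing the flag into the same cell is that any word comparing block numbers has to mask the flag out first, as BUF-BLK# does here.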

Allow the following "helper" words:

find-block ( u -- a-addr | 0 ) Searches the block buffers in store looking for a buffer with a matching assigned block number and either returns its address if it is present, or 0 if it is not. It never reads or writes to the mass storage device.
find-free ( -- a-addr | 0 ) Searches for a block buffer with a 0 assigned block number and either returns its address if one is found, or 0 if it is not. It never reads or writes to the mass storage device.
free-a-buffer ( -- a-addr ) Returns the address of a buffer for use. If the buffer had previously been marked as dirty with the BDIRTY flag bit, then the buffer is written to the assigned block in mass storage. It will write 0 to the assigned block number of the buffer.
(BLOCK-RW) ( c-addr u r/w -- fail ) The primitive which reads or writes 1024 bytes at c-addr to/from the mass storage device block u; FALSE to read, TRUE to write. The fail flag is FALSE on success, and TRUE if the read failed for any reason.

Given these, consider the following implementation

: BUFFER  ( u -- a-addr )
    DUP find-block ?DUP 0= IF  \ is block already in memory?
        find-free ?DUP 0= IF  \ is an unused block available?
            free-a-buffer  \ force a block to be free
        THEN
        SWAP OVER !  \ assign the buffer to the block
    ELSE  \  block already in memory, ...
        NIP  \ so just tidy up
    THEN
    DUP current-buffer !  \ remember the buffer just referenced for subsequent UPDATE
    CELL+  \ move to the address of the buffer contents
;

: BLOCK  ( u -- a-addr )
    DUP find-block ?DUP 0= IF  \ is block already in memory?
        find-free ?DUP 0= IF  \ is an unused block available?
            free-a-buffer  \ force a block to be free
        THEN
        2DUP !  \ assign the buffer to the block
        DUP CELL+ ROT FALSE (BLOCK-RW)  \ read the mass storage into the buffer
        IF -33 THROW THEN  \ error on read failure
    ELSE  \ block already in memory, ...
        NIP  \ so just tidy up
    THEN
    DUP current-buffer !  \ remember the buffer just referenced for subsequent UPDATE
    CELL+  \ move to the address of the buffer contents
;

I claim that this is fully compliant with the FORTH2012 standard. BLOCK does not use BUFFER and BUFFER does not use BLOCK. They are definitions at the same level. However, please note that BLOCK only reads the mass storage device if the block is not found in the buffer store. It does not use another BMAPPD flag.

Francois: Unfortunately, I haven't got a copy of Conklin/Rather's "Forth Programmer's Handbook", but regardless, if you think that this implementation is not compliant with the Forth2012 standard, please can you identify the clause of the standard where it is not compliant.

Mar 07 '21 12:03 steverpalmer

If an (almost) implementation is not a way this problem can be understood, then how about some test code...

Consider the following:

( Prepare a buffer in memory with some contents )
: PREPARE_BUFFER  ( blk c-addr u -- )
    ROT BUFFER DUP 1024 BL FILL
    SWAP 1024 MIN CMOVE
;

( Test whether BLOCK forces a read from mass storage )
: BLOCK_FORCED_READ? ( blk -- )
    EMPTY-BUFFERS                                          \ Establish known starting condition
    DUP S" MASS STORAGE READ" PREPARE_BUFFER
    UPDATE SAVE-BUFFERS EMPTY-BUFFERS                      \ Prepare mass storage
    DUP S" MASS STORAGE NOT READ" PREPARE_BUFFER           \ Prepare same buffer again, but with distinct contents - do not write!
    DUP BLOCK DROP                                         \ Does BLOCK read from mass storage? ...
    LIST ;

20 BLOCK_FORCED_READ?

If I have correctly understood the "forced read" implementation, then the block that is listed would have "MASS STORAGE READ" in its top line. (Francois: I'd be grateful if you would confirm this.) However, my reading of the standard says that the block that is listed should have "MASS STORAGE NOT READ" on its top line. This is the result both of my implementation and of Gforth. (Again, I'm not saying that these are gold standard implementations, just that some folk believe them to be compliant with the Forth2012 standard.) A third possibility is that the Forth2012 standard allows both outputs to be compliant with the standard. It is precisely this question that I posed to the Forth2012 standard for clarification.

Mar 07 '21 12:03 steverpalmer

I have just noticed an answer to the clarification I asked for on the Forth 2012 standard website. The question I asked was "Can BLOCK transfer from mass storage in the case when block u is already in a block buffer?"

From AntonErtl (2021-03-07 11:03:41)

Concerning the question in the title, IMO BLOCK must not do that. Concerning your example: If we assume that nothing between the second call to BUFFER and the call to BLOCK invalidates the buffer, it should produce "MASS STORAGE NOT READ".

Anton goes on to point out some errors in my example that could allow something to invalidate the buffer between the second call to BUFFER and the call to BLOCK. I fully accept my errors, but I do not feel that they change the fundamental point: once a buffer has been assigned by BUFFER, it must not be overwritten by a subsequent use of BLOCK (assuming that nothing happens in between to invalidate the assignment).

Therefore, the use of BLOCK in TUF2-1 is compliant with the Forth2012 standard.

The issue kindly raised by Francois is leading me to think about how the tests might be better expressed to make them more diagnostic - so that a test failure predicts a specific error in an implementation - but I believe that the tests are valid as they stand.

Mar 07 '21 13:03 steverpalmer

Anton Ertl is behind GNU Forth and, as I mentioned it previously in this thread, GNU Forth reads from mass storage when BUFFER is called, which is something that the standard does not require in any way. It also does not invalidate an implementation that would read from mass storage either. That is the sort of ambiguity that makes the ANSI standard tricky to handle from a testing perspective.

In any case, I think your test suite is useful and I will include a slightly modified version of it in my ZForth79 distribution. We do not have to agree on everything.

All the best.

    Francois

Mar 09 '21 17:03 frenchie68

I'm going to close this issue as a "Won't Fix". The tests as written are consistent with the Forth 2012 standard, specifically:

they test only what the standard requires;
they rely only on the described behaviour of the words in the standard.

I think this position is also backed-up by the response from Forth Standards Organisation.

Keeping this Issue open for longer serves no purpose.

Closing. Steve

Aug 02 '22 13:08 steverpalmer

try to close...

Nov 27 '22 09:11 steverpalmer
