Rexx-style comparisons need to be implemented
We have only 'strict' comparisons at the moment; at some point we will need the standard (loose?) Rexx comparisons. At that time, the SNES opcode needs to implement the strict comparison that is currently handled by SNE. Short version:
SNE --> SNES; SNE: TBD.
Peter, as this is millicode, I think this is a good one for you to look at.
I took this out of the Regina manual:
"The non-strict comparative operators will ignore leading or trailing blanks for string comparisons, and leading zeros for numeric comparisons."
Are these all the options we need, or are there more?
I think, as this is also a potential performance killer, we leave it for CREXX level C?
I believe in BREXX we even added stripping of blanks in the middle of a string, so " abc def " = "abcdef". Quite frankly, it's not just a performance killer, it's a weird approach.
https://en.wikibooks.org/wiki/Rexx_Programming/How_to_Rexx/string_comparison has nothing about spaces in the middle of a string.
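To make the difference concrete, here is a minimal Python sketch of the two behaviours being discussed for string equality. Both function names are made up for illustration, and the second one only models the embedded-blank stripping as it is recalled above, not actual BREXX code. Numeric comparison (leading zeros) is not modelled here.

```python
def loose_equal(lhs, rhs):
    """Non-strict string equality: ignore leading and trailing blanks only."""
    return lhs.strip(" ") == rhs.strip(" ")


def embedded_blank_equal(lhs, rhs):
    """Hypothetical variant that also drops blanks inside the string."""
    return lhs.replace(" ", "") == rhs.replace(" ", "")


print(loose_equal(" abc def ", "abc def"))          # leading/trailing blanks ignored
print(loose_equal(" abc def ", "abcdef"))           # embedded blank still counts
print(embedded_blank_equal(" abc def ", "abcdef"))  # embedded blank dropped too
```

Only the first function matches what the Regina manual describes; the second shows why dropping embedded blanks changes which strings count as equal.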
From the standard:
A.7.4.7 The value of a comparison
Most REXX implementations have two forms of comparison, an 'exact' form which compares strings character by character and a form in which the comparison depends on whether the two strings are numbers. The latter comparisons are not transitive, which had led to proposals for a third form. In view of the fact that there is an idiom, a+0 < b+0, which can be written to force numeric comparison, extra comparison operators are not justified.
7.4.7 The value of a comparison
See section 6.3.2.102 for the syntax of a comparison.
If the comparison is a concatenation then the value of the comparison is the value of the concatenation. Otherwise, let lhs be the value of the comparison within it, and rhs be the value of the concatenation within it.
If the comparison has a comparison_operator that is a strict_compare then the variable #Test is set as follows:
#Test is set to 'E'. Let Length be the smaller of Config_Length(lhs) and Config_Length(rhs). For values of n greater than 0 and not greater than Length, if any, in ascending order, #Test is set to the uppercased first character of #Outcome after:
Config_Compare(Config_Substr(lhs,n),Config_Substr(rhs,n)).
If at any stage this sets #Test to a value other than 'E' then the setting of #Test is complete. Otherwise, if Config_Length(lhs) is greater than Config_Length(rhs) then #Test is set to 'G' or if Config_Length(lhs) is less than Config_Length(rhs) then #Test is set to 'L'.
If the comparison has a comparison_operator that is a normal_compare then the variable #Test is set as follows:
if datatype(lhs)\== 'NUM' | datatype(rhs)\== 'NUM' then do
   /* Non-numeric non-strict comparison */
   lhs=strip(lhs, 'B', ' ') /* ExtraBlanks not stripped */
   rhs=strip(rhs, 'B', ' ')
   if length(lhs)>length(rhs) then rhs=left(rhs,length(lhs))
   else lhs=left(lhs,length(rhs))
   if lhs>>rhs then #Test='G'
   else if lhs<<rhs then #Test='L'
   else #Test='E'
   end
else do /* Numeric comparison */
   if left(-lhs,1) == '-' & left(+rhs,1) == '-' then #Test='G'
   else if left(-rhs,1) == '-' & left(+lhs,1) == '-' then #Test='L'
   else do
      Difference=lhs - rhs /* Will never raise an arithmetic condition. */
      if Difference > 0 then #Test='G'
      else if Difference < 0 then #Test='L'
      else #Test='E'
      end
   end
The value of #Test, in conjunction with the operator in the comparison, determines the value of the comparison. The value of the comparison is '1' if
- #Test is 'E' and the operator is one of '=', '==', '>=', '<=', '\>', '\<', '>>=', '<<=', '\>>', or '\<<';
- #Test is 'G' and the operator is one of '>', '>=', '\<', '\=', '<>', '><', '\==', '>>', '>>=', or '\<<';
- #Test is 'L' and the operator is one of '<', '<=', '\>', '\=', '<>', '><', '\==', '<<', '<<=', or '\>>'.
In all other cases the value of the comparison is '0'.
ANSI X3J18-199X
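As a reading aid, the quoted rules can be modelled in a few lines of Python. This is a rough sketch under stated assumptions, not the CREXX implementation: all names (is_num, strict_test, normal_test, compare, TRUE_FOR) are invented here, Python's code point ordering stands in for Config_Compare, Decimal stands in for Rexx arithmetic, the number check is a simplified stand-in for datatype(x) == 'NUM', and NUMERIC DIGITS and FUZZ are ignored. The operator sets assume the negated operators are written with a backslash.

```python
import re
from decimal import Decimal

# Simplified stand-in for datatype(x) == 'NUM' (assumption, not the full rule).
_NUM = re.compile(r" *[+-]? *(\d+\.?\d*|\.\d+)([eE][+-]?\d+)? *")


def is_num(s):
    return _NUM.fullmatch(s) is not None


def strict_test(lhs, rhs):
    """#Test for a strict compare: character by character, then by length."""
    for a, b in zip(lhs, rhs):
        if a != b:
            return "G" if a > b else "L"
    if len(lhs) != len(rhs):
        return "G" if len(lhs) > len(rhs) else "L"
    return "E"


def normal_test(lhs, rhs):
    """#Test for a normal (non-strict) compare."""
    if not (is_num(lhs) and is_num(rhs)):
        # Strip leading/trailing blanks, pad the shorter side, compare strictly.
        lhs, rhs = lhs.strip(" "), rhs.strip(" ")
        width = max(len(lhs), len(rhs))
        return strict_test(lhs.ljust(width), rhs.ljust(width))
    diff = Decimal(lhs.replace(" ", "")) - Decimal(rhs.replace(" ", ""))
    return "G" if diff > 0 else "L" if diff < 0 else "E"


# Operators for which each #Test value yields '1'.
TRUE_FOR = {
    "E": {"=", "==", ">=", "<=", "\\>", "\\<", ">>=", "<<=", "\\>>", "\\<<"},
    "G": {">", ">=", "\\<", "\\=", "<>", "><", "\\==", ">>", ">>=", "\\<<"},
    "L": {"<", "<=", "\\>", "\\=", "<>", "><", "\\==", "<<", "<<=", "\\>>"},
}
STRICT_OPS = {"==", "\\==", ">>", "<<", ">>=", "<<=", "\\>>", "\\<<"}


def compare(lhs, op, rhs):
    test = strict_test(lhs, rhs) if op in STRICT_OPS else normal_test(lhs, rhs)
    return "1" if op in TRUE_FOR[test] else "0"


print(compare(" abc ", "=", "abc"))   # blanks ignored by the normal compare
print(compare(" abc ", "==", "abc"))  # blanks significant in the strict compare
print(compare("007", "=", "7.0"))     # both numeric, compared by value
```

One detail the padding step makes visible: "ab" compared with "ab" followed by a tab is 'L' under the strict rules (shorter string) but 'G' under the normal rules, because the blank used for padding sorts above the tab character.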
Thanks, René, for the detailed explanations. I have started with string comparisons and might have something in the next couple of days.
I added a first sample of a non-strict comparison: commit 34e8151c1954bf271196c8081854080eb7fda709. As the compiler doesn't support the distinction between strict and non-strict, we can only use it via an assembler instruction. I believe this is the fastest way to achieve it: no copying of data, just offsets and calculated lengths.
The following example illustrates its usage (assembler rseq x,a,b); it supports UTF-8:
/* rexx test non-strict comparison */
options levelb
a="This is René's test case "
b=" This is René's test case"
say 'Test case A "'a'"'
say 'Test case B "'b'"'
x=0
/* Test strict comparison REG/REG */
if a=b then say "equal (strict compare)"
else say "not Equal (strict compare)"
/* Test non strict comparison REG/REG */
assembler rseq x,a,b
if x=1 then say "EQUAL (non strict compare)"
else say "NOT EQUAL (non strict compare)"
return
This is just a temporary and incomplete solution, as we need to decide whether to go this route (see discussion: non-strict comparison pros and cons #233).
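The copy-free idea mentioned above (offsets and calculated lengths instead of stripped copies) might look roughly like the following Python sketch. It is not the code from the commit: the function names are invented, it covers only string equality, and it works on UTF-8 bytes on the assumption that only the ASCII blank (0x20) is stripped, which never occurs inside a multi-byte UTF-8 sequence.

```python
def trimmed_bounds(buf):
    """Return (start, end) offsets of buf without leading/trailing blanks."""
    start, end = 0, len(buf)
    while start < end and buf[start] == 0x20:
        start += 1
    while end > start and buf[end - 1] == 0x20:
        end -= 1
    return start, end


def loose_seq(a, b):
    """Non-strict equality on byte buffers; returns 1 or 0 like the opcode."""
    a_start, a_end = trimmed_bounds(a)
    b_start, b_end = trimmed_bounds(b)
    length = a_end - a_start
    if length != b_end - b_start:
        return 0
    # Compare in place through the offsets; no stripped copy is built.
    for i in range(length):
        if a[a_start + i] != b[b_start + i]:
            return 0
    return 1


a = "This is René's test case   ".encode("utf-8")
b = "   This is René's test case".encode("utf-8")
print(loose_seq(a, b))  # non-strict: blanks at either end are ignored
print(int(a == b))      # strict: the buffers differ
```

For equality alone, comparing the trimmed lengths first gives the same answer as padding the shorter side with blanks, because a stripped string never ends in a blank. Ordering comparisons would still need the padding rule.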
Correct me if I'm wrong, but the compiler does support it, though only through the language itself. This became clear when all the strict comparisons in the bif testcases needed to be replaced with 'normal' (from the Rexx perspective) operators. So a == needed to become a =, which then generated the opcode for a non-strict comparison, which in turn was (and is) implemented by a strict comparison. So in my view, what needs to happen next is that the strict comparison implementations land on the right opcodes, the testcases are updated to use strict comparison again, and the normal (for Rexx) comparisons become the standard. We can decide otherwise for level B, but I don't see a real need, because someone who uses level B will know to use strict comparison whenever possible. As a sidenote: in NetRexx all string comparisons are loose and case-insensitive, as was planned for Classic Rexx, but people made MFC change his mind, which he told me he has regretted ever since.
I didn't realize that it is already compiled into different opcodes:
if a=b then say "equal (strict compare)"
if a==b then say "equal (strict == compare)"
become:
* Line 9: {IF} a=b
seq r4,r1,r2
brf l30iffalse,r4
* Line 9: {THEN}
* Line 9: say "equal (strict compare)"
say "equal (strict compare)"
* Line 11: {IF} a==b
seqs r4,r1,r2
brf l38iffalse,r4
SEQ will become SEQS, and the non-strict function, which I wrote as RSEQ, will go into SEQ.
My new f0038 tests/precedence.rexx testcase fails on all of these: SEQS, SNES, SLTS, SGTS.
If this is done (in a special branch), all bif tests need changing, and we can then extend the test suite with optimised versus non-optimised rxbin tests.
Note that the latest commit to develop (f0040) fixed a string compare bug; this might have helped here.
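The planned shuffle can be pictured as a change in which implementation each opcode dispatches to. The Python sketch below is only a model: the operator-to-opcode mapping for = and == is taken from the listing above, while the function names, the dictionaries and the run helper are invented for illustration and do not mirror the CREXX sources.

```python
def strict_eq(a, b):
    """Character-for-character equality."""
    return int(a == b)


def loose_eq(a, b):
    """Equality ignoring leading and trailing blanks (the RSEQ behaviour)."""
    return int(a.strip(" ") == b.strip(" "))


# What the compiler emits for each operator, per the listing.
OPCODE_FOR = {"=": "seq", "==": "seqs"}

# Today: seq carries the strict implementation (seqs is left out of this model).
CURRENT_IMPL = {"seq": strict_eq}

# Planned: the strict code moves to seqs and the RSEQ logic becomes seq.
PLANNED_IMPL = {"seq": loose_eq, "seqs": strict_eq}


def run(impl, operator, a, b):
    """Dispatch an operator through the given opcode table."""
    return impl[OPCODE_FOR[operator]](a, b)


print(run(CURRENT_IMPL, "=", "abc ", " abc"))   # today '=' behaves strictly
print(run(PLANNED_IMPL, "=", "abc ", " abc"))   # planned '=' is non-strict
print(run(PLANNED_IMPL, "==", "abc ", " abc"))  # planned '==' stays strict
```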
Just noticed this old issue from three years ago. Is it still something we need to look into?
I suggest a review of what the non-strict requirements are, what we have, and what needs to be done. I think we should be in a position to put this one to bed now; we just need to work out the gap ...
In my opinion, we don't need this at level B.
For level C, we could implement it in one of two ways:
- Duplicate the string comparison instructions and apply the strip logic directly before performing the comparison.
- Have the compiler insert a simple strip operation (not the full STRIP function) prior to the comparison.
Option 1 might be slightly faster, but honestly, I wouldn’t worry about performance in this case — the idea of a non-strict comparison in this context seems a bit absurd from a language design standpoint. But, that’s just my not-so-humble opinion.
I’ve just discovered I started something over three years ago. Unsurprisingly, my memory has long since paged it out — and is unable to page it back in. Probably best to have a quick chat and figure out what (if anything) still needs doing — and when.
I think this needs to be done at the instruction level, also because levels B and C will share the same instruction set. We can then choose not to generate the non-strict instructions for level B, but the problem at the moment is that every "=" is treated as an "==".