lci
lci copied to clipboard
String and colon
I have a string, which sometimes comes with a leading :
. So I want to check for this, but right now it's not working. Here's my testing code. It doesn't matter if I get the String with GIMMEH
or a file.
HAI 1.3
CAN HAS STRING?
I HAS A string ITZ "::hello"
I HAS A char ITZ I IZ STRING'Z AT YR string AN YR 0 MKAY
VISIBLE char AN " - " AN "::"
BOTH SAEM char AN "::", O RLY?
YA RLY
VISIBLE "success"
NO WAI
VISIBLE "fail"
OIC
VISIBLE char AN " - " AN ":(3A)"
BOTH SAEM char AN ":(3A)", O RLY?
YA RLY
VISIBLE "success"
NO WAI
VISIBLE "fail"
OIC
BTW gives error: VISIBLE ":[char]" AN " - " AN "COLON"
BOTH SAEM ":[char]" AN "COLON", O RLY?
YA RLY
VISIBLE "success"
NO WAI
VISIBLE "fail"
OIC
I HAS A char2 ITZ I IZ STRING'Z AT YR string AN YR 1 MKAY
VISIBLE char2 AN " - " AN "h"
BOTH SAEM char2 AN "h", O RLY?
YA RLY
VISIBLE "success"
NO WAI
VISIBLE "fail"
OIC
I HAS A char3 ITZ "::"
VISIBLE char3 AN " - " AN "::"
BOTH SAEM char3 AN "::", O RLY?
YA RLY
VISIBLE "success"
NO WAI
VISIBLE "fail"
OIC
KTHXBYE
Any idea?
Thanks for mentioning this. I think the problem for the first and second comparison is that the second string is not getting interpolated before being compared. I added some debug output to the string comparison function to look at exactly what was getting compared, and this is what I found:
: - :
comparing ':' and '::'
fail
: - :
comparing ':' and ':(3A)'
fail
comparing ':[char]' and 'COLON'
fail
h - h
comparing 'h' and 'h'
success
: - :
comparing '::' and '::'
success
(For the third case, :[char] is attempting to use "char" as a unicode normative name and that form is used to go from normative name --> character but there is not a way right now to go the other way -- from character --> normative name.)
The specification does not state whether string values being compared should be interpolated or not. Right now, strcmp
is used to compare two uninterpolated strings.
This actually poses an interesting issue because if interpolation is never performed on string comparison (as is currently the case), then it's possible to compare an interpolated string ':' to an uninterpolated string '::' and get incorrect results. However, if interpolation is always performed on string comparison, then an interpolated string (':') would cause an error because a single colon is invalid syntax.
To solve this, the first thing that comes to my mind is a bit hacky. It would be to modify the ValueObject structure to include another field that denotes whether a string contained in the ValueObject has been interpolated or not. Then, make the string interpolation code ignore strings that have already been interpolated and always interpolate strings when performing a comparison.
I'll need to think this through before its implemented, but thanks for raising the issue.
Take your time. I don't work on something important, just noticed this strange behavior.
I just noticed that this happens with every escape-char. Smileys like :(
or :[
let the program crash because the closing bracket isn't available. If it is available, the program trys to get a hex out of the string.
How about converting the :
to a ::
in every string we get from a FILO? With this we can be sure, the string is as intended and we can compare it inside the code.
That could be one potential solution. Basically what you're suggesting is internally representing escape characters as how the user would type them instead of their ASCII code.
One issue could be that a single character (like :
) can expand to multiple representations under the current spec, as you've pointed out in your code (::
, :(3A)
, :[COLON]
) so we would need to specify this behavior. It seems reasonable to expand such characters to their simple form ::
, :)
, and so on.
Assuming there aren't any odd corner cases, this approach seems relatively clean and will probaly be the first thing I try. Thanks for the suggestion!
How's it going? Is there some progress?
Thanks for reminding me of this, I have some more time now so I will see if I can whip up a diff for this.
Anything new on this?
So! After looking at the code, the solution I ended up going with is a bit simpler than what we discussed. I think at some point, strings were interpolated whenever they were evaluated but I think I did some performance testing and found that a lot of time was spent interpolating strings that were just getting passed around (and not being compared). I put a note about this in the code a long time ago: https://github.com/justinmeza/lci/blob/master/interpreter.c#L1871-L1875 .
Anyway, the solution I went for is to check in interpretEqualityOpExprNode() to see if the values being compared are strings, and if so, to interpret them using castStringExplicit() before comparing them.
Note that BOTH SAEM ":[char]" AN "COLON", O RLY?
will cause an error because the :[NAME]
expansion only goes one way (according to the LOLCODE spec, this could always be changed): from a Unicode normative name to a character, but not from a character to a Unicode normative name.
Anyway, the output after the change (and commenting out the :[char]
test) is:
: - :
success
: - :
success
h - h
success
: - :
success
I'm going to make a unit test for this and then commit the diff.
I'm really sorry to get back on this, but I want to use this, where I'll get a string, which is :[ ] x [ ] y
- this will still crash my program. Even after the latest commit.
No worries! I completely forgot you mentioned that as well. I will check it out.
Hey @Lucki, is there an example where using the colon escape ::
does not work for what you want to do?
For example, to output the string you mention, I would do VISIBLE "::[ ] x [ ] y"
.
I'll get it from a string outside the program.
I'll get :
in a string, which isn't represented as a ::
inside my program.
Here a quick example:
HAI 1.4
CAN HAS STRING?
I HAS A string
GIMMEH string
VISIBLE string
KTHXBYE
Giving the string :[ ] x [ ] y
will crash the program.
Gotcha. Check out the latest commit on the future branch and let me know if there are any corner cases that it does not handle.
Is there a difference if I get this string over a socket?
It works now with GIMMEH
, but GET
still crashes with this string.
OK, I added some code that sanitizes the inputs from outside sources (currently STDIO'Z LUK
and SOCKS'Z GET
). This should properly escape the colon character on input!
Thank you, seems to work so far :+1:
I'm really sad because I've found another use case with this problem:
HAI 1.4
CAN HAS STRING?
I HAS A string ITZ SMOOSH "::" AN "[" MKAY
VISIBLE string
KTHXBYE
Expected closing square bracket after :[