scryer-prolog
¤ cannot be used
Currently, I get:
?- X = ¤.
error(syntax_error(unexpected_char),read_term/3:0).
?- X = '¤'.
error(syntax_error(invalid_single_quoted_character),read_term/3).
?- X = "¤".
error(syntax_error(missing_quote),read_term/3:0).
Is there anything special about this character? Why can it not be used like other symbols/currency characters such as $?
For comparison, I get with GNU Prolog:
| ?- X = '¤'.
X = '¤'
yes
On the other hand, the toplevel reports the character when asked for it in a different way:
?- char_code(C, 164).
C = '¤'.
This means that such answer substitutions currently cannot be pasted back as queries.
We need to identify the class of characters that can be written directly within single or double quotes.
non quote char (* 6.4.2.1 *)
   = graphic char (* 6.5.1 *)
   | alphanumeric char (* 6.5.2 *)
   | solo char (* 6.5.3 *)
   | space char (* 6.5.4 *)
   | meta escape sequence (* 6.4.2.1 *)
   | control escape sequence (* 6.4.2.1 *)
   | octal escape sequence (* 6.4.2.1 *)
   | hexadecimal escape sequence (* 6.4.2.1 *) ;
which is what is used to define those characters that can appear in such quoted context. So extended characters (6.5) that look like characters may be added to non quote char but neither to one of those defined above like graphic char or alphanumeric char.
You can also extend graphic char and alphanumeric char. Just read the second paragraph of extended characters (6.5):
It says the contrary of what you wrote, when you wrote "but neither to one of those defined above like graphic char or alphanumeric char".
If you use the Unicode database, the character is classified as CURRENCY_SYMBOL.
SWI Prolog and Dogelog Player put CURRENCY_SYMBOL in the category graphic char. A typical currency symbol is $ which behaves like this:
?- X = $$$ .
X = $$$ .
?- X = $$$6 .
ERROR: Syntax error: Operator expected
?- X = '$$$6' .
X = '$$$6'.
Now ¤ is also a currency symbol just like $, so it could behave the same. That's why both Prolog systems behave as follows:
?- X = ¤¤¤ .
X = ¤¤¤ .
?- X = ¤¤¤6 .
ERROR: Syntax error: Operator expected
?- X = '¤¤¤6' .
X = '¤¤¤6'.
So when you put it into the category graphic char, there is no need to use quotes around it, as long as the atom is only a sequence of graphic characters.
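The rule described above can be sketched roughly as follows. This is only an illustration in Python, not how SWI-Prolog, Dogelog Player or Scryer Prolog actually implement it. The names `is_graphic` and `is_symbol_atom` are made up for this sketch, and the choice of treating exactly the categories Sc and Sm as graphic is an assumption.

```python
import unicodedata

# ASCII graphic chars from ISO Prolog 6.5.1, plus backslash, which the
# standard lists separately as a graphic token char.
ASCII_GRAPHIC = set("#$&*+-./:<=>?@^~\\")

# Assumption for this sketch: non-ASCII characters in these Unicode general
# categories are treated as graphic (Sc = currency symbol, Sm = math symbol).
EXTENDED_GRAPHIC_CATEGORIES = {"Sc", "Sm"}


def is_graphic(ch):
    """True if ch would count as a graphic char under the assumptions above."""
    if ch in ASCII_GRAPHIC:
        return True
    # The ASCII range is fixed by the standard, so only extend beyond it.
    # (For example '|' has category Sm but is not a graphic char in Prolog.)
    return ord(ch) > 127 and unicodedata.category(ch) in EXTENDED_GRAPHIC_CATEGORIES


def is_symbol_atom(text):
    """True if text is a non-empty run of graphic chars, i.e. needs no quotes."""
    return len(text) > 0 and all(is_graphic(ch) for ch in text)
```

With these assumptions, `$$$` and `¤¤¤` are accepted as unquoted symbol atoms, while `¤¤¤6` is not and has to be quoted. Special cases such as a lone `.` acting as the end token are ignored here.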
Interestingly Trealla has no problem reading it, unlike Scryer Prolog, but it seems to me there is also a little glitch in the writing:
https://github.com/trealla-prolog/trealla/issues/26
¬ has the same problem (see also #1591):
?- char_code(C, 172).
C = '¬'.
?- C = '¬'.
error(syntax_error(invalid_single_quoted_character),read_term/3).
This is a similar problem, but not exactly the same solution: if Scryer Prolog were to use a Unicode database, you would need to map a different Unicode general category. You can check with SWI-Prolog:
?- unicode_property(¬, category(X)).
X = 'Sm'.
?- unicode_property(¤, category(X)).
X = 'Sc'.
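The same lookup can be reproduced outside of Prolog, for instance with the Unicode database that ships with Python. The helper name `general_category` is just for this sketch. The character codes 36, 164 and 172 are `$`, `¤` and `¬`.

```python
import unicodedata


def general_category(code):
    """Return the Unicode general category for a character code."""
    return unicodedata.category(chr(code))


# Print code, character and category for $, ¤ and ¬.
for code in (36, 164, 172):
    print(code, chr(code), general_category(code))
```

This reports Sc for both 36 and 164 and Sm for 172, which agrees with the SWI-Prolog answers above.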
You can also extend graphic char and alphanumeric char.
That is a possibility, and for alphanumeric chars Scryer already does this. The question here is rather whether it makes sense to extend graphic char, which may make the source code much less readable and reliable. Libraries like TPTP have refrained from this, even though it was suggested in related discussions.
The other question is which non-terminals may be extended. In 6.5, graphic char, alphanumeric char, solo char, layout char and meta char are mentioned. From NOTE 2 it becomes evident that small letter char and capital letter char can also be extended separately. So far in the standard there is no character that can be used in a quoted context but not in another context. But here the mentioned non quote char may be a safer choice for extension. This would also be a bit closer to the way other programming languages like C (WG14 N1518) do it.
Things would get even more complex if solo char were also extended, because some new characters would then be graphic and others solo. Sticking to a more conservative extension seems preferable. Such a more conservative extension does not rule out the use of such symbols, but requires that quotes are used to make them more visible.
TPTP (as of v8.1.0.0) is not a Unicode adaptation; it doesn't make use of the Unicode database.
For example it has:
<lower_alpha> ::: [a-z]
https://www.tptp.org/TPTP/SyntaxBNF.html#lower_alpha
But in Unicode one has, for the range 0-255 (Basic Latin & Latin-1 Supplement Block):
<lower_alpha> ::: [abcdefghijklmnopqrstuvwxyzµßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]
Hope this helps!
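For what it's worth, the Latin-1 list above can be regenerated from the Unicode database by collecting every code point in 0-255 whose general category is Ll (lowercase letter). A small Python sketch follows. The function name is made up, and the exact result depends on the Unicode version bundled with the Python build.

```python
import unicodedata


def latin1_lowercase_letters():
    """Collect all code points in 0-255 with general category Ll."""
    return "".join(
        chr(code) for code in range(256)
        if unicodedata.category(chr(code)) == "Ll"
    )


print(latin1_lowercase_letters())
```

Note that ÷ (code 247) sits in the middle of the accented letters but is a math symbol, so it is skipped.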
There is no TPTP-Unicode yet. But there are many Prolog-Unicode already!
If you would like to be TPTP compatible (versus v8.1.0.0), you would need to ban this:
$ target/release/scryer-prolog -v
"v0.9.0-181-g8e9302ea"
$ target/release/scryer-prolog
?- X = hörgerät.
X = hörgerät.
So Scryer Prolog is now somewhere between TPTP and a Unicode Prolog, neither fish nor fowl. For example, Scryer Prolog cannot parse this:
$ cat text.pl
text("«The Logos of Cybele is the idea that the Great Mother creates\n\
and kills everything. It is not eternity (Apollo) or the circle\n\
(Dionysus), but something that acts in her way with blind\n\
and absolute power. A form of progress: bottom-up growth.\n\
We are experiencing the final attack of Cybele, of the Great Risen\n\
Mother, with feminism, artificial intelligence, globalization,\n\
democracy, liberalism, and so on»").
I get this error:
$ target/release/scryer-prolog
?- ['text.pl'].
error(syntax_error(missing_quote),read_term/3:0).
false.
Works fine in SWI-Prolog:
?- text(X), write(X), nl, fail; true.
«The Logos of Cybele is the idea that the Great Mother creates
and kills everything. It is not eternity (Apollo) or the circle
(Dionysus), but something that acts in her way with blind
and absolute power. A form of progress: bottom-up growth.
We are experiencing the final attack of Cybele, of the Great Risen
Mother, with feminism, artificial intelligence, globalization,
democracy, liberalism, and so on»
true.
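To narrow down which characters in such a file might trip up a reader, one can list the non-ASCII characters together with their Unicode general categories. The following Python sketch does that. `non_ascii_categories` is a hypothetical helper name, and whether these particular characters are what Scryer Prolog stumbles over is only a guess.

```python
import unicodedata


def non_ascii_categories(text):
    """Map each non-ASCII character in text to its Unicode general category."""
    return {ch: unicodedata.category(ch) for ch in text if ord(ch) > 127}


# The guillemets from text.pl, written as escapes to avoid encoding issues.
print(non_ascii_categories("\u00abThe Logos of Cybele\u00bb"))
```

For the text above only the guillemets « and » show up, with categories Pi and Pf (initial and final quote punctuation), so they are neither letters nor symbols.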
This works perfectly now, thank you a lot!
?- X = ¤.
X = ¤.
I am reopening this because I think it is expected (https://github.com/mthom/scryer-prolog/issues/1749#issuecomment-1484061231) that the symbol cannot be part of a letter token:
?- X = ¤a.
X = ¤a, unexpected.
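The expectation here is that a symbol character and a letter never end up in the same token. A rough way to picture this is to split the input into runs of characters of the same class. The Python sketch below is an illustration only, not Scryer Prolog's lexer. The names `char_class` and `split_runs` and the choice of categories Sc and Sm for symbols are assumptions.

```python
import unicodedata
from itertools import groupby


def char_class(ch):
    """Classify a character as 'alnum', 'symbol' or 'other' (sketch only)."""
    if ch == "_" or ch.isalnum():
        return "alnum"
    # Assumption: currency and math symbols form symbol tokens.
    if unicodedata.category(ch) in {"Sc", "Sm"}:
        return "symbol"
    return "other"


def split_runs(text):
    """Split text into maximal runs of characters from the same class."""
    return ["".join(group) for _, group in groupby(text, key=char_class)]


print(split_runs("\u00a4a"))
```

Under this view `¤a` is two adjacent tokens, `¤` and `a`, so reading it as the single atom `¤a` is unexpected, whereas `hörgerät` stays one token.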
Now it seems to work really perfectly, thank you a lot!