Potential further generalization of char_type/2

Open: triska opened this issue 1 year ago • 1 comment

Thanks to the suggestion made by @librarianmage in #2321, char_type/2 is now significantly more general than before.

Currently, at least one argument of char_type/2 must be instantiated.

It is worth thinking about whether the predicate should be generalized even further and yield all solutions when asked in the most general way, for which it currently yields an instantiation error:

?- char_type(Char, Type).
   error(instantiation_error,must_be/2).

Since the set of solutions is finite, it is possible to enumerate all solutions. Should we though? Such a generalization could help answer questions such as "Which characters are supported at all by Scryer Prolog?", for which there currently is no easy answer within Prolog itself. Note that there is even a gap in the set of valid character codes! (See #2326.)

On the flip side, unintentionally calling the predicate with both arguments uninstantiated in Prolog code may unexpectedly slow down programs significantly because there are so many characters, and all of them must be enumerated. It may be better to yield an exception, since the most common way of invocation would seem to be one where at least one argument is instantiated, for instance because the character was previously read from a file or network stream.
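
To make the trade-off concrete, here is a minimal sketch of how the most general query could be emulated in user code today. It assumes that the highest character code is 0x10FFFF and that char_code/2 either fails or throws for codes in the gap. The name all_char_type/2 is hypothetical and not part of library(charsio):

```prolog
:- use_module(library(between)).
:- use_module(library(charsio)).

% all_char_type/2 is a hypothetical wrapper, not an existing predicate.
% If both arguments are variables, it walks over every candidate code,
% skips codes that do not denote a character, and then asks char_type/2
% with an instantiated character. Otherwise it simply delegates.
all_char_type(Char, Type) :-
    (   var(Char), var(Type) ->
        between(0, 0x10FFFF, Code),   % assumed upper bound of character codes
        catch(char_code(Char, Code), error(_, _), false),
        char_type(Char, Type)
    ;   char_type(Char, Type)
    ).
```

Keeping the enumeration under a separate name would leave char_type/2 itself strict, so an accidental call with two variables would still raise an error instead of silently iterating over more than a million codes.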

There is no hurry to resolve this immediately in one way or the other. The most interesting point may be the underlying design question: Are there any general principles that could guide such design decisions for this predicate and also others? Or do we have to decide on a case-by-case basis for each predicate for which further generalizations are possible?

Is there any use for a generalized version of char_type/2 in addition to the one I mentioned above? Any additional points or suggestions, analogies or similarities with other predicates? I would greatly appreciate your input. Thank you a lot!

triska, Feb 11 '24 21:02

This would possibly be useful in my project constrained.pl as a way to implement a clpz:label-like predicate for characters, strings and atoms. There are far too many chars to make this useful in general, but in theory I could implement char_type_c/2 and other similar constraints that limit the possible range of a character to make this more manageable. This would make it very easy to do stuff like "list all atoms of length 5 to 7 which only contain alphanumeric characters".

Edit: I just realized that the new changes already allow the "alphanumeric characters" use case and similar, which will probably be more common than the "any character supported" use case which is what this issue is about.
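
A rough sketch of that use case with the current char_type/2, assuming that char_type(C, alphanumeric) enumerates characters on backtracking when C is unbound. alnum_char/1 and alnum_atom/3 are hypothetical helper names:

```prolog
:- use_module(library(between)).
:- use_module(library(charsio)).
:- use_module(library(lists)).

% Hypothetical helper: true if C is an alphanumeric character.
alnum_char(C) :-
    char_type(C, alphanumeric).

% Hypothetical helper: Atom consists only of alphanumeric characters
% and its length lies between Min and Max (inclusive).
alnum_atom(Min, Max, Atom) :-
    between(Min, Max, Len),
    length(Chars, Len),
    maplist(alnum_char, Chars),
    atom_chars(Atom, Chars).
```

A query such as `?- alnum_atom(5, 7, Atom).` would then produce candidates one at a time on backtracking. The number of solutions is still enormous, so this only shows the shape of the idea rather than a practical labeling strategy.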

bakaq, Feb 11 '24 21:02