scryer-prolog
scryer-prolog copied to clipboard
Alternative representation of library(json) objects as ordered pairs.
Alternative to #930 - Only one of these should be merged!
@triska I wonder if ordered pairs will be easier than association lists for folks to work with?
Pros of ordered pairs:
- Can be written by hand directly
- Can reason about them directly without worrying about internal representations
- Can take advantage of first argument indexing
- Can be converted to association lists by users who need association lists
Pros of association lists:
- Enforce unique keys constraint
- Efficient and predictably scaling access to elements, logarithmic in the number of elements.
If you think ordered pairs are the way to go, take a look at finite_list/1
in the source code and its use to adjust the ordered pairs search strategy. Would you do it differently?
This looks good too, and is very intuitive! If this is sufficient in practical JSON use, then this is a great representation!
I only have a comment on the name: What is now called "finite list" or "instantiated finite list" is simply called a list in Prolog, so I recommend renaming finite_list/1
to list/1
, and using the definition of instantiated_finite_list/1
for it, since it holds if and only if the argument is a list. There is no need for a nonvar/1
test here, instead simply symbolically state the cases for which the predicate holds.
The name "keysorted list" seems more telling than "ordered pairs".
Thank you @triska!
I think this representation is sufficient for practical JSON use - JSON objects with more than about 20 keys are highly abnormal in my experience.
Declaratively, the relationship between a keysorted list inside Prolog and JSON object members is keysort(InternalList, JsonPairs)
.
keysort/2
by itself wouldn't work though, primarily because it can't generate unsorted lists from keysorted ones.
My solution was to use the logically equivalent permutation/2
to generate unsorted lists from keysorted ones. To do that, I came up with finite_list/1
to test for a keysorted list of some concrete length as opposed to a list with a tail variable. Any uninstantiated keysorted list or a list with a tail variable would cause permutation/2
to blow up with lists of ever-increasing lengths on backtracking and lead to nontermination.
I hope what I'm trying to do makes sense... Really looking forward to your feedback about how to get there! :)
- [x] Aram to rename "ordpairs" to "keysorted" throughout the code
If the tail is a variable, then it is called a partial list.
A list follows the inductive definition:
3.62 empty list: The atom [] (nil). ... 3.99 list: Either the empty list or a non-empty list. ... 3.114 non-empty list: A compound term whose prin- cipal functor is the list constructor and whose second argument is a list.
That's interesting, I didn't know a partial list was excluded from the definition of a list!
In that case, would it be a best practice to use catch(must_be(list, OrdPairs), error(instantiation_error,_), false)
in place of finite_list(OrdPairs)
?
An instantiation error cannot be replaced by silent failure, since that would yield unsound results.
Raising an instantiation error is absolutely no problem, and in fact the desirable response if no sound decision can be made: An instantiation error only means that an instantiation could lead to an answer, for example if we run the Prolog program with a different execution strategy, or if more becomes known about the data. The "control" is not something we implement within the program, but something the Prolog system itself can apply and change flexibly, as long as all used predicates give sound results in all modes.
You can use list_si/1
from library(si)
to safely test for lists: It succeeds if the argument is a list, it fails if it is not a list, and it raises an instantiation error if no sound answer is possible (at this point in the execution). It is completely safe to use this check also as condition in if-then-else constructs. With "safe", I mean that you will not incorrectly rule out solutions.
For example, ( integer(X) -> Then ; Else )
is not a sound construct: If X
is not instantiated, it will incorrectly rule out the "Then" branch, since integer/1
(incorrectly) fails in this case. On the other hand, it is completely safe to write this as ( integer_si(X) -> Then ; Else )
, because integer_si/1
will throw an instantiation error if X
is not instantiated, thus indicating that there could be a solution if X
is instantiated.
The need to raise instantiation errors in such cases was well understood as early as 1973, see:
G. Battani, H. Meloni, 1973: Interpréteur du langage de programmation Prolog, Rapport de D.E.A. d'Informatique Appliquée. Groupe d'Intelligence Artificielle, U.E.R. de Luminy. Université d'Aix-Marseille.
library(si)
is how we finally get this desirable property from type tests.
I'm on board with instantiation errors, however the problem is that raising an instantiation error in this DCG would render it unusable for parsing. I'll have to give this some more thought...
- [x] Don't merge until a better way to deal with instantiation checks is implemented
Using string_si/1
from #943, I believe the below definition for uniquestringkeysortedlist_si/1
would work to check whether a variable is a list of pairs sorted by unique string keys soundly (ignoring the parsing issue for now).
:- use_module(library(pairs)).
uniquestringkeysortedlist_si(Pairs) :-
list_si(Pairs),
pairs_keys(Pairs, Keys),
maplist(string_si, Keys),
sort(Keys, Keys).
OK @triska, take a look at json_object//1
now and tell me whether I succeeded in preserving inference soundness while also retaining the ability to both parse and generate with one DCG.
My idea was simply to delay the instantiation error, if there was one, until we've had a chance to process the information from the body of the DCG.
With the addition of type errors we also get lots of nice checks like key uniqueness when both parsing and generating:
?- phrase(json_chars(keysorted(["x"-null,"x"-null])), Cs).
caught: error(type_error(uniquestringkeysortedlist,["x"-null,"x"-null]),json_object//1)
?- phrase(json_chars(J), "{\"x\":null,\"x\":null}").
caught: error(type_error(uniquestringkeysortedlist,["x"-null,"x"-null]),json_object//1)
This is not really relevant to the PR, but I'm just very excited that schemas are slowly starting to become possible with JSON objects as keysorted lists:
?- Msg = keysorted(["lat"-Lat, "lon"-Lon]), freeze(Lon, (number(Lon), Lon >= 0, Lon < 360)), freeze(Lat, (number(Lat), Lat >= -90, Lat =< 90)), phrase(json_chars(Msg), "{\"lon\":34,\"lat\":-50}").
Msg = keysorted(["lat"- -50,"lon"-34]), Lon = 34, Lat = -50
; false.
?- Msg = keysorted(["lat"-Lat, "lon"-Lon]), freeze(Lon, (number(Lon), Lon >= 0, Lon < 360)), freeze(Lat, (number(Lat), Lat >= -90, Lat =< 90)), phrase(json_chars(Msg), "{\"lon\":-34,\"lat\":-50}").
false.
A declarative construct like when/2
could be useful to delay checks in such cases: when(Condition, Goal)
.
freeze/2
can be regarded as a special case of when/2
, namely when(ground(Var), Goal)
.
Whenever you say anything, @triska, my first reaction is "that's impossible!" and then sometime later "that was obvious, why didn't I think of that?" 😅
Take a look at json_object//1
now and see if it meets your expectations.
Awesome, thank you a lot!
Now we just need string_si/1
(#943) to be finished first.
While I'm waiting for string_si/1
to be merged, I also rewrote json_character//1
in a similar fashion to use freeze/2
instead of instantiation checks.
Awesome, this all looks great!