
Request: Prolog Parser

Open TrentSe opened this issue 5 years ago • 27 comments

Creating an issue for extending universal-ctags to handle Prolog [1].

... as per Masatake's request [2]

[1] https://en.wikipedia.org/wiki/Prolog [2] https://github.com/universal-ctags/ctags/issues/1566#issuecomment-678057425

TrentSe avatar Aug 25 '20 02:08 TrentSe

SWI Prolog syntax grammar/BNF?

http://swi-prolog.996271.n3.nabble.com/SWI-Prolog-syntax-grammar-BNF-td11598.html

TrentSe avatar Aug 25 '20 02:08 TrentSe

SWI-Prolog provides a clone of Emacs that (I believe*) parses Prolog:

https://www.swi-prolog.org/pldoc/man?section=pceemacs

https://en.wikipedia.org/wiki/SWI-Prolog#PceEmacs

* I haven't used it; I'm familiar with Vim.

TrentSe avatar Aug 25 '20 02:08 TrentSe

Thank you.

I will write an initial, minimal version of a Prolog parser. I expect you to complete it. I would like you to read #2622. That issue explains how important the design phase is in parser development.

Taken from [1]:

/* input.pl */
mother_child(trude, sally).
 
father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).
 
sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).
 
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

What kind of tags output do you expect? What should be tagged? What "kinds" should be assigned to the tags?

masatake avatar Aug 25 '20 04:08 masatake

Hey Masatake,

Cheers for coming back so quickly. I'll read #2622.

Thanks also for your efforts writing the minimum parser. I'll do what I can to complete it, though will probably need assistance.

TrentSe avatar Aug 25 '20 04:08 TrentSe

I'm not that familiar with ctags, so may not answer your question correctly.

Tags:

predicates: mother_child, father_child etc.
atoms: tom, sally
variables: X, Y

Translation from Prolog speak:

"predicate" -> method (Prolog has no functions)
"atom" -> static value (distinct from a string)
"variable" -> :)
string -> string (text surrounded with ", for example "this is a SWI-Prolog string")

Text surrounded with ' is treated as an atom. abc and 'abc' are equivalent.
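The naming rules above can be sketched as a small token classifier. This is only an illustration in Python, not part of ctags; the function name `classify` is hypothetical, and the rule that a leading underscore also marks a variable comes from standard Prolog syntax rather than from this thread.

```python
def classify(token):
    """Classify a single Prolog token by its spelling (illustrative only)."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return "string"    # "text" is a string in SWI-Prolog
    if len(token) >= 2 and token.startswith("'") and token.endswith("'"):
        return "atom"      # 'abc' is the same atom as abc
    if token[:1].isupper() or token.startswith("_"):
        return "variable"  # X, Y, Name, _Ignored
    if token[:1].islower():
        return "atom"      # tom, sally
    return "other"         # numbers, operators, punctuation


for tok in ["tom", "X", "'abc'", '"this is a SWI-Prolog string"']:
    print(tok, "->", classify(tok))
```

A real parser would also have to handle escapes inside quotes, which this sketch ignores.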

TrentSe avatar Aug 25 '20 04:08 TrentSe

I note that Prolog defines modules [1], for example:

36 :- module(charsio,
37           [ format_to_chars/3,    % +Format, +Args, -Codes
38             format_to_chars/4,    % +Format, +Args, -Codes, ?Tail
39             write_to_chars/2,     % +Term, -Codes
40             write_to_chars/3,     % +Term, -Codes, ?Tail
41             atom_to_chars/2,      % +Atom, -Codes
42             atom_to_chars/3,      % +Atom, -Codes, ?Tail
43             number_to_chars/2,    % +Number, -Codes

The above is the beginning of the definition of the module 'charsio' [2].

Predicates listed in the list (from line 37 onwards) are added to the global namespace. Predicates not listed are still accessible, by prepending the module name. For example charsio:some_other_predicate.

Fyi, the number following the predicate name (e.g. atom_to_chars/3) is the arity; it specifies the number of arguments....

[1] https://www.swi-prolog.org/pldoc/man?section=modules [2] https://www.swi-prolog.org/pldoc/doc/SWI/library/charsio.pl?show=src
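To make the name/arity notation concrete, here is a minimal Python sketch that splits such an indicator into its parts. The helper name `parse_indicator` is hypothetical, and accepting a combined `module:name/arity` form is an assumption made for illustration.

```python
def parse_indicator(text):
    """Split 'name/arity' (optionally 'module:name/arity') into its parts."""
    module, colon, rest = text.rpartition(":")
    name, slash, arity = rest.rpartition("/")
    if not slash or not arity.isdigit():
        raise ValueError(f"not a predicate indicator: {text!r}")
    # module is None when no 'module:' qualifier was given
    return (module if colon else None, name, int(arity))


print(parse_indicator("atom_to_chars/3"))
print(parse_indicator("charsio:format_to_chars/4"))
```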

TrentSe avatar Aug 25 '20 04:08 TrentSe

If you have the time, here is a quick intro to Prolog:

https://www.youtube.com/watch?v=SykxWpFwMGs

It's 1 hour long, but you wouldn't need to watch anything like that much to see most of the syntax...

TrentSe avatar Aug 25 '20 05:08 TrentSe

Thank you, but my Prolog knowledge is sufficient. What I really need is the expected tags output. That is the area where I cannot help you.

masatake avatar Aug 25 '20 05:08 masatake

Let's focus on smaller input:

mother_child(trude, sally).

What should be tagged? If we had a perfect Prolog parser in ctags, which tokens should ctags capture?

masatake avatar Aug 25 '20 05:08 masatake

Here's the syntax documentation for GNU Prolog:

http://gprolog.org/manual/html_node/gprolog019.html

TrentSe avatar Aug 25 '20 05:08 TrentSe

mother_child(trude, sally).

If we had a perfect Prolog parser in ctags, which tokens should ctags capture?

I'm not that familiar with ctags yet. I can read more to be more helpful.

From what I know now, "mother_child" would be tagged as the equivalent of a C method, trude and sally as the equivalent of C strings.

... would it be helpful for me to write an "equivalent" C program and run it through u-ctags xref...?

My C is rusty, but it would be something like:

void mother_child( "trude", "sally" );

I know this isn't valid C; but trude and sally aren't variables here.

By comparison,

void mother_child( "trude", char *Name ) {
    printf( "%s\n", Name );
}

would be:

mother_child( trude, Name ) :- writeln( Name ).

in Prolog.

TrentSe avatar Aug 25 '20 05:08 TrentSe

O.k. I wrote a minimal version of a Prolog parser.

$ cat input.pl 
mother_child(trude, sally).
$ cat prolog.ctags
--langdef=Prolog
--map-Prolog=.pl
--kinddef-Prolog=p,predicate,predicates
--regex-Prolog=/^([a-zA-Z_]+)\([^.]+\)./\1/p/
$ ./ctags --options=./prolog.ctags --languages=Prolog -o - input.pl 
mother_child	input.pl	/^mother_child(trude, sally).$/;"	p	language:Prolog

Is this the same as what you expect?
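For readers who want to experiment with the pattern outside ctags, the following Python sketch applies the same regular expression line by line. It only approximates what `--regex-Prolog` does; the function name `tag_lines` and the extra sample lines are made up for illustration.

```python
import re

# Same pattern as in prolog.ctags: a name, an argument list, then one character.
PATTERN = re.compile(r"^([a-zA-Z_]+)\([^.]+\).")


def tag_lines(source):
    """Return (name, line) for every line the pattern matches."""
    tags = []
    for line in source.splitlines():
        m = PATTERN.match(line)
        if m:
            tags.append((m.group(1), line))
    return tags


SOURCE = """mother_child(trude, sally).
parent_child(X, Y) :- father_child(X, Y).
a.
"""

for name, line in tag_lines(SOURCE):
    print(f"{name}\t/^{line}$/")
```

Two observations from this sketch: a clause without an argument list, such as `a.`, is not matched, and because the final `.` in the pattern is unescaped it matches any character, so a line like `foo(bar)x` would also be tagged.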

masatake avatar Aug 25 '20 05:08 masatake

That was quick!

TrentSe avatar Aug 25 '20 05:08 TrentSe

... I also think I see what you're doing. I can build on that...

TrentSe avatar Aug 25 '20 05:08 TrentSe

Let's extend the input a bit.

parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

The first question is whether parent_child should be tagged twice or once.

I guess you may want to have an arity: field like:

parent_child input.pl  /^...$/;"  p arity:2

Am I correct? If so, what should I do for this input:

mother_child( trude, Name ) :- writeln( Name ).

Is the arity for mother_child 1 or 2?

masatake avatar Aug 25 '20 05:08 masatake

parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

The first question is whether parent_child should be tagged twice or once.

It should be tagged twice; they're multiple definitions of the predicate (method).

I guess you may want to have an arity: field like:

parent_child input.pl /^...$/;" p arity:2

Maybe, though I'm not sure how that would be used. Arity is just the count of predicate arguments...

mother_child( trude, Name ) :- writeln( Name ).

Is the arity for mother_child 1 or 2?

2 - the size of the argument list [ trude, Name ].

Everything after the :- is the "implementation" (in C terms... 😄 ).
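As a rough illustration of that rule, the arity can be computed by looking only at the head (the part before `:-`) and counting top-level commas. This Python sketch is not how ctags does it; `head_arity` is a hypothetical helper, and it does not handle quoted atoms that contain commas, parentheses, or `:-`.

```python
def head_arity(clause):
    """Count the arguments in the head of a clause such as 'p(a, X) :- q(X).'"""
    # Keep only the head: drop the body after ':-' and the final '.'.
    head = clause.split(":-", 1)[0].strip().rstrip(".").strip()
    open_paren = head.find("(")
    if open_paren == -1:
        return 0  # a bare atom such as 'a.' takes no arguments
    depth, count = 0, 1
    for ch in head[open_paren + 1:]:
        if ch in "([":
            depth += 1          # entering a nested term or list
        elif ch in ")]":
            if depth == 0:
                break           # closing parenthesis of the head itself
            depth -= 1
        elif ch == "," and depth == 0:
            count += 1          # only top-level commas separate arguments
    return count


print(head_arity("mother_child( trude, Name ) :- writeln( Name )."))
print(head_arity("p([a, b], f(X, Y))."))
```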

TrentSe avatar Aug 25 '20 05:08 TrentSe

$ cat input.pl
/* input.pl */
mother_child(trude, sally).

father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).
 
sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).
 
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

% dummy0()
/* dummy1() */

$ cat optlib/prolog.ctags
--langdef=Prolog

--map-Prolog=.pl

###
# kind definitions
#
--kinddef-Prolog=p,predicate,predicates
--kinddef-Prolog=v,variable,variables

###
# table declarations
#
--_tabledef-Prolog=main
--_tabledef-Prolog=args
--_tabledef-Prolog=impl
--_tabledef-Prolog=comment
--_tabledef-Prolog=comment_multiline
--_tabledef-Prolog=comment_oneline
--_tabledef-Prolog=any
--_tabledef-Prolog=ignoreWhiteSpace

###
# utilities
#
--_mtable-regex-Prolog=any/.//
--_mtable-regex-Prolog=ignoreWhiteSpace/[ \t\n]+//

###
# comment
#
--_mtable-regex-Prolog=comment/\/\*//{tenter=comment_multiline}
--_mtable-regex-Prolog=comment/\%//{tenter=comment_oneline}

--_mtable-regex-Prolog=comment_multiline/\*\///{tleave}
--_mtable-extend-Prolog=comment_multiline+any

--_mtable-regex-Prolog=comment_oneline/\n//{tleave}
--_mtable-extend-Prolog=comment_oneline+any


###
# main
#
--_mtable-extend-Prolog=main+comment
--_mtable-extend-Prolog=main+ignoreWhiteSpace
--_mtable-regex-Prolog=main/([a-zA-Z_][a-zA-Z_0-9]*)/\1/p/{scope=push}
--_mtable-regex-Prolog=main/\(//{tenter=args}
--_mtable-regex-Prolog=main/\.//{scope=pop}
--_mtable-regex-Prolog=main/:-//{tenter=impl}
--_mtable-extend-Prolog=main+any

###
# args
#
--_mtable-extend-Prolog=args+comment
--_mtable-extend-Prolog=args+ignoreWhiteSpace
--_mtable-regex-Prolog=args/\)//{tleave}
--_mtable-regex-Prolog=args/[a-z,]+//
--_mtable-regex-Prolog=args/([A-Z][A-Za-z]*)/\1/v/{scope=ref}
--_mtable-extend-Prolog=args+any

###
# impl
#
--_mtable-extend-Prolog=impl+comment
--_mtable-extend-Prolog=impl+ignoreWhiteSpace

# If . is found, push back it. So the upper table (main) can handle it.
--_mtable-regex-Prolog=impl/\.//{tleave}{_advanceTo=0start}
--_mtable-extend-Prolog=impl+any

$ ./ctags --sort=no --options=./optlib/prolog.ctags --languages=Prolog -o - input.pl 
mother_child	input.pl	/^mother_child(trude, sally).$/;"	p	language:Prolog
father_child	input.pl	/^father_child(tom, sally).$/;"	p	language:Prolog
father_child	input.pl	/^father_child(tom, erica).$/;"	p	language:Prolog
father_child	input.pl	/^father_child(mike, tom).$/;"	p	language:Prolog
sibling	input.pl	/^sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).$/;"	p	language:Prolog
X	input.pl	/^sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).$/;"	v	language:Prolog	predicate:sibling
Y	input.pl	/^sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).$/;"	v	language:Prolog	predicate:sibling
parent_child	input.pl	/^parent_child(X, Y) :- father_child(X, Y).$/;"	p	language:Prolog
X	input.pl	/^parent_child(X, Y) :- father_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
Y	input.pl	/^parent_child(X, Y) :- father_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
parent_child	input.pl	/^parent_child(X, Y) :- mother_child(X, Y).$/;"	p	language:Prolog
X	input.pl	/^parent_child(X, Y) :- mother_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
Y	input.pl	/^parent_child(X, Y) :- mother_child(X, Y).$/;"	v	language:Prolog	predicate:parent_child
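The effect of these tables on this particular input can be approximated in a few lines of Python. This is a simplified sketch of what gets extracted (head predicates, plus head variables scoped to their predicate), not a port of the multi-table mechanism; the function name `extract_tags` and the tuple layout are assumptions for illustration.

```python
import re

SOURCE = """/* input.pl */
mother_child(trude, sally).

father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).

sibling(X, Y)      :- parent_child(Z, X), parent_child(Z, Y).

parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).

% dummy0()
/* dummy1() */
"""


def extract_tags(source):
    """Return (name, kind, scope) tuples for heads and their variables."""
    # Drop comments first; the optlib uses dedicated comment tables for this.
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)
    source = re.sub(r"%[^\n]*", "", source)
    tags = []
    # One clause: head name, argument list, optional body, terminating '.'.
    clause = re.compile(r"([a-zA-Z_][a-zA-Z_0-9]*)\(([^)]*)\)[^.]*\.")
    for m in clause.finditer(source):
        name, args = m.group(1), m.group(2)
        tags.append((name, "p", None))
        # Variables start with an uppercase letter; body variables are skipped.
        for var in re.findall(r"\b[A-Z][A-Za-z]*\b", args):
            tags.append((var, "v", name))
    return tags


for name, kind, scope in extract_tags(SOURCE):
    print(name, kind, f"predicate:{scope}" if scope else "")
```

As in the output above, Z is not reported because it appears only in the body of the sibling clause.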

masatake avatar Aug 25 '20 06:08 masatake

I guess the arity field and the signature field should be filled in.

masatake avatar Aug 25 '20 06:08 masatake

Feel free to reopen this.

masatake avatar Sep 15 '20 15:09 masatake

Hey, not sure it's still relevant but for years I've used ptags (a ctags-like compatible tool that works for prolog) and it created files that work like a charm with xemacs and nedit.

https://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/lang/prolog/util/ptags/

dseddah avatar May 13 '25 08:05 dseddah

Thank you for the information. However, the license restricts how the code can be used in ctags:

/*
 * ptags - creates entries in a tags file for Prolog predicates
 * 
 * Usage: ptags [-w] [-l] [-a] [-p] file1 ... filen
 * 
 * This program code may be freely distributed provided
 * 
 *     a) it, or any part of it, is not sold for profit; and
------------------------------------^^^^^^^^^^^^^^^^^^^

masatake avatar May 13 '25 11:05 masatake

I wrote the author a mail to see if he'd be willing to change the license.

dseddah avatar May 13 '25 15:05 dseddah

Hi, Prof. Tweed replied that he's OK with changing the licence. What licence would be acceptable for you guys?

Also can you give me your mail so I'll include you in my answer?

dseddah avatar May 13 '25 18:05 dseddah

Thank you. My email address is [email protected].

A. GNU General Public License version 2 or (at your option) any later version, which is found in https://github.com/universal-ctags/ctags/blob/master/COPYING ,

B. Unlicense, which is found in https://unlicense.org/ , or

C. The MIT license, which is found in https://opensource.org/license/mit .

masatake avatar May 13 '25 19:05 masatake

BTW, will you help me test the parser? We already have a prototype. https://github.com/universal-ctags/ctags/issues/2628#issuecomment-679816894

masatake avatar May 13 '25 19:05 masatake

yeah, of course. I haven't touched prolog code in a while, but sure it could be fun :)

dseddah avatar May 13 '25 19:05 dseddah

@dseddah, could you try #4249?

I have not used the ptags code directly.


$ cat input.pl
:- module(mymod, [a/0, a/1, a/2]).
a.
a(X) :-
    X is 1.
a(X, Y) :- X is Y.
a('abc', 'efg') :- false.

/*  /*
*/
dont_extract.
*/

prolog:message("msg").

b(X) --> c(X).
$ ./ctags --sort=no --fields=+Kne -o - ./input.pl
mymod	./input.pl	/^:- module(mymod, [a\/0, a\/1, a\/2]).$/;"	module	line:1
a	./input.pl	/^a.$/;"	predicate	line:2	module:mymod	end:2	arity:0
a/0	./input.pl	/^a.$/;"	predicate	line:2	module:mymod	end:2	arity:0
a	./input.pl	/^a(X) :-$/;"	predicate	line:3	module:mymod	end:4	arity:1
a/1	./input.pl	/^a(X) :-$/;"	predicate	line:3	module:mymod	end:4	arity:1
a	./input.pl	/^a(X, Y) :- X is Y.$/;"	predicate	line:5	module:mymod	end:5	arity:2
a/2	./input.pl	/^a(X, Y) :- X is Y.$/;"	predicate	line:5	module:mymod	end:5	arity:2
a	./input.pl	/^a('abc', 'efg') :- false.$/;"	predicate	line:6	module:mymod	end:6	arity:2
a/2	./input.pl	/^a('abc', 'efg') :- false.$/;"	predicate	line:6	module:mymod	end:6	arity:2
message	./input.pl	/^prolog:message("msg").$/;"	predicate	line:13	module:prolog	end:13	arity:1
message/1	./input.pl	/^prolog:message("msg").$/;"	predicate	line:13	module:prolog	end:13	arity:1
b	./input.pl	/^b(X) --> c(X).$/;"	grammar	line:15	module:mymod	end:15	arity:1
b/1	./input.pl	/^b(X) --> c(X).$/;"	grammar	line:15	module:mymod	end:15	arity:1

The question is whether ctags should emit a separate tag for every clause of a predicate when the clauses share the same name and arity.

ptags made only one tag for such clauses. My parser makes a tag for each of them.
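The two behaviours can be compared with a small Python sketch. It assumes, purely for illustration, that the single tag kept is the first clause of each name/arity pair; the thread does not say which clause ptags keeps. The function name and the tuple layout are hypothetical.

```python
def one_tag_per_predicate(clauses):
    """Keep only the first clause for each (name, arity) pair."""
    first = {}
    for name, arity, line in clauses:
        # setdefault keeps the earliest line seen for this predicate
        first.setdefault((name, arity), line)
    return [(name, arity, line) for (name, arity), line in first.items()]


# (name, arity, line number) for the clauses of a/0, a/1 and a/2 in input.pl
CLAUSES = [("a", 0, 2), ("a", 1, 3), ("a", 2, 5), ("a", 2, 6)]

print(CLAUSES)                         # one tag per clause
print(one_tag_per_predicate(CLAUSES))  # one tag per name/arity
```

Keeping every clause lets an editor jump to each definition, while collapsing them gives a shorter tags file; which is preferable depends on how the tags are consumed.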

masatake avatar May 18 '25 09:05 masatake

... I also think I see what you're doing. I can build on that...

@TrentSe Any progress?

masatake avatar Sep 26 '25 04:09 masatake