ctags
Add parser classes, xml, sexp, ini, and toml
I will leave Tokyo tomorrow. So I will give you some attractions:)
Like regex and xcmd, having an xml parser class will be useful. We can cover svg, html, xhtml, ant, docbook, and so on. XPath can be used to specify the interesting elements.
I found the following code in a public header file of libxml2.
/**
* XML_GET_LINE:
*
* Macro to extract the line number of an element node.
*/
#define XML_GET_LINE(n) \
(xmlGetLineNo(n))
For the lisp family, an S expression parser class will be useful.
I think the current lisp-related parsers are not very useful. Lisp programmers generally introduce an application's own define-something forms
using define-macro/defmacro. Definitions made with such define-something forms
should be captured as tags.
The following are the def* symbols in the emacs I'm using.
def-edebug-spec defadvice
defalias default-boundp
default-file-modes default-font-height
default-indent-new-line default-line-height
default-toplevel-value default-value
defconst defcustom
defcustom-c-stylevar defface
defgroup defimage
define-abbrev define-abbrev-table
define-abbrevs define-alternatives
define-auto-insert define-button-type
define-category define-ccl-program
define-char-code-property define-charset
define-charset-alias define-charset-internal
define-coding-system define-coding-system-alias
define-coding-system-internal define-compilation-mode
define-derived-mode define-error
define-fringe-bitmap define-generic-mode
define-global-abbrev define-global-minor-mode
define-globalized-minor-mode define-hash-table-test
define-ibuffer-column define-ibuffer-filter
define-ibuffer-op define-ibuffer-sorter
define-key define-key-after
define-mail-abbrev define-mail-alias
define-mail-user-agent define-minor-mode
define-mode-abbrev define-obsolete-face-alias
define-obsolete-function-alias define-obsolete-variable-alias
define-prefix-command define-skeleton
define-translation-hash-table define-translation-table
define-widget define-widget-keywords
defined-colors defining-kbd-macro
defmacro defmath
defsubst deftheme
defun defvar
defvar-local defvaralias
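To make the idea of capturing user-defined definers concrete, here is a minimal sketch in C. It is not code from ctags: the function names `readSymbol` and `sexpDefinition` are hypothetical, and the "starts with def" rule is only an assumed heuristic. It handles one top-level form given as a string and ignores comments, strings, and nested forms.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the next symbol at *p into buf (truncating if needed) and advance *p.
 * Returns 1 if at least one character was read, 0 otherwise. */
static int readSymbol(const char **p, char *buf, size_t size)
{
    size_t n = 0;
    while (isspace((unsigned char)**p))
        (*p)++;
    while (**p && !isspace((unsigned char)**p) && **p != '(' && **p != ')') {
        if (n + 1 < size)
            buf[n++] = **p;
        (*p)++;
    }
    buf[n] = '\0';
    return n > 0;
}

/* Hypothetical helper: if `form` looks like "(defSOMETHING name ...",
 * store the definer in `kind` and the defined name in `name`.
 * Returns 1 on a match, 0 otherwise. */
int sexpDefinition(const char *form, char *kind, size_t ksize,
                   char *name, size_t nsize)
{
    const char *p = form;
    while (isspace((unsigned char)*p))
        p++;
    if (*p != '(')
        return 0;
    p++;
    if (!readSymbol(&p, kind, ksize))
        return 0;
    /* Assumed heuristic: any head symbol starting with "def" is a definer. */
    if (strncmp(kind, "def", 3) != 0)
        return 0;
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\'')  /* skip a quote, as in (defalias 'foo ...) */
        p++;
    return readSymbol(&p, name, nsize);
}
```

Note that the prefix heuristic would also accept symbols from the list above that are not definers, such as default-value or defined-colors, if they appeared at the head of a form. A real parser class would more likely let the user register which definers to track.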
Realizing the optlib concept is one of my primary motivations for working on ctags. However, I now recognize that the regex syntax I know is not very portable: the same pattern is just a "syntax error" on Mac OS X, where regex support is very limited. If we introduce a new parser class, pcre, users can write parsers with a more powerful syntax in a portable way. I have no intention of making the current regex parser obsolete; I just want to introduce a newer one.
Do you have more ideas about parser classes? The following code in parse.h is the starting point.
typedef enum {
METHOD_NOT_CRAFTED = 1 << 0,
METHOD_REGEX = 1 << 1,
METHOD_XCMD = 1 << 2,
METHOD_XCMD_AVAILABLE = 1 << 3,
} parsingMethod;
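As a rough illustration of how new parser classes could plug into this bit-flag enum, the sketch below adds three members. `METHOD_XPATH`, `METHOD_SEXP`, `METHOD_PCRE`, and the helper `usesMethod` are hypothetical names chosen for this example; they are not taken from parse.h.

```c
typedef enum {
    METHOD_NOT_CRAFTED    = 1 << 0,
    METHOD_REGEX          = 1 << 1,
    METHOD_XCMD           = 1 << 2,
    METHOD_XCMD_AVAILABLE = 1 << 3,
    /* Hypothetical additions, one bit per proposed parser class. */
    METHOD_XPATH          = 1 << 4,
    METHOD_SEXP           = 1 << 5,
    METHOD_PCRE           = 1 << 6,
} parsingMethod;

/* Hypothetical helper: returns 1 if the bit for method `m` is set
 * in the bitmask `methods`, 0 otherwise. */
int usesMethod(unsigned int methods, parsingMethod m)
{
    return (methods & (unsigned int)m) != 0;
}
```

Because each member is a distinct bit, a language can combine several classes, for example `METHOD_REGEX | METHOD_XPATH`, and each one can be tested independently.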
Happy hacking.
https://github.com/arduino/ctags/blob/master/gir.c
This is a very impressive parser. We should import it and then generalize it.
$ ./ctags -o - --langdef=maven \
--xpath-maven="a,artifactId{}///*[local-name()='project' and namespace-uri()='http://maven.apache.org/POM/4.0.0']/*[local-name()='artifactId' and namespace-uri()='http://maven.apache.org/POM/4.0.0']/text()" pom.xml
build-tools-root pom.xml /^ <artifactId>build-tools-root</artifactId>$/;" a
Hard-coded version now works!!!
% ./ctags -x pom.xml
build-tools-root artifactId 9 pom.xml <artifactId>build-tools-root</artifactId>
Hey, @p-montanus, I need libxml2. What should we do, in a gentle way? What I did is:
--- a/Makefile.in
+++ b/Makefile.in
@@ -68,14 +68,15 @@ COVERAGE_CFLAGS=--coverage
COVERAGE_LDFLAGS=--coverage
endif
-ALL_CFLAGS = $(CFLAGS) --std=gnu99 -Wall $(COVERAGE_CFLAGS)
+ALL_CFLAGS = $(CFLAGS) --std=gnu99 -Wall $(COVERAGE_CFLAGS) `pkg-config --cflags libxml-2.0`
+
DEBUG_CPPFLAGS ?= -DDEBUG
ALL_CPPFLAGS = $(CPPFLAGS) \
$(DEBUG_CPPFLAGS) \
-DDATADIR=\"$(pkgdatadir)\" \
-DPKGCONFDIR=\"$(pkgsysconfdir)\" \
- -DPKGLIBEXECDIR=\"$(pkglibexecdir)\"
+ -DPKGLIBEXECDIR=\"$(pkglibexecdir)\"
include $(srcdir)/source.mak
@@ -173,7 +174,7 @@ V_CC_1 =
all: $(CTAGS_EXEC) $(READ_LIB) $(READ_CMD)
$(CTAGS_EXEC): $(OBJECTS)
- $(V_CC) $(CC) $(LDFLAGS) -o $@ $(OBJECTS) $(LIBS)
+ $(V_CC) $(CC) $(LDFLAGS) -o $@ $(OBJECTS) $(LIBS) `pkg-config --libs libxml-2.0`
$(READ_CMD): readtags.c readtags.h
$(V_CC) $(CC) -DREADTAGS_MAIN -I. -I$(srcdir) -I$(srcdir)/main $(DEFS) $(ALL_CPPFLAGS) $(ALL_CFLAGS) $(LDFLAGS) -o $@ $(srcdir)/readtags.c
> Hey, @p-montanus, I need libxml2. What should we do, in a gentle way?

Luke, use ~~PKG_CONFIG_MODULES~~ `PKG_CHECK_MODULES` in configure.ac, use `@*_CFLAGS@` and `@*_LIBS@` in Makefile.in.
PKG_CHECK_MODULES([LIBXML2], [libxml-2.0], [: if-found], [: if-not-found])
LIBXML2_CFLAGS = @LIBXML2_CFLAGS@
LIBXML2_LIBS = @LIBXML2_LIBS@
ALL_CFLAGS += $(LIBXML2_CFLAGS)
LIBS += $(LIBXML2_LIBS)
Great. After merging your #592 and #601, I will make a PR. Instead of targeting maven, I will rewrite the ant parser with this new technology.
@ffes, @k-takata, and @cweagans, is libxml2 available on the platforms you maintain? I found I can implement an XML-based parser easily with libxml2, and I would like to use it in ctags. I would like to hear your comments about using libxml2.
(@masatake You misspelled my name. I have fixed it.)
I confirmed that MSYS2 has libxml2 packages (`mingw-w64-i686-libxml2` and `mingw-w64-x86_64-libxml2`), so it would be easy to use libxml2 on MSYS2.
But I'm not sure we can use it on MSVC. (Maybe we can, but not so easy I think.)
> (@masatake You misspelled my name. I have fixed it.)
I'm very sorry.
> I confirmed that MSYS2 has libxml2 packages (`mingw-w64-i686-libxml2` and `mingw-w64-x86_64-libxml2`), so it would be easy to use libxml2 on MSYS2. But I'm not sure we can use it on MSVC. (Maybe we can, but not so easy I think.)
Thank you for the comment.
Instead of reworking ant.c, it will be better to create main/lxpath.c, so I can put all libxml2-related ifdef/endif into one file.
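A minimal sketch of what such a single-file guard could look like follows. It is an illustration, not the actual main/lxpath.c: the macro name `HAVE_LIBXML` and the functions `lxpathAvailable` and `lxpathCountMatches` are assumptions made for this example. The libxml2 calls (`xmlReadFile`, `xmlXPathNewContext`, `xmlXPathEvalExpression`, and the matching free functions) are only compiled when the macro is defined, so the rest of the program never needs its own ifdef.

```c
#include <stddef.h>

#ifdef HAVE_LIBXML  /* assumed macro, e.g. defined by configure */
#include <libxml/parser.h>
#include <libxml/xpath.h>
#endif

/* Returns 1 when XPath support was compiled in, 0 otherwise. */
int lxpathAvailable(void)
{
#ifdef HAVE_LIBXML
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical entry point: counts the nodes matching `xpath` in `file`.
 * Returns -1 on error or when built without libxml2. */
int lxpathCountMatches(const char *file, const char *xpath)
{
#ifdef HAVE_LIBXML
    int count = -1;
    xmlDocPtr doc = xmlReadFile(file, NULL, 0);
    if (doc == NULL)
        return -1;
    xmlXPathContextPtr ctx = xmlXPathNewContext(doc);
    if (ctx != NULL) {
        xmlXPathObjectPtr res = xmlXPathEvalExpression(BAD_CAST xpath, ctx);
        if (res != NULL) {
            /* The macro yields 0 when the node set is NULL. */
            count = xmlXPathNodeSetGetLength(res->nodesetval);
            xmlXPathFreeObject(res);
        }
        xmlXPathFreeContext(ctx);
    }
    xmlFreeDoc(doc);
    return count;
#else
    (void)file;   /* unused without libxml2 */
    (void)xpath;
    return -1;
#endif
}
```

Callers such as an ant or maven parser could check `lxpathAvailable()` once and fall back to their existing behavior when it returns 0.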
> Luke, use `PKG_CONFIG_MODULES` in configure.ac, use `@*_CFLAGS@` and `@*_LIBS@` in Makefile.in.
Spelled `PKG_CHECK_MODULES` it is, Obi-Wan ;)
> Spelled `PKG_CHECK_MODULES` it is, Obi-Wan ;)
Spelling is fixed, peacefully. May the Force be with you.
Hi folks, what is the status of this one? Is some help needed? I came here while investigating how to generate good tags for Clojure.
The meta sexp parser has two aspects.
- It can be used as a kind of template for parsers like elisp, cl, scheme, and Clojure.
- It helps to capture user-defined defX in those parsers. In other words, the sexp meta parser helps a ctags user write a subparser for parsers like elisp, cl, scheme, and Clojure. About the subparser concept, see http://docs.ctags.io/en/latest/running-multi-parsers.html?highlight=subparser .
I think what I want is understandable to lisp hackers. The idea is very attractive to me. However, I don't have time to work on it. If you are interested in the lisp family, you can try to implement it.
If you are just interested in Clojure, you can implement it with a crazy mtable meta parser. See http://docs.ctags.io/en/latest/optlib.html?highlight=mtable#byte-oriented-pattern-matching-with-multiple-regex-tables . It is not documented well. See also https://github.com/universal-ctags/ctags/issues/1620 .
Ok thanks! This information is very valuable, I will see what I can do!
class (meta parser) | C level | Optlib level | note |
---|---|---|---|
regex | yes | YES | |
libxml(xpath) | yes | no | See #3897 |
libyaml | yes | no | Not so useful. libypath is needed. |
S expression | no | no | This should cover clojure, elisp, lisp, scheme. |
json | no | no | We have json parser. |
iniconf | no | no | We have iniconf parser. |
toml | no | no | |
packci | no | no | interpreter version of packcc |