--languages='C' somehow disables +p ?

Open mulle-nat opened this issue 7 months ago • 9 comments

This is easy to reproduce. As soon as I pedantically add --languages='C' the prototype is no longer found:

$ echo "int foo( void);" > a.h
$ ls -1 a.h | ctags -L - --kinds-C='f+p' --output-format=xref
foo              prototype     1 a.h              int foo( void);
$ ls -1 a.h | ctags -L - --languages='C' --kinds-C='f+p' --output-format=xref

Universal Ctags 5.9.0, Copyright (C) 2015 Universal Ctags Team
Universal Ctags is derived from Exuberant Ctags.
Exuberant Ctags 5.8, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Sep 3 2021, 18:12:18
  URL: https://ctags.io/
  Optional compiled features: +wildcards, +regex, +gnulib_regex, +iconv, +option-directory, +xpath, +json, +interactive, +sandbox, +yaml, +packcc, +optscript

Linux peschel 6.8.0-71-generic #71-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 16:52:38 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

mulle-nat avatar Aug 04 '25 12:08 mulle-nat

Look at the language field for an easy explanation:

$ echo "int foo( void);" > a.h
$ ls -1 a.h | ./ctags -L - --kinds-C='f+p' --fields=+l -o -
foo	a.h	/^int foo( void);$/;"	p	language:C++	typeref:typename:int
$ ls -1 a.h | ./ctags -L - --languages=C,C++ --kinds-C='f+p' --fields=+l -o -
foo	a.h	/^int foo( void);$/;"	p	language:C++	typeref:typename:int
$ ls -1 a.h | ./ctags -L - --languages=C --kinds-C='f+p' --fields=+l -o -
$ ls -1 a.h | ./ctags -L - --languages=C --langmap=C:+.h --kinds-C='f+p' --fields=+l -o -
foo	a.h	/^int foo( void);$/;"	p	language:C	typeref:typename:int

b4n avatar Aug 04 '25 20:08 b4n

Easy? 😆 It's rather obscure that .h headers are mapped to C++ and not to C. But thanks.

mulle-nat avatar Aug 04 '25 22:08 mulle-nat

I don't know what led to this in ctags, but it's a pretty common thing to do: C++ has the bad habit of using the .h extension too, and C headers almost always parse fine as C++, so it's usually safe.

If you don't like that, you can change the default langmap configuration.
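One way to make such a change permanent is an option file, sketched below. Universal Ctags reads *.ctags files from ~/.ctags.d; the file name c-headers.ctags and the helper install_c_header_map are made up for this example. The option itself is the --langmap=C:+.h used earlier in this thread. Since --langmap handles 1:1 maps, the expectation (an assumption, not verified here) is that .h then stops mapping to C++.

```sh
# install_c_header_map DIR
# Hypothetical helper: writes a one-line option file into DIR.
# The file name c-headers.ctags is arbitrary; only the .ctags suffix matters.
install_c_header_map() {
    mkdir -p "$1" &&
    printf '%s\n' '--langmap=C:+.h' > "$1/c-headers.ctags"
}

# Typical call (left commented out so that sourcing this file changes nothing):
# install_c_header_map "$HOME/.ctags.d"
```

Afterwards, ctags --list-maps=C should show whether *.h now appears next to *.c.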

b4n avatar Aug 05 '25 07:08 b4n

$ ctags --list-maps=C
C        *.c
$ ctags --list-maps=C++
C++      *.c++ *.cc *.cp *.cpp *.cxx *.h *.h++ *.hh *.hp *.hpp *.hxx *.inl *.C *.H *.CPP *.CXX
$ ctags --print-language /tmp/a.h 
/tmp/a.h: C++
$ ctags --languages=C --print-language /tmp/a.h
/tmp/a.h: NONE

If you disable all parsers other than the C parser with --languages=C, ctags cannot find a parser for a .h file: the only language mapped to *.h is C++, and it is disabled.

masatake avatar Aug 06 '25 05:08 masatake

> I don't know what led to this in ctags, but it's a pretty common thing to do, because C++ has the bad habit of using the .h extension, but C headers almost always work as C++, so it's usually safe.

I mean, I can see how to circumvent it now, by "just" using C++. But I don't understand why "C++ has the bad habit of using .h", when .h has been the header extension of C since forever. Could it be that ctags has a 1:1 mapping of file extension to language, and that this is the reason .h is not part of C?

mulle-nat avatar Aug 06 '25 09:08 mulle-nat

Exuberant ctags, the ancestor of Universal ctags, uses a 1:1 mapping. I extended it to N:1 in Universal ctags. For example, both the Matlab parser and the ObjectiveC parser have .m as their extension. Universal ctags has a selector that detects whether a given .m file is Matlab or ObjectiveC: https://github.com/universal-ctags/ctags/blob/475a7bc246c22cb6ef1c6659103977a9de544667/main/selectors.c#L139

However, I have not done the same for the C and C++ parsers for .h files. The two languages are too similar for me to write a good selector. I also want to avoid the overhead of pre-running selectors on the C parser. I also consider CLI compatibility between both implementations of ctags.

masatake avatar Aug 06 '25 23:08 masatake

@masatake I don't know if it's easy, but for the case here a "solution" could be to map .h to C with a lower precedence than to C++, so that C gets a chance to match it when C++ is not enabled.

However, it introduces a potentially unnoticed difference in how .h files are parsed when only C is enabled; as the OP shows, that is not intuitive.

b4n avatar Aug 07 '25 07:08 b4n

@b4n, your idea makes sense. I've thought about introducing precedence. However, it never occurred to me to apply precedence to this issue.

We have the option --map-<LANG>=[+|-]<extension>|<pattern>. I wonder how to extend this option to support precedence.

If we assume the range of precedence is [0 ~ 99], we can extend the option to --map-<LANG>=[+|-][<precedence>:]<extension>|<pattern>. If a user omits <precedence>, ctags assigns 50 as the default.

--map-C++=+50:.h
--map-C=+51:.h

The smaller number has higher precedence.

I prefer [-50 ~ 50] to [0 ~ 99]. However, the option format I described above doesn't fit [-50 ~ 50].
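The lookup this proposal implies can be modelled in a few lines of POSIX shell. This is only a sketch, not ctags code: the MAPS table and the pick_language function are hypothetical names, and the 50/51 values are the ones from the --map-C++ / --map-C example above.

```sh
# Hypothetical map table, one entry per line: "<precedence> <language> <extension>".
# The smaller number has higher precedence, as in the proposal.
MAPS='50 C++ .h
51 C .h
50 C .c'

# pick_language EXT ENABLED_LANG...
# Prints the enabled language with the best precedence for EXT, or NONE.
# (Plain sh has no local variables, so the names below are global.)
pick_language() {
    ext=$1; shift
    enabled=" $* "
    result=$(printf '%s\n' "$MAPS" | sort -n -k1,1 |
        while read -r prec lang map_ext; do
            [ "$map_ext" = "$ext" ] || continue
            case "$enabled" in
                # First enabled language in precedence order wins.
                *" $lang "*) echo "$lang"; break ;;
            esac
        done)
    echo "${result:-NONE}"
}

pick_language .h C C++    # prints C++
pick_language .h C        # prints C
pick_language .h Python   # prints NONE
```

With both parsers enabled, .h still resolves to C++, so the current default is preserved; C only picks up .h when C++ is disabled, which is the case reported in this issue.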

masatake avatar Aug 11 '25 16:08 masatake

The "precedence" idea looks better than #4270.

masatake avatar Oct 22 '25 02:10 masatake