ctrlp-py-matcher
Tag search in ctrlp-py-matcher
@FelikZ, thanks for creating this plugin. It really sped up the file/MRU search.
However, when I tried to search tags, it didn't really show any tags.
I added the following line to my .vimrc:
let g:ctrlp_match_func = { 'match': 'pymatcher#PyMatch' }
This led to fast fuzzy searching for files. However, when I tried to search tags using :CtrlPTag with a tag Database from one of my files, it gave no results.
Then I commented out let g:ctrlp_match_func = { 'match': 'pymatcher#PyMatch' } and ran :CtrlPTag with the tag Database from one of my files. This time it was slow, but it did find the tag.
Are there any additional settings I should use to enable tag search with this plugin?
Hi @alphaCTzo7G, I am not 100% sure whether this can be fixed by a configuration change, as I never used CtrlPTag with this plugin. I think it might require an update to support it. I have not used Vim for a long time now, so I cannot promise to implement this. Sorry!
@FelikZ, thanks for your help.
Actually, I found that :CtrlPTag does work for most repos with this plugin, and it vastly speeds up search.
For some reason, :CtrlPTag stops working if my repo has a certain set of characteristics:
- Python repo
- has a large env folder
- large code base
Otherwise it works: https://github.com/ctrlpvim/ctrlp.vim/issues/442#issuecomment-396689275
I will try to figure out why this is so.
In general :CtrlPTag works well, but for certain libraries it fails.
I was able to create a small test. It seems the problem happens only when certain libraries, such as jinja2, are in the virtual env, but it doesn't happen for all libraries.
I created a small test repo to reproduce the problem: https://github.com/alphaCTzo7G/test
@FelikZ, I figured out the root cause. It is because vim.eval crashes when a:items contains BOM fields.
It seems that @ludovicchabant also faced the same issue and had to modify https://github.com/ludovicchabant/vim-gutentags to handle the issue: https://ludovic.chabant.com/devblog/2017/02/25/aaa-gamedev-with-vim/
His modification of ctrlp-py-matcher is here: https://github.com/ludovicchabant/ctrlp-py-matcher/blob/2f6947480203b734b069e5d9f69ba440db6b4698/autoload/pymatcher.py#L22, but it doesn't solve the issue; it just tells you what happened.
It's not possible to get any tag search results, so I am wondering if you know of a replacement for vim.eval that can accept variables/strings which may contain BOM fields.
From this line, it seems that Vim is passing the items to pymatcher#PyMatch.
I wonder if you know of any other way to evaluate a variable in the Vim namespace and transfer it to the Python namespace, other than going through vim.eval?
@alphaCTzo7G
"if you know of a replacement of vim.eval"
None that I know of. The only thing that comes to mind: maybe it's possible to convert BOM to non-BOM before the eval takes place?
@FelikZ, actually I was wrong. It turns out that different files had different problems. _identifier.py had valid utf-8 characters, but ctags would take _identifier.py and cut certain strings incorrectly: https://github.com/universal-ctags/ctags/issues/1805
@b4n has rectified this issue in this PR: https://github.com/universal-ctags/ctags/pull/1807
Other files also break, for example misc.html. misc.html has an invalid utf-8 character, which ctags seems to insert as is. vim.eval seems to break because of the invalid utf-8 character.
Perhaps there should be a warning that, because of this limitation, users should use the PR from here: https://github.com/universal-ctags/ctags/pull/1807, otherwise vim.eval might break.
Exuberant Ctags, upon which universal-ctags is built, actually works fine. So if you use sudo apt-get install ctags, the ctags executable that is installed can handle this particular issue, but it may have other side effects or missing features. I will update this as I find out more.
The problem with misc.html could be avoided if there were a function in VimL that parses the string in a:items and removes any invalid utf-8 characters before it is passed to Python via vim.eval. Do you know if such a function exists?
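For illustration only, here is a minimal sketch of that cleanup step written in Python rather than VimL. It assumes the tag lines are available as raw bytes (for example, read directly from the tags file), which is not how the plugin receives a:items today. The helper name sanitize_items is hypothetical.

```python
def sanitize_items(raw_items):
    """Decode each byte string as UTF-8, replacing invalid bytes.

    Any byte sequence that is not valid UTF-8 becomes U+FFFD
    (the Unicode replacement character) instead of raising an error.
    """
    cleaned = []
    for raw in raw_items:
        cleaned.append(raw.decode("utf-8", errors="replace"))
    return cleaned


# Hypothetical tag lines: the second one contains a lone 0xE9 byte,
# which is valid windows-1252 but not valid UTF-8.
items = [
    b"Database\tmodels.py\t/^class Database:$/",
    b"caf\xe9\tmisc.html\t/^caf\xe9$/",
]
clean_items = sanitize_items(items)
```

The replaced characters no longer match the original source text, so a tag cleaned this way may still be searchable by name but its search pattern may not jump to the right line.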
@alphaCTzo7G misc.html is encoded in windows-1252, which obviously isn't UTF-8. For various reasons uctags doesn't translate any encodings (there are long topics about that on the uctags issue tracker, but the main reason is that it's incredibly hard to do right, and in some cases there is no good solution), and sadly one of the possible issues arising is an output tag file with mixed encodings when processing multiple files.
ectags never did anything with encodings either (actually, way less), and the difference you see is improper truncation in uctags (whereas ectags didn't truncate at all, and you can get this behavior in uctags with --pattern-length-limit=0), which is what I fixed in the linked PR. In the case of misc.html it's likely that uctags is simply outputting more tags than ectags did.
Anyway, be it ectags or uctags, they are likely to output non-UTF-8 tags if the source files are not already UTF-8. This is even becoming more and more prominent with languages allowing Unicode identifiers, and uctags parsers supporting those.
However, after all my saying that uctags doesn't handle encoding conversion, it actually partly does if you tell it to through the --input-encoding (and/or --input-encoding-<LANG>) and --output-encoding options. This support is not perfect yet (I haven't followed it recently, but it used to be limited), and it requires the caller to decide on encodings, but it might help in some circumstances.
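As a rough sketch of how a caller might pass those options, the Python helper below only builds the argument list and does not run ctags. The function name build_ctags_command is hypothetical, and the =ENCODING value syntax and the encoding names used here are assumptions; check ctags --help for what your build accepts.

```python
def build_ctags_command(source_dir, input_encoding,
                        output_encoding="UTF-8", per_language=None):
    """Build an argument list for a recursive uctags run with encoding options.

    per_language maps a ctags language name to the encoding to assume
    for files of that language (the --input-encoding-<LANG> option).
    """
    cmd = [
        "ctags",
        "-R",
        f"--input-encoding={input_encoding}",
        f"--output-encoding={output_encoding}",
    ]
    for lang, encoding in (per_language or {}).items():
        cmd.append(f"--input-encoding-{lang}={encoding}")
    cmd.append(source_dir)
    return cmd


# Hypothetical example: treat HTML files as windows-1252, everything else as UTF-8.
command = build_ctags_command("src", "UTF-8", per_language={"HTML": "CP1252"})
```

The resulting list could be handed to subprocess.run, but the caller still has to know (or guess) the source encodings up front.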
At any rate, you should probably find a way for invalid UTF-8 not to break your software, because it's unfortunately unlikely to be "fixed" anytime soon, and it's a fairly common thing (in most cases it'll be because the source was ISO 8859-something or Windows-something or alike, and on rare occasions the source will actually be utter garbage with mixed things and random bytes here and there; been there, seen that).
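One possible way to tolerate such input, sketched in Python: try strict UTF-8 first and fall back to a legacy encoding when that fails. The fallback to windows-1252 is an assumption based on the misc.html case above, not a general rule, and the helper name decode_tag_line is hypothetical.

```python
def decode_tag_line(raw, fallback="cp1252"):
    """Decode one tag line without ever raising on bad bytes.

    Valid UTF-8 is decoded as such. Anything else is decoded with the
    fallback encoding, and bytes the fallback does not define are
    replaced with U+FFFD.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")


utf8_line = decode_tag_line(b"caf\xc3\xa9")   # valid UTF-8 input
legacy_line = decode_tag_line(b"caf\xe9")     # windows-1252 input
```

This guesses per line, so a tags file with mixed encodings still decodes, but a wrong guess silently produces the wrong characters.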
Appreciate your advice. The try...except that @ludovicchabant implemented might be helpful, and I will try to see if vim.eval can be modified or if there is a function in Vim that replaces illegal characters before passing to vim.eval.
@alphaCTzo7G I ran into a similar situation while working on a specific project, and I finally found that it was because the fileencoding of the tags file was not utf-8 for some reason. You can type :set fileencoding to check. You just need to type :set fileencoding=utf-8 and save the file, then everything will be fine!
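To check a tags file outside of Vim, a small Python sketch like the following could list the lines that are not valid UTF-8. The function names and the file name tags are assumptions for illustration.

```python
def find_invalid_utf8_lines(lines):
    """Return the 1-based numbers of the byte lines that are not valid UTF-8."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


def check_tags_file(path="tags"):
    """Open a tags file in binary mode and report its invalid lines."""
    with open(path, "rb") as handle:
        return find_invalid_utf8_lines(handle)
```

An empty result from check_tags_file means every line decodes as UTF-8; any reported line numbers point at the tags most likely to be causing trouble.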
By the way, I created a fork since this repository is no longer updated, and I added some new features to it. You can give my fork a try.