CTags
Large tag files cause error when sorting
I'm on Mac OS X Lion. I've replaced /usr/bin/ctags with a proper ctags implementation. I'm new to Sublime Text and to CTags, and I have not yet gotten CTags to work properly.
I'd love to get this working. Please help.
When attempting to build ctags, I'm getting the following error on every project I have (regardless of source tree - username removed):
Exception in thread Thread-5:
Traceback (most recent call last):
File "X/threading.py", line 639, in _bootstrap_inner
File "X/threading.py", line 596, in run
File "ctagsplugin in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 103, in run
File "ctagsplugin in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 682, in build_ctags
File "ctags in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 178, in build_ctags
File "ctags in /Users/[user]/Library/Application Support/Sublime Text 3/Installed Packages/CTags.sublime-package", line 157, in resort_ctags
File "X/encodings/ascii.py", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 1003: ordinal not in range(128)
Just FYI, this is what I get when I attempt things on Linux:
Re/Building CTags for /home/[user]/development/comm2_boost_testing/src/Common/.tags: Please be patient
Traceback (most recent call last):
File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.NavigateToDefinition object at 0x7f1ff86d0c10>)
Traceback (most recent call last):
File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.NavigateToDefinition object at 0x7f1ff86d0c10>)
Traceback (most recent call last):
File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Traceback (most recent call last):
File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Traceback (most recent call last):
File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Traceback (most recent call last):
File "/home/[user]/Downloads/sublime_text_3/sublime_plugin.py", line 445, in is_enabled_
raise ValueError("is_enabled must return a bool", self)
ValueError: ('is_enabled must return a bool', <CTags.ctagsplugin.ShowSymbols object at 0x7f1ff86d0c50>)
Exception in thread Thread-3:
Traceback (most recent call last):
File "X/threading.py", line 639, in _bootstrap_inner
File "X/threading.py", line 596, in run
File "ctagsplugin in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 103, in run
File "ctagsplugin in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 682, in build_ctags
File "ctags in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 178, in build_ctags
File "ctags in /home/[user]/.config/sublime-text-3/Installed Packages/CTags.sublime-package", line 157, in resort_ctags
File "X/codecs.py", line 300, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 6202: invalid start byte
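For anyone reading the two tracebacks above: both end in a decode step that rejects byte 0x96. The snippet below is only a minimal sketch of that failure in isolation; the sample bytes are made up and it does not establish which entry in the .tags file contains the byte. One common source of a lone 0x96 is text saved as Windows-1252, where it encodes an en dash, but whether that applies here is an assumption.

```python
# Made-up sample: byte 0x96 is not valid ASCII and cannot start a UTF-8
# sequence, so both codecs named in the tracebacks reject it.
raw = b"foo \x96 bar"


def try_decode(data, encoding):
    """Return the decoded text, or the codec's error reason if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: %s" % exc.reason


ascii_result = try_decode(raw, "ascii")
utf8_result = try_decode(raw, "utf-8")

# A lenient decode substitutes U+FFFD for the undecodable byte instead of raising.
lenient = raw.decode("utf-8", errors="replace")

# Under Windows-1252 the same byte decodes to an en dash (U+2013).
as_cp1252 = raw.decode("cp1252")
```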
Well, after some investigating, it appears there is a memory issue or a buffer getting overloaded when the .tags file is large. In my case, for my source trees, the .tags files are larger than 25MB.
When the `resort_tags` function runs against this very large file, it runs into the error. There are no issues with the files except that they are large.
If instead of calling the `resort_tags` function, I create a subprocess that runs a new `resort_tags.py` file (which runs the `resort_tags` function), everything is fine.
Note: I manually installed CTags into the Packages folder. The folder is named CTags-Master as that's what was in the zip file I downloaded from this site.
As a quick and dirty workaround, in `ctags.py` in `build_ctags()`, instead of the call to `resort_tags`, I did the following.
_WARNING: I AM NOT A PYTHON PROGRAMMER IN ANY SENSE. I'M A C/C++ PROGRAMMER. REWRITE TO PYTHON STANDARD PROGRAMMING PRACTICES._
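# Path to the standalone resort_tags.py script inside the manually installed package folder.
resort_path = sublime.packages_path() + '/CTags-master/' + 'resort_tags.py'
resort_path = '"%s"' % resort_path
cmd2 = 'python ' + resort_path + ' ./.tags'
# Run the re-sort in a separate Python process instead of calling resort_tags directly.
p2 = subprocess.Popen(cmd2, cwd=dirname(tag_file), shell=True, env=env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
# communicate() drains the pipe while waiting; wait() alone can block if the child fills the pipe buffer.
output2, _ = p2.communicate()
if p2.returncode:
    raise EnvironmentError((cmd2, p2.returncode, output2))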
I don't know if this is a Python issue where the interpreter gets a hiccup or what. But this occurs regularly with large .tags files of varying sizes, even when the source trees contain unrelated code.
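Since the workaround above asks for a more idiomatic rewrite, here is one possible sketch of the same idea as a standalone function. The name `resort_in_subprocess` and its parameters are hypothetical and are not part of the CTags plugin. It passes the command as a list so no shell quoting is needed. Note that inside Sublime Text's embedded interpreter `sys.executable` may not point to a standalone Python, so an explicit interpreter (such as `'python'`, as in the original) would probably have to be passed there.

```python
import os
import subprocess
import sys


def resort_in_subprocess(script_path, tag_file, python=sys.executable):
    """Run `script_path` on `tag_file` in a separate Python process.

    Hypothetical helper, not part of the plugin. Raises EnvironmentError
    with the command, exit code and captured output if the child fails.
    """
    # Pass only the file name and run from the file's directory,
    # mirroring the `cwd=dirname(tag_file)` and `./.tags` of the workaround.
    cmd = [python, script_path, os.path.basename(tag_file)]
    proc = subprocess.Popen(
        cmd,
        cwd=os.path.dirname(os.path.abspath(tag_file)),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # communicate() reads the child's output while waiting for it to exit.
    output, _ = proc.communicate()
    if proc.returncode:
        raise EnvironmentError((cmd, proc.returncode, output))
    return output
```

This still assumes a Python interpreter is available on the system, which is the same limitation the original workaround has.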
+1 I have the same problem in Mac OS X Mountain Lion
@rednecknguyen @astyagun I'm looking into this issue. Would either of ye happen to have a sample `.tags` file that I could test against?
http://yadi.sk/d/wmFrx6U0BsZoR
You can also recreate it by generating tags for Rails framework for example.
- Clone Rails framework from Github repo (https://github.com/rails/rails)
- Run `bundle install --path bundle` to fetch all dependencies into subdirectory
- Generate tags in Sublime Text
So I've looked into this. The problem seems to be that the sorting takes place in memory: the built-in Python interpreter in ST must have some enforced memory limit that this hits (which would explain why the solution @rednecknguyen proposed works, since it spawns a new Python process outside of ST).
@rednecknguyen's solution (while good) isn't perfect though - it assumes that Python is installed in the system and basically ups the memory ceiling - the same issue could occur with a larger file again. I propose two possible solutions:
- Offload to `sort` in unix.
  - Pros: This would be far faster than anything we could achieve in Python.
  - Cons: While Windows does provide a `sort` utility it's very basic and won't let you sort on tabbed columns (as found in tag files). Hence we'd need to provide an alternative here.
- Reimplement the sort algorithm to use external sorts for large files
  - Pros: Would work without any external requirements - it's pure Python after all.
  - Cons: External sorts are slow. Even if you used a hybrid "sometimes-internal-sometimes-external" solution, deciding when to use an external vs. internal sort would be tricky.
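To make the second option concrete, here is a rough sketch of a pure-Python external merge sort: sort fixed-size chunks in memory, spill each to a temporary file, then merge the sorted chunks with `heapq.merge`. It is not the plugin's implementation. The names `external_sort`, `tag_key` and `max_lines` are made up for illustration, and sorting on the text before the first tab is an assumption about the desired key. It works on bytes, so no decoding takes place, and it needs Python 3.5+ for the `key` argument of `heapq.merge`.

```python
import heapq
import itertools
import os
import tempfile


def tag_key(line):
    """Sort key (assumed): the bytes before the first tab, i.e. the tag name."""
    return line.split(b"\t", 1)[0]


def external_sort(path, max_lines=100000):
    """Sort the lines of `path` in place, holding at most `max_lines` in memory."""
    chunk_paths = []
    try:
        # Pass 1: read fixed-size chunks, sort each in memory, spill to disk.
        with open(path, "rb") as src:
            while True:
                chunk = [
                    line if line.endswith(b"\n") else line + b"\n"
                    for line in itertools.islice(src, max_lines)
                ]
                if not chunk:
                    break
                chunk.sort(key=tag_key)
                fd, chunk_path = tempfile.mkstemp(suffix=".chunk")
                with os.fdopen(fd, "wb") as out:
                    out.writelines(chunk)
                chunk_paths.append(chunk_path)

        # Pass 2: lazily merge the sorted chunk files back into the original.
        files = [open(p, "rb") for p in chunk_paths]
        try:
            with open(path, "wb") as dst:
                dst.writelines(heapq.merge(*files, key=tag_key))
        finally:
            for f in files:
                f.close()
    finally:
        for p in chunk_paths:
            os.remove(p)
```

The merge pass keeps one open file per chunk, so a very small `max_lines` on a very large file could run into the operating system's open-file limit. Choosing that threshold is the "tricky" part mentioned in the cons above.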
Opinions anyone?
What about reimplementing sort if running on windows?
http://stackoverflow.com/questions/1325581/how-do-i-check-if-im-running-on-windows-in-python
Yeah - I considered that alright (point 1). However, we'll still have the same issue (albeit only on Windows). It's a kind of half-way solution that will only fix things for some people and add to the maintenance overhead. Good idea though :)
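For reference, the platform check being discussed could look roughly like the sketch below: use the system `sort` where one with `-t`/`-k` support can be expected, and fall back to Python on Windows. The function names are hypothetical, and the in-memory fallback still has the original size problem, which is the "half-way" drawback described above. `LC_ALL=C` is set so that `sort` compares raw bytes.

```python
import os
import subprocess
import sys


def sort_in_python(path):
    """Fallback: in-memory sort on the first tab-separated field."""
    with open(path, "rb") as f:
        lines = f.read().splitlines()
    lines.sort(key=lambda line: line.split(b"\t", 1)[0])
    with open(path, "wb") as f:
        f.write(b"\n".join(lines) + (b"\n" if lines else b""))


def sort_with_unix_sort(path):
    """Sort `path` in place with the system `sort`, keyed on the first field."""
    env = dict(os.environ, LC_ALL="C")  # byte-wise comparison
    subprocess.check_call(
        ["sort", "-t", "\t", "-k", "1,1", "-o", path, path], env=env
    )


def sort_tag_file(path, platform=sys.platform):
    """Pick a sorter based on the platform; returns the name of the one used."""
    if platform.startswith("win"):
        sort_in_python(path)
        return "python"
    sort_with_unix_sort(path)
    return "sort"
```

The two sorters may order lines with identical tag names differently, since `sort` falls back to comparing whole lines while the Python version keeps their original order.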
@astyagun @rednecknguyen I've pushed some changes to a feature branch. Would either of ye mind checking out that branch and seeing if it fixes things? Ye can see the changes made in the commits, but essentially there are now three ways to sort files:
- classic in-memory sort (0)
- python-based external bucket sort (1)
- GNU `sort` (per @davividal's suggestion) (2)
These can be configured by setting the value of `sort` in settings, i.e. to enable the bucket sort:
{
"sort": 1
}
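As an illustration only, the sketch below shows how a `sort` setting might select between an in-memory sort and an on-disk bucket sort. It is not the code from the feature branch: the function names, the bucketing by first byte, whole-line byte ordering, and the default of 0 are all assumptions, a plain dict stands in for the Sublime Text settings object, and mode 2 (GNU `sort`) is left out.

```python
import os
import shutil
import tempfile


def in_memory_sort_file(path):
    """Mode 0 (assumed behaviour): read everything, sort, write back."""
    with open(path, "rb") as f:
        lines = [l if l.endswith(b"\n") else l + b"\n" for l in f]
    lines.sort()
    with open(path, "wb") as f:
        f.writelines(lines)


def bucket_sort_file(path):
    """Mode 1 (assumed behaviour): spill lines into on-disk buckets keyed by
    their first byte, then sort one bucket at a time in memory."""
    tmp_dir = tempfile.mkdtemp()
    buckets = {}  # first byte -> open bucket file
    try:
        with open(path, "rb") as src:
            for line in src:
                if not line.endswith(b"\n"):
                    line += b"\n"
                first = line[:1]
                if first not in buckets:
                    name = os.path.join(tmp_dir, "%d.bucket" % ord(first))
                    buckets[first] = open(name, "w+b")
                buckets[first].write(line)
        # Buckets are written back in key order, so the result is fully sorted.
        with open(path, "wb") as dst:
            for first in sorted(buckets):
                bucket = buckets[first]
                bucket.seek(0)
                dst.writelines(sorted(bucket))
    finally:
        for bucket in buckets.values():
            bucket.close()
        shutil.rmtree(tmp_dir)


SORTERS = {0: in_memory_sort_file, 1: bucket_sort_file}


def sort_by_setting(path, settings):
    """Dispatch on the `sort` value; `settings` is a plain dict stand-in."""
    SORTERS[settings.get("sort", 0)](path)
```

With this scheme memory use is bounded by the largest bucket rather than the whole file, so a tag file dominated by one leading character would still be sorted mostly in memory.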
I can't reproduce the problem anymore. Either it's fixed in the version from Package Control already or some change in my setup has fixed it.
Well that's good to hear. Hopefully it's the former. I'll wait to see if any other reports of the issue arise and, if not, I guess we can consider this issue closed.