raven-python
raven.utils.__init__.get_versions extremely slow (on aws)
We ran into a situation where this call takes 12-15 seconds in a Django app on the first access of a log function after startup.
This means the first hits to log.error, log.warn, or anything else that has the Sentry handler configured are very slow, even though the default HTTP handler is async.
I didn't dive any deeper, and adding 'include_versions': False to RAVEN_CONFIG disabled it. Still, it might be wise to have version collection disabled by default, to add a warning somewhere, or to initialize it in the async part of the request logging if possible; that might make it a bit better.
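The last suggestion (moving the slow work off the first log call) could look roughly like the sketch below. BackgroundValue and its compute argument are hypothetical names made up for illustration; nothing like this is part of raven as far as this thread shows.

```python
import threading


class BackgroundValue(object):
    """Compute a value once in a daemon thread (hypothetical helper)."""

    def __init__(self, compute):
        self._result = None
        self._done = threading.Event()
        thread = threading.Thread(target=self._run, args=(compute,))
        thread.daemon = True  # do not keep the process alive for this
        thread.start()

    def _run(self, compute):
        try:
            self._result = compute()
        finally:
            # Always signal completion, even if compute() raised
            self._done.set()

    def get(self, timeout=None):
        # Waits until the value is ready or the timeout expires;
        # returns None if the computation has not finished or failed
        self._done.wait(timeout)
        return self._result
```

With something like versions = BackgroundValue(lambda: get_versions(settings.INSTALLED_APPS)) created at startup, the first log call would only wait for whatever part of the lookup is still outstanding. Whether this helps in practice depends on how much of the lookup time is I/O rather than CPU.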
I believe I'm experiencing the same issue. I have a large Django app with tens of apps. I ran a line profile on get_versions with my INSTALLED_APPS as the argument and found this (get_version_from_app was where all the time was being taken):
Total time: 32.966 s
File: /home/pete/.venvs/project/local/lib/python2.7/site-packages/raven/utils/__init__.py
Function: get_version_from_app at line 61

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    61                                           def get_version_from_app(module_name, app):
    62       142          137      1.0      0.0      version = None
    63
    64                                               # Try to pull version from pkg_resource first
    65                                               # as it is able to detect version tagged with egg_info -b
    66       142          111      0.8      0.0      if pkg_resources is not None:
    67                                                   # pull version from pkg_resources if distro exists
    68       142           63      0.4      0.0          try:
    69       142     32963944 232140.5    100.0              return pkg_resources.get_distribution(module_name).version
    70       136          284      2.1      0.0          except Exception:
    71       136           72      0.5      0.0              pass
    72
    73       136          310      2.3      0.0      if hasattr(app, 'get_version'):
    74         1            0      0.0      0.0          version = app.get_version
    75       135          146      1.1      0.0      elif hasattr(app, '__version__'):
    76         5            4      0.8      0.0          version = app.__version__
    77       130          154      1.2      0.0      elif hasattr(app, 'VERSION'):
    78         1            0      0.0      0.0          version = app.VERSION
    79       129          115      0.9      0.0      elif hasattr(app, 'version'):
    80                                                   version = app.version
    81
    82       136          129      0.9      0.0      if callable(version):
    83         1           10     10.0      0.0          version = version()
    84
    85       136          278      2.0      0.0      if not isinstance(version, (string_types, list, tuple)):
    86       129           65      0.5      0.0          version = None
    87
    88       136           68      0.5      0.0      if version is None:
    89       129           48      0.4      0.0          return None
    90
    91         7           10      1.4      0.0      if isinstance(version, (list, tuple)):
    92                                                   version = '.'.join(map(str, version))
    93
    94         7            8      1.1      0.0      return str(version)
I haven't confirmed that this is the cause of my problem in production, but at the moment I believe it is. From a quick Google, it seems that pkg_resources.get_distribution is known to be slow.
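To check which names are responsible without a full line profile, a small timing loop may be enough. This is only a sketch: time_lookups is a hypothetical helper, and the lookup callable is passed in so that pkg_resources.get_distribution (or anything else) can be measured.

```python
from timeit import default_timer


def time_lookups(names, lookup):
    """Return (name, seconds) pairs for lookup(name), slowest first."""
    timings = []
    for name in names:
        start = default_timer()
        try:
            lookup(name)
        except Exception:
            # A failed lookup still costs time, so it is measured too
            pass
        timings.append((name, default_timer() - start))
    return sorted(timings, key=lambda pair: pair[1], reverse=True)
```

From a Django shell, time_lookups(settings.INSTALLED_APPS, pkg_resources.get_distribution) would list the apps ordered by how long each lookup took, including the ones that end in an exception.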
Not really a solution to the problem, but if you are okay with dropping versions from your Sentry logs, adding 'include_versions': False to RAVEN_CONFIG seems to do the trick for me on Django.
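For reference, the workaround is one extra key in the Django settings. A minimal fragment follows; the DSN is a made-up placeholder, and only the 'include_versions' key comes from this thread.

```python
# settings.py (fragment)
RAVEN_CONFIG = {
    # Hypothetical placeholder DSN; substitute your project's own value
    'dsn': 'https://public:secret@sentry.example.com/1',
    # Skip collecting package versions for INSTALLED_APPS
    'include_versions': False,
}
```

The trade-off is that events sent to Sentry no longer carry the installed package versions.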
Affects us as well; 'include_versions': False makes startup time go from 28 to 5 seconds.
I haven't profiled this myself, but reading through the comments and code in get_version_from_app, it seems like pkg_resources.get_distribution is the slow part, and it's getting called on everything in INSTALLED_APPS. get_version_from_app has several cheap shortcuts that would avoid pkg_resources.get_distribution, but they are reached only after get_distribution has already been tried and failed, or when pkg_resources can't be imported.
It seems like https://github.com/getsentry/raven-python/compare/master...AlexRiina:master might help.
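For illustration only, and not taken from the linked branch (which may do something different), one possible reordering is to try the cheap attribute checks first and fall back to the distribution lookup last. The function name and the lookup parameter are hypothetical, and str stands in for the string_types check in the original.

```python
def get_version_cheap_first(module_name, app, lookup=None):
    """Sketch: attribute checks first, slow distribution lookup last."""
    version = None
    # Same attribute order as the original elif chain
    for attr in ('get_version', '__version__', 'VERSION', 'version'):
        if hasattr(app, attr):
            version = getattr(app, attr)
            break

    if callable(version):
        version = version()

    if isinstance(version, (list, tuple)):
        version = '.'.join(map(str, version))

    if isinstance(version, str):
        return version

    # Nothing usable on the module: only now pay for the slow lookup,
    # e.g. lookup=pkg_resources.get_distribution
    if lookup is not None:
        try:
            return lookup(module_name).version
        except Exception:
            pass
    return None
```

Note that this changes behaviour: the code comment in the profile above says pkg_resources is tried first because it can detect versions tagged with egg_info -b, so putting attributes first would report the module attribute instead whenever both exist.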