AsyncioIntegration breaks tracing inside ThreadPoolExecutor
How do you use Sentry?
Sentry Saas (sentry.io)
Version
2.30.0
Steps to Reproduce
- Initialize Sentry with AsyncioIntegration.
- Create a ThreadPoolExecutor.
- Create an async task using asyncio.create_task.
- Inside the task, call loop.run_in_executor.
Code to reproduce the issue:
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from time import sleep

import sentry_sdk
from sentry_sdk.integrations.asyncio import AsyncioIntegration


def init_sentry():
    sentry_sdk.init(
        dsn="",
        traces_sample_rate=1.0,
        integrations=[
            AsyncioIntegration(),
        ],
    )


@sentry_sdk.trace
def task_1():
    print(f"Task 1 thread ID: {threading.get_ident()}")
    sleep(0.5)


async def main():
    init_sentry()
    print(f"Main thread ID: {threading.get_ident()}")
    executor = ThreadPoolExecutor()
    for i in range(3):
        task = asyncio.create_task(worker_task(executor, i))
        await task


async def worker_task(executor: ThreadPoolExecutor, i: int):
    with sentry_sdk.start_transaction(
        op="function", name="main_thread"
    ):
        print(f"\nWorker task {i} thread ID: {threading.get_ident()}")
        loop = asyncio.get_event_loop()
        await loop.run_in_executor(executor, task_1)


if __name__ == "__main__":
    asyncio.run(main())
Expected Result
Spans created in tasks that run in a ThreadPoolExecutor should be included in the transactions sent to Sentry when AsyncioIntegration is enabled.
Actual Result
Spans created in tasks that run in a ThreadPoolExecutor are missing from the transactions sent to Sentry.
Only the first execution contains all the spans. Subsequent executions are missing the spans created inside the executor.
Removing AsyncioIntegration solves the issue, but then many automatic async-related spans are missing. Our application uses asyncio but runs many CPU-intensive blocking tasks inside an executor.
I also tried Sentry 3.0.0a2 and it has the same issue.
Hey @ollipa, thanks for writing in.
I think the reason why this doesn't work out of the box is that the thread pool might reuse threads. Important SDK metadata that governs things like span relationships lives on scopes, and we by default propagate scopes only when a new thread is started -- this is likely why the first transaction has the span attached.
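For illustration, the thread reuse itself is easy to see without Sentry. This is a minimal stdlib-only sketch; the `run_jobs` helper is a made-up name for this example and is not part of the SDK. It records which thread handles each job submitted to a single-worker pool:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_jobs(n: int) -> list[int]:
    """Submit n jobs one after another and return the worker thread ID of each."""
    with ThreadPoolExecutor(max_workers=1) as executor:
        return [executor.submit(threading.get_ident).result() for _ in range(n)]


thread_ids = run_jobs(3)
# The pool starts one worker thread for the first job and reuses it for the
# rest, so anything that only happens at thread start happens exactly once.
print(f"jobs: {len(thread_ids)}, distinct worker threads: {len(set(thread_ids))}")
```

All three jobs run on the same worker thread, so state that is set up only when a thread starts is not refreshed for the later jobs.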
You can try something like this to manually propagate the scope each time:
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from time import sleep

import sentry_sdk
from sentry_sdk.integrations.asyncio import AsyncioIntegration
from sentry_sdk.integrations.threading import ThreadingIntegration


def init_sentry():
    sentry_sdk.init(
        traces_sample_rate=1.0,
        integrations=[
            AsyncioIntegration(),
        ],
    )


@sentry_sdk.trace
def task_1():
    print(f"Task 1 thread ID: {threading.get_ident()}")
    sleep(0.5)


def task_1_with_context(isolation_scope, current_scope):
    # Set the correct scopes before executing task_1
    with sentry_sdk.scope.use_isolation_scope(isolation_scope.fork()):
        with sentry_sdk.scope.use_scope(current_scope.fork()):
            task_1()


async def main():
    init_sentry()
    print(f"Main thread ID: {threading.get_ident()}")
    executor = ThreadPoolExecutor()
    for i in range(3):
        task = asyncio.create_task(worker_task(executor, i))
        await task


async def worker_task(executor: ThreadPoolExecutor, i: int):
    with sentry_sdk.start_transaction(op="function", name=f"main_thread_{i}"):
        print(f"\nWorker task {i} thread ID: {threading.get_ident()}")
        isolation_scope = sentry_sdk.get_isolation_scope()
        current_scope = sentry_sdk.get_current_scope()
        loop = asyncio.get_event_loop()
        await loop.run_in_executor(executor, task_1_with_context, isolation_scope, current_scope)


if __name__ == "__main__":
    asyncio.run(main())
At least in this example it correctly attaches the span to each transaction. Let me know if this works for you.
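The same pattern of capturing state on the event loop side and re-activating it in the worker can be sketched with the standard library alone. This is only an analogy and not how the SDK is implemented: `current_trace` is a hypothetical ContextVar standing in for per-transaction tracing state, and `read_trace` stands in for the traced function.

```python
import asyncio
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for per-transaction tracing state (not a Sentry object).
current_trace = contextvars.ContextVar("current_trace", default="none")


def read_trace() -> str:
    """Return whatever tracing state is visible in the calling thread."""
    return current_trace.get()


async def worker(executor: ThreadPoolExecutor, i: int) -> tuple[str, str]:
    current_trace.set(f"trace-{i}")
    loop = asyncio.get_running_loop()
    # Plain call: run_in_executor does not hand the coroutine's context to
    # the worker thread, so the value set above is not visible there.
    plain = await loop.run_in_executor(executor, read_trace)
    # Explicit propagation: run the function inside a copy of this context.
    ctx = contextvars.copy_context()
    propagated = await loop.run_in_executor(executor, ctx.run, read_trace)
    return plain, propagated


async def main() -> list[tuple[str, str]]:
    with ThreadPoolExecutor(max_workers=1) as executor:
        return [await asyncio.create_task(worker(executor, i)) for i in range(3)]


results = asyncio.run(main())
for i, (plain, propagated) in enumerate(results):
    print(f"task {i}: plain={plain!r} propagated={propagated!r}")
```

Only the explicitly propagated call sees the value set by its own task. Passing the state on every call, rather than relying on what the worker thread picked up when it was created, is the same idea as forking and activating the scopes per call in the workaround above.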
Thank you @sentrivana, your workaround fixes the issue. I created this function, which can be used in place of run_in_executor calls:
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, TypeVar

import sentry_sdk

T = TypeVar("T")


async def run_in_executor_with_tracing(
    loop: asyncio.AbstractEventLoop,
    executor: ThreadPoolExecutor,
    func: Callable[..., T],
    *args: Any,
) -> T:
    """
    Run a function in executor with proper Sentry tracing context.
    """
    # Capture the caller's scopes here, on the event loop thread.
    isolation_scope = sentry_sdk.get_isolation_scope()
    current_scope = sentry_sdk.get_current_scope()

    def func_with_context(*func_args: Any) -> T:
        # Activate forks of the captured scopes in the worker thread.
        with sentry_sdk.scope.use_isolation_scope(isolation_scope.fork()):
            with sentry_sdk.scope.use_scope(current_scope.fork()):
                return func(*func_args)

    return await loop.run_in_executor(executor, func_with_context, *args)
As this has been resolved with the workaround, I will close the issue.