Failed span runs forever in UI

Open tonybaloney opened this issue 1 year ago • 3 comments

Description

Even though this span failed because of an exception, in the UI it shows as "ongoing" for over an hour. The process isn't even running anymore. This persists after refreshing as well.

screenshot 2024-05-01 at 16 45 12

tonybaloney avatar May 01 '24 06:05 tonybaloney

Thanks for reporting.

Any chance you could share your code with us?

Do you happen to have a generator with a break? I've seen similar things with GeneratorExit.
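
For illustration, here is a minimal sketch of the generator pattern in question. It uses only the standard library; `fake_span` and `stream_chunks` are hypothetical stand-ins for a real span context manager and a streaming response, not Logfire or OpenAI APIs. It shows that breaking out of the loop leaves the generator suspended inside the `with` block, so the "end" step only runs once the generator is closed (explicitly or by garbage collection), at which point `GeneratorExit` is raised inside it.

```python
from contextlib import contextmanager

events = []  # records what our stand-in span does


@contextmanager
def fake_span(name):
    # Hypothetical stand-in for a tracing span: logs start and end.
    events.append(f"start {name}")
    try:
        yield
    finally:
        events.append(f"end {name}")


def stream_chunks():
    # The span stays open for as long as the generator is suspended.
    with fake_span("stream"):
        for i in range(3):
            yield i


gen = stream_chunks()
for chunk in gen:
    break  # stop consuming early; the generator is NOT closed here

# We still hold a reference to `gen`, so it remains suspended mid-span.
ended_before_close = "end stream" in events

# close() raises GeneratorExit inside the generator, which unwinds the
# `with` block and finally ends the span.
gen.close()
ended_after_close = "end stream" in events
```

If nothing ever closes the generator before the process exits, the span's end may never be recorded, which would match the symptom described.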

samuelcolvin avatar May 01 '24 07:05 samuelcolvin

It happened in the same span as the crash I reported in #62 so maybe that'll be reproducible?

screenshot 2024-05-01 at 18 10 01

Here's the code I used https://github.com/tonybaloney/azure-search-openai-demo/commit/a17664de8da7e96b484b84b22dd6db477eb77cb0

tonybaloney avatar May 01 '24 08:05 tonybaloney

I've seen something similar with OpenAI where I was breaking out of a generator response. I'm sure we can come up with a minimal reproduction and then fix.

Some background:

By default, OpenTelemetry only sends data about a span when it closes. That makes sense in some scenarios, but it means you can't see anything until the span closes, which is obviously not great when:

  1. you're watching your application live
  2. spans take more than about a second

So we send a "start span" (basically a zero-duration span at the start of each span) so we can show it in the UI almost instantly. Then, when the span finishes, we send a span as usual that wraps the code in question.

What's happening here is that the start span is sent, but somehow the end/standard span is never sent.
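
A rough sketch of why that leaves the span looking "ongoing": assume the UI receives a flat list of records and treats any span that has a start record but no end record as still running. The `SpanRecord` type and `span_status` function below are hypothetical names made up for this illustration; they are not the actual Logfire data model or UI logic.

```python
from dataclasses import dataclass


@dataclass
class SpanRecord:
    span_id: str
    kind: str        # "start" (zero-duration marker) or "end" (full span)
    duration: float  # seconds; 0.0 for start records


def span_status(records):
    """Return {span_id: "ongoing" | "finished"} from the records seen so far."""
    status = {}
    for record in records:
        if record.kind == "end":
            # The full span arrived, so the span is known to be done.
            status[record.span_id] = "finished"
        else:
            # Only mark as ongoing if no end record was seen earlier.
            status.setdefault(record.span_id, "ongoing")
    return status


records = [
    SpanRecord("a", "start", 0.0),
    SpanRecord("a", "end", 1.5),
    SpanRecord("b", "start", 0.0),  # end record never sent
]
statuses = span_status(records)
```

Under this assumption, span `b` stays "ongoing" indefinitely, regardless of whether the process that created it is still alive.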

What's weird is that the AttributeError you reported in #62 should not be a problem.

samuelcolvin avatar May 01 '24 10:05 samuelcolvin

The UI now handles this better.

alexmojaki avatar Jun 03 '24 19:06 alexmojaki