Camel-K Tracing Trait error

Open catshout opened this issue 4 years ago • 7 comments

I installed the following on a GKE cluster:

  • Camel-K 1.6
  • the Jaeger Operator, following https://github.com/jaegertracing/jaeger-operator

The Jaeger operator starts fine and the UI is reachable.

Afterwards I ran:

kamel run --trait tracing.enabled=true --trait tracing.auto=true hello.groovy

hello.groovy looks like

from('timer:tick?period=3000')
    .setBody().constant('Hello world from Camel K')
    .to('log:info')

The log of the Camel-K route periodically shows:

[1] at io.jaegertracing.thrift.internal.senders.ThriftSender.flush(ThriftSender.java:116)
[1] at io.jaegertracing.internal.reporters.RemoteReporter$FlushCommand.execute(RemoteReporter.java:160)
[1] at io.jaegertracing.internal.reporters.RemoteReporter$QueueProcessor.run(RemoteReporter.java:182)
[1] at java.base/java.lang.Thread.run(Thread.java:829)
[1] Caused by: io.jaegertracing.internal.exceptions.SenderException: Could not send 1 spans
[1] at io.jaegertracing.thrift.internal.senders.UdpSender.send(UdpSender.java:86)
[1] at io.jaegertracing.thrift.internal.senders.ThriftSender.flush(ThriftSender.java:114)
[1] ... 3 more
[1] Caused by: org.apache.thrift.transport.TTransportException: Cannot flush closed transport
[1] at io.jaegertracing.thrift.internal.reporters.protocols.ThriftUdpTransport.flush(ThriftUdpTransport.java:151)
[1] at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
[1] at org.apache.thrift.TServiceClient.sendBaseOneway(TServiceClient.java:66)
[1] at io.jaegertracing.agent.thrift.Agent$Client.send_emitBatch(Agent.java:70)
[1] at io.jaegertracing.agent.thrift.Agent$Client.emitBatch(Agent.java:63)
[1] at io.jaegertracing.thrift.internal.senders.UdpSender.send(UdpSender.java:84)
[1] ... 4 more
[1] Caused by: java.net.PortUnreachableException: ICMP Port Unreachable
[1] at java.base/java.net.PlainDatagramSocketImpl.send(Native Method)
[1] at java.base/java.net.DatagramSocket.send(DatagramSocket.java:695)
[1] at io.jaegertracing.thrift.internal.reporters.protocols.ThriftUdpTransport.flush(ThriftUdpTransport.java:149)
[1] ... 9 more

catshout • Sep 14 '21 12:09

There might be an incompatibility. The operator should be able to find the Jaeger instance and configure tracing automatically.

@catshout did you install Jaeger from master, or did you use a specific version?

nicolaferraro • Sep 14 '21 13:09

@nicolaferraro I did install Jaeger from master.

catshout • Sep 14 '21 14:09

I think the reason why it does not get discovered is that the example from master creates the Jaeger instance in the "observability" namespace, while the tracing trait expects to find it in the current namespace.

In these cases, you need to set the tracing endpoint manually, like:

kamel run -t tracing.enabled=true -t tracing.endpoint=http://simplest-collector.observability.svc.cluster.local:14268/api/traces hello.groovy

Or whatever endpoint your tracer is listening on.
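
For reference, the endpoint above follows the usual in-cluster DNS pattern `<service>.<namespace>.svc.cluster.local`. A minimal shell sketch, assuming the collector Service is called `simplest-collector` in the `observability` namespace and listens on port 14268 as in the command above; the `collector_endpoint` helper is hypothetical and only assembles the URL:

```shell
#!/bin/sh
# Hypothetical helper: build the in-cluster URL of a Jaeger collector Service.
# Arguments: Service name, namespace.
# Port 14268 and the /api/traces path are copied from the endpoint quoted
# above; adjust them if your collector is exposed differently.
collector_endpoint() {
  service="$1"
  namespace="$2"
  printf 'http://%s.%s.svc.cluster.local:14268/api/traces\n' "$service" "$namespace"
}

# Assumed names, taken from the example in this thread.
endpoint="$(collector_endpoint simplest-collector observability)"

# Print the command instead of running it, so the sketch has no side effects.
echo "kamel run -t tracing.enabled=true -t tracing.endpoint=${endpoint} hello.groovy"
```

If your Service name or port differs, `kubectl get svc -n observability` should list what is actually deployed.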

I think we should go into a state like "Waiting for Bindings" (wdyt @astefanutti?), where we wait for other resources to be created before launching the integration. Currently we don't signal any error and just let the runtime randomly log tracing errors.

nicolaferraro • Sep 20 '21 15:09

> I think the reason why it does not get discovered is that the example from master creates the Jaeger instance in the "observability" namespace, while the tracing trait expects to find it in the current namespace.

We could search in all namespaces in that case, or search for namespaces with specific labels if there is such a convention.

> I think we should go into a state like "Waiting for Bindings" (wdyt @astefanutti?), where we wait for other resources to be created before launching the integration. Currently we don't signal any error and just let the runtime randomly log tracing errors.

I agree it'd be better to act on missing requirements beforehand, rather than letting the runtime fail. We could introduce a Pending phase ("Waiting for Bindings" is going away with #2627), in which all the prerequisites have to be satisfied before the Integration transitions to the Deploying / Running phase, with the status of each prerequisite reported as a condition.

The problem I see with this approach is that it's a one-time check, and the prerequisites aren't monitored after that. Ideally, when a prerequisite stops being satisfied while the Integration is running, the operator should transition it to the Error phase, instead of leaving it in a CrashLoopBackOff state that we know will keep failing. Which leads to the point that the Pending phase may not really be relevant, or at least not necessary, and it might be better to reuse the Error phase any time a prerequisite is not satisfied. WDYT?

astefanutti • Sep 21 '21 08:09

In my mind, a wait or pending state doesn't fit a microservice approach. If something can't be fulfilled, I'd prefer an Error state if the reason can be determined. It's then up to an external deployment or something else to react and retry once the root cause of the error has been solved.

catshout • Sep 21 '21 13:09

> In my mind, a wait or pending state doesn't fit a microservice approach. If something can't be fulfilled, I'd prefer an Error state if the reason can be determined. It's then up to an external deployment or something else to react and retry once the root cause of the error has been solved.

@catshout thanks for the feedback. That's my opinion too, and it fits with the approach of continuously reconciling the integration state, possibly transitioning from the Running to the Error phase and vice versa each time its context changes.
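
The behaviour agreed on here could be sketched as follows. This is only an illustration in shell, not the operator's actual code: the `next_phase` helper, the `name=true|false` argument format, and the prerequisite names are all made up for the example. The point is that the same check runs on every reconcile, so an Integration can move from Running to Error and back as its prerequisites change.

```shell
#!/bin/sh
# Hypothetical helper: decide the Integration phase from its prerequisites.
# Each argument is "name=true" (satisfied) or "name=false" (not satisfied).
next_phase() {
  for prerequisite in "$@"; do
    case "$prerequisite" in
      *=true) ;;                     # satisfied, keep checking the others
      *) echo "Error"; return 0 ;;   # anything else counts as unmet
    esac
  done
  echo "Running"
}

# Evaluated on every reconcile, not only before the first deployment.
next_phase tracing-endpoint=false   # prints Error
next_phase tracing-endpoint=true    # prints Running
```

Reporting each prerequisite as its own condition, as suggested earlier in the thread, would then tell the user which of the arguments caused the Error phase.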

astefanutti • Sep 21 '21 13:09

This issue has been automatically marked as stale due to 90 days of inactivity. It will be closed if no further activity occurs within 15 days. If you think that’s incorrect or the issue should never stale, please simply write any comment. Thanks for your contributions!

github-actions[bot] • Apr 21 '22 00:04

Stale issue, please reopen if it occurs with newer Camel K versions.

squakez • Feb 02 '23 12:02