kafka icon indicating copy to clipboard operation
kafka copied to clipboard

KAFKA-7016: Not hide the stack trace for ApiException

Open guozhangwang opened this issue 5 years ago • 9 comments

Unit tests passed locally, do not think it has any compatibility breakage.

Committer Checklist (excluded from commit message)

  • [ ] Verify design and implementation
  • [ ] Verify test coverage and CI build status
  • [ ] Verify documentation (including upgrade notes)

guozhangwang avatar Nov 08 '19 21:11 guozhangwang

@hachikuji

guozhangwang avatar Nov 08 '19 21:11 guozhangwang

I think this is a good change. Probably the only case it may have made sense is some of the broker produce/fetch error handling. It's probably worth taking a look at the server usages to see if we can either replace them with singletons or just not use exceptions in the first place for common cases.

hachikuji avatar Nov 09 '19 00:11 hachikuji

@hachikuji I've checked 1) replica fetcher session handler for consume and 2) txn marker response handler for produce, neither of them is leveraging on raising an ApiException based on the response error code. So I think we actually do not throw ApiException on broker side, or are you referring to different things?

guozhangwang avatar Nov 09 '19 02:11 guozhangwang

@guozhangwang Seem the Jira number is not correct?

mjsax avatar Nov 10 '19 07:11 mjsax

Thanks for pointing it out @mjsax , good catch! Fixed the number.

guozhangwang avatar Nov 11 '19 22:11 guozhangwang

@guozhangwang I'm troubleshooting a broker side issue by following the exception

2021/02/05 04:34:45.302 [ZkReplicaStateMachine] [ReplicaStateMachine controllerId=31862] Controller moved to another broker when moving some replic\as to OfflineReplica state
org.apache.kafka.common.errors.ControllerMovedException: Controller epoch zkVersion check fails. Expected zkVersion = 139

The troubleshooting is very painful without the stacktrace. When do you think we can get this PR merged?

gitlw avatar Feb 09 '21 02:02 gitlw

Can we PLEASE get this merged? It is incredibly difficult to track down problems in code that calls the clients, if we can't tell where the call is made from.

garyrussell avatar Jul 01 '21 20:07 garyrussell

what's up with this PR? it seems easy.

kkolyan avatar Aug 02 '22 09:08 kkolyan

Hello @gitlw @garyrussell @kkolyan sorry for getting late on you. There's a concern which is not recorded on the PR here, which is that printing the stack trace on each ApiException may cause a log flooding on the broker side. @ijuma originally raised it so he may provide some more context.

guozhangwang avatar Aug 03 '22 18:08 guozhangwang

@guozhangwang If that is a concern, how about making it configurable? For example consider the following:

    public static boolean INCLUDE_STACK_TRACES = "true".equals(System.getProperty("kafka.include.stack.traces"));

    /* avoid the expensive and useless stack trace for api exceptions */
    @Override
    public Throwable fillInStackTrace() {
        return INCLUDE_STACK_TRACES ? super.fillInStackTrace() : this;
    }

That would default to the current behaviour, and no log flooding would be possible. On the other hand, people who want stack traces from Kafka APIs could just set the system property "kafka.include.stack.traces" to "true", and they'd get them – whereas now, they'd have to patch the Kafka client, or wrap every call to Kafka with try-catch and throw a new exception with the stack included. (I suggest not making it final, so people can still change the value at runtime.)

I would also really like to see some solution to this merged. Not having stack traces makes diagnosing issues harder.

skissane avatar Dec 01 '22 05:12 skissane