8325465: JFR: Context filtering
Preliminary proof-of-concept (PoC) work for JFR contextual events.
(Based on initial discussion in https://github.com/skogsluft/jdk-skogsluft/discussions/4)
Contextual Events
Contextual events, as the name suggests, provide additional context to other events. Examples include the tracing context in distributed tracing, a transaction context, or work unit identification.
A contextual event is thread specific: it provides context only to events committed on the same thread after the contextual event was started (begin() method) but before it has finished (end() method).
There may be multiple contextual events active for a thread. However, they must form a stack: e.g. if an event CtxA is opened before an event CtxB, they must be closed in reverse order, first CtxB and only then CtxA. If the events overlap without being properly nested, the behaviour is undefined.
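The nesting rule can be illustrated with a minimal sketch that uses the existing jdk.jfr.Event API and try/finally blocks to guarantee reverse-order closing. The Contextual annotation declared here is a hypothetical local stand-in for the proposed one, and the event names and fields are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

class NestedContexts {
    // Hypothetical stand-in for the proposed @Contextual annotation, declared
    // locally so the sketch compiles on a JDK without this change.
    @interface Contextual {}

    @Name("example.CtxA")
    @Contextual
    static class CtxA extends Event {
        @Label("Trace ID")
        String traceId;
    }

    @Name("example.CtxB")
    @Contextual
    static class CtxB extends Event {
        @Label("Work Unit")
        String workUnit;
    }

    // Opens CtxA, then CtxB, and closes them in reverse order.
    // Returns the order in which the contexts were closed.
    static List<String> run() {
        List<String> closed = new ArrayList<>();
        CtxA ctxA = new CtxA();
        ctxA.traceId = "trace-1";
        ctxA.begin();
        try {
            CtxB ctxB = new CtxB();
            ctxB.workUnit = "unit-1";
            ctxB.begin();
            try {
                // Events committed here, on this thread, fall inside both contexts.
            } finally {
                ctxB.end();
                ctxB.commit();
                closed.add("CtxB");
            }
        } finally {
            ctxA.end();
            ctxA.commit();
            closed.add("CtxA");
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Using try/finally keeps the contexts properly nested even when the guarded work throws.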
Design
Contextual annotation
A contextual event will be marked with the @Contextual annotation. This annotation will be a simple indication that this particular event type is supposed to provide context to other events, so that tooling can handle it as such.
All custom fields of such annotated event type will then constitute the context.
Context driven behaviour
Although having the @Contextual annotation will allow the tooling to associate the context with other JFR events, there are more ways contextual events can be utilized.
Conditionally emit events
Contextual events can be used to guard the production of events which are too costly to emit unconditionally and for which duration thresholds would introduce too strong a bias. An example is the JavaMonitorWait event.
If left unchecked, the emission rate of JavaMonitorWait can overwhelm the recording. What's worse, the majority of the recorded events will provide very little additional information. Turning on the duration threshold improves the situation but introduces bias: JFR will not be able to point out that too much time is spent waiting on a lock if each individual wait is shorter than the threshold. In addition, this event type might be frequently emitted from thread pools where threads are just waiting for work.
If the emission is instead bound to the presence of a context (contextual event) that is activated only while important work is being done (what counts as important work will usually be defined by the user), the recording can focus on the fine-grained details of the application's behaviour that matter.
Record only activated context
A context (contextual event) is considered activated when at least one other event is committed on the same thread between the calls to begin() and end() of the contextual event. We can also think of the context as being 'triggered' by the regular events.
The concept of an 'activated' context helps lower the overhead of recording the context. For example, distributed tracers with context propagation can generate millions of contextual events per minute for certain frameworks (async and reactive ones are pretty notorious). This creates huge pressure both when the recording is written and when it needs to be processed. Most of these events will be useless because there are no events the context could be applied to.
Controlling the behaviour via settings
The proposal is to use the standard JFR event settings mechanism to affect the behaviour of both contextual and regular events.
There will be a new setting called select with the following permitted values:
- if-context: the regular event will be emitted only if a context is present
- if-triggered: the contextual event will be emitted only if the context is triggered
- all: no context related restrictions are applied
The if-context option is valid only for non-contextual events.
The if-triggered option is valid only for contextual events.
The all option is valid for any event.
If an invalid option is provided, JFR will log a warning and the setting will be set to all.
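The validity rules above can be summarised in a small sketch. This is an illustrative model only, not the code from the change: the class and method names are hypothetical, and the warning is printed to standard error rather than through JFR's logging.

```java
class SelectSettingModel {
    // Returns the effective value of 'select' for an event type, falling back
    // to "all" (with a warning) when the requested value is not valid for it.
    static String effectiveValue(String requested, boolean contextualEvent) {
        if ("all".equals(requested)) {
            return "all";
        }
        if ("if-context".equals(requested) && !contextualEvent) {
            return "if-context"; // valid only for non-contextual events
        }
        if ("if-triggered".equals(requested) && contextualEvent) {
            return "if-triggered"; // valid only for contextual events
        }
        System.err.println("warning: invalid select value '" + requested + "', using 'all'");
        return "all";
    }

    public static void main(String[] args) {
        System.out.println(effectiveValue("if-context", false));
        System.out.println(effectiveValue("if-context", true));
    }
}
```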
The select setting is to be used in conjunction with other filtering mechanisms, like threshold.
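As a rough sketch of how select might be combined with threshold, the settings could be expressed in the "event-name#setting-name" map form that JFR already accepts. The select keys only have an effect on a JDK containing this change. The example.TraceContext event name and the 10 ms threshold are assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ContextSettings {
    // Builds a settings map in JFR's "<event-name>#<setting-name>" form.
    static Map<String, String> build() {
        Map<String, String> settings = new LinkedHashMap<>();
        // Regular event: emit only inside a context AND above the threshold.
        settings.put("jdk.JavaMonitorWait#enabled", "true");
        settings.put("jdk.JavaMonitorWait#threshold", "10 ms");
        settings.put("jdk.JavaMonitorWait#select", "if-context");
        // Hypothetical contextual event: emit only if it was triggered.
        settings.put("example.TraceContext#enabled", "true");
        settings.put("example.TraceContext#select", "if-triggered");
        return settings;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Such a map can be passed to Recording.setSettings(Map) before the recording is started.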
Implementation
@Contextual annotation
The annotation implementation is pretty straightforward and there is nothing special going on there.
Activated context
In order to support selective emission of the contextual events only when they are activated, the event class must be instrumented and a synthetic field named ^ctxOffset must be inserted there.
The field is used to track the number of events written while this context is open. The actual number does not matter, we just need to make sure we can tell there is at least one written event.
This information is then used in the shouldCommit() method of the contextual event type, which needs to be changed to consult the ^ctxOffset field and return false if that field is 0. This applies only if the event's settings contain select=if-triggered. Otherwise, the behaviour of shouldCommit() is not affected.
The ^ctxOffset field is updated from EventWriter, which increments it on each new event commit.
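The mechanism can be modelled in plain Java. This is a simplified sketch, not the instrumented code: the ctxOffset field stands in for the synthetic ^ctxOffset field, onEventWritten() stands in for the update done from EventWriter, and the per-thread deque used to find the open contexts is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ActivatedContextModel {
    // Assumption: open contexts are tracked per thread so the writer can find them.
    private static final ThreadLocal<Deque<ContextEvent>> OPEN =
            ThreadLocal.withInitial(ArrayDeque::new);

    static final class ContextEvent {
        private final boolean ifTriggered; // models select=if-triggered
        private int ctxOffset;             // models the synthetic ^ctxOffset field

        ContextEvent(boolean ifTriggered) {
            this.ifTriggered = ifTriggered;
        }

        void begin() {
            OPEN.get().push(this);
        }

        void end() {
            OPEN.get().pop(); // assumes contexts are properly nested
        }

        // With if-triggered, commit only if at least one event was written
        // while this context was open; otherwise behave as before.
        boolean shouldCommit() {
            return !ifTriggered || ctxOffset > 0;
        }
    }

    // Models the writer bumping the counter of every open context on this thread.
    static void onEventWritten() {
        for (ContextEvent ctx : OPEN.get()) {
            ctx.ctxOffset++;
        }
    }

    public static void main(String[] args) {
        ContextEvent triggered = new ContextEvent(true);
        triggered.begin();
        onEventWritten();
        triggered.end();

        ContextEvent idle = new ContextEvent(true);
        idle.begin();
        idle.end();

        System.out.println(triggered.shouldCommit() + " " + idle.shouldCommit());
    }
}
```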
Filtered events
A new SelectorSetting will check for the presence of a context before committing an event that has this setting set to the if-context value. The context presence is determined by a per-thread 'context count'. Each time a contextual event calls begin() the count is incremented, and it is decremented again on the call to end().
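The per-thread counting can be sketched as follows. This is a plain-Java model of the idea, not the SelectorSetting implementation. The class and method names are hypothetical.

```java
class ContextCountModel {
    // Per-thread number of currently open contextual events.
    private static final ThreadLocal<int[]> COUNT =
            ThreadLocal.withInitial(() -> new int[1]);

    // Called when a contextual event calls begin().
    static void contextBegin() {
        COUNT.get()[0]++;
    }

    // Called when a contextual event calls end().
    static void contextEnd() {
        COUNT.get()[0]--;
    }

    // A regular event with select=if-context is committed only while at least
    // one context is open on the current thread; other values do not filter.
    static boolean shouldCommit(String select) {
        return !"if-context".equals(select) || COUNT.get()[0] > 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldCommit("if-context")); // no context open yet
        contextBegin();
        System.out.println(shouldCommit("if-context")); // inside a context
        contextEnd();
    }
}
```

Because the count is thread-local, a context opened on one thread never enables events committed on another.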
This is available for both built-in (native) and user-defined JFR events, as long as they are not periodic. Periodic events are not a good fit for the selector because they are committed on a dedicated thread, which should not be using any conditional contexts.
Progress
- [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [ ] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8325465: JFR: Context filtering (Enhancement - P3)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18689/head:pull/18689
$ git checkout pull/18689
Update a local copy of the PR:
$ git checkout pull/18689
$ git pull https://git.openjdk.org/jdk.git pull/18689/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18689
View PR using the GUI difftool:
$ git pr show -t 18689
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18689.diff