Logs: BeforeLogs Sync with Native
Description
I created this issue to discuss the best solution for the following problem.
Problem:
- Users can't filter logs on beforeLogs that were created by the native SDKs.
- Each SDK has their cache flow logic, having two queues for flushing (one in JavaScript and another on the native side)
Solution 1:
- Move logs from beforelog on the Native side to the JavaScript side and always return null/empty on the native side.
- Drop the logs on the native side so we keep a single queue on the JavaScript side.
Problems:
- The native side will emit internal logs saying that logs are being dropped by beforelog.
- Sentry JavaScript exposes _INTERNAL_captureSerializedLog that could be used to skip the log processing on the JavaScript SDK, but doesn't expose a way of converting Log to serializedLog, requiring code duplication.
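To make Solution 1 concrete, here is a minimal sketch of the flow. Everything in it is hypothetical: BridgeLog, nativeBeforeLog, receiveNativeLog and jsQueue are illustration-only names, the native hook is modelled as a plain TypeScript function, and none of this is an existing SDK API.

```typescript
// Hypothetical shape for a log crossing the bridge; not the real SDK type.
interface BridgeLog {
  level: string;
  message: string;
  attributes?: Record<string, unknown>;
}

type BeforeLog = (log: BridgeLog) => BridgeLog | null;

// The single queue, living on the JavaScript side only.
const jsQueue: BridgeLog[] = [];

// Stand-in for the native beforeLog hook: forward the log to JavaScript
// and always return null so the native batcher never enqueues it.
function nativeBeforeLog(log: BridgeLog, forwardToJs: (log: BridgeLog) => void): null {
  forwardToJs(log);
  return null;
}

// JavaScript side: run the user's beforeLog, then enqueue whatever survives.
function receiveNativeLog(log: BridgeLog, userBeforeLog?: BeforeLog): void {
  const processed = userBeforeLog ? userBeforeLog(log) : log;
  if (processed !== null) {
    jsQueue.push(processed);
  }
}
```

With this shape the user's callback only has to exist in JavaScript, at the cost of the native side reporting every forwarded log as dropped.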
Solution 2:
- When beforeLog is called on the native layer, call the JavaScript beforeLog with the Native log and wait for the JavaScript response with the changes done by the end user.
Problems:
- Doesn't fix the two queues for sending Logs.
- The native layer always has to wait for a response from the JavaScript layer when calling BeforeLog.
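A rough sketch of the round trip in Solution 2, under the assumption that the bridge call can be modelled as a Promise. NativeLog, callJsBeforeLog, nativeBeforeLogHook and nativeQueue are made-up names for illustration, and the native layer is again written in TypeScript only to keep the example self-contained.

```typescript
// Hypothetical log shape; not the real SDK type.
interface NativeLog {
  level: string;
  message: string;
}

type JsBeforeLog = (log: NativeLog) => NativeLog | null;

// Stand-in for the bridge: resolves asynchronously, as a cross-layer call would.
function callJsBeforeLog(log: NativeLog, jsBeforeLog: JsBeforeLog): Promise<NativeLog | null> {
  return Promise.resolve().then(() => jsBeforeLog(log));
}

// The native queue stays in place, so there are still two queues overall.
const nativeQueue: NativeLog[] = [];

// Native hook: wait for the JavaScript answer before enqueueing natively.
async function nativeBeforeLogHook(log: NativeLog, jsBeforeLog: JsBeforeLog): Promise<void> {
  const result = await callJsBeforeLog(log, jsBeforeLog);
  if (result !== null) {
    nativeQueue.push(result);
  }
}
```

The await in nativeBeforeLogHook is the wait described in the second problem above: every native log is held until JavaScript answers.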
It would be nice to have some feedback on which solution seems like the better path forward, or if there’s a third alternative I might be missing.
Users can't filter logs created by the native SDKs.
Imo the native SDK should emit logs as "react-native". Fwiw, you can filter on these attributes, but it is weird having two different values when you as a customer expect to only interact with the RN SDK. If this causes issues for usage analytics, we should emit SDK data as sentry._internal to not pollute the UI cc @AbhiPrasad
Each SDK has their cache flow logic, having two queues for flushing (one in JavaScript and another on the native side)
Depending on which SDK actually emits the logs to Sentry, I would proxy to it.
JavaScript SDK, but doesn't expose a way of converting Log to SerializedLog
I added a callback to _INTERNAL_captureLog in @sentry/core. We use it in the Electron SDK so we can use all the existing logic, but then capture the SerializedLog at the last minute and do what we want with it.
https://github.com/getsentry/sentry-javascript/blob/f3f0ba31e7894158d0e9de85989a07cd61d39304/packages/core/src/logs/exports.ts#L117-L122
export function _INTERNAL_captureLog(
  beforeLog: Log,
  client: Client | undefined = getClient(),
  currentScope = getCurrentScope(),
  captureSerializedLog: (client: Client, log: SerializedLog) => void = _INTERNAL_captureSerializedLog,
): void {
We then use this in the renderer to capture logs and then send them to the main process: https://github.com/getsentry/sentry-electron/blob/08622caa6d09f7cb88fa6c13b349783834c99904/src/renderer/log.ts#L12-L24
function captureLog(
  level: LogSeverityLevel,
  message: ParameterizedString,
  attributes?: Log['attributes'],
  severityNumber?: Log['severityNumber'],
): void {
  _INTERNAL_captureLog(
    { level, message, attributes, severityNumber },
    getClient(),
    getCurrentScope(),
    (_: unknown, log: SerializedLog) => getIPC().sendStructuredLog(log),
  );
}
We then recapture the logs in the main process: https://github.com/getsentry/sentry-electron/blob/08622caa6d09f7cb88fa6c13b349783834c99904/src/main/ipc.ts#L168-L183
function handleLogFromRenderer(client: Client, options: ElectronMainOptionsInternal, log: SerializedLog): void {
  log.attributes = log.attributes || {};
  if (options.release) {
    log.attributes['sentry.release'] = { value: options.release, type: 'string' };
  }
  if (options.environment) {
    log.attributes['sentry.environment'] = { value: options.environment, type: 'string' };
  }
  log.attributes['sentry.sdk.name'] = { value: 'sentry.javascript.electron', type: 'string' };
  log.attributes['sentry.sdk.version'] = { value: SDK_VERSION, type: 'string' };
  _INTERNAL_captureSerializedLog(client, log);
}
_INTERNAL_captureLog
That's a good idea @timfish! I was worried about the timestamp override but then, the Log doesn't expose when it was captured so it should be good. https://github.com/getsentry/sentry-javascript/blob/f3f0ba31e7894158d0e9de85989a07cd61d39304/packages/core/src/logs/exports.ts#L187
@cleptric Sorry I meant filtering at beforeLog not at sentry.io.
Users can't filter logs on beforeLogs that were created by the native SDKs.
We also have this problem for beforeSend and beforeSendTransaction. If I'm not mistaken, we don't have this properly solved for these at the moment. What drives you to solve this now for logs?
Each SDK has their cache flow logic, having two queues for flushing (one in JavaScript and another on the native side)
Are you referring to the envelopes cache or the logs cache / logs batcher, @lucas-zimerman?
FYI, we also need to ensure that logs work for crashes and watchdog terminations for mobile SDKs. We intentionally ignored that for V1 for mobile SDKs. We're currently working on a concept. Currently, it appears that the hybrid SDKs and RN should forward logs to the native SDKs, as we continuously need to store logs on disk to make this work. With that approach, the logs cache would live in the native SDKs and would be gone on hybrid.
As you don't have access to Notion, @lucas-zimerman, here's a PDF. After we spec this out, we will open a PR to the develop docs.
Users can't filter logs on beforeLogs that were created by the native SDKs.
We also have this problem for beforeSend and beforeSendTransaction. If I'm not mistaken, we don't have this properly solved for these at the moment. What drives you to solve this now for logs?
True, we don't have beforeSend/beforeSendTransaction filters on the bridge. At the moment, most logs from React Native come from the native side, and since the log structure is simple, I thought it would be easy enough to send them over the bridge. Would you rather that users implement log filtering on each respective layer? Maybe a simpler alternative would be an ignoreLogs filter, so users won't need to interact with all layers if the only use of beforeLogs is to drop certain logs.
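As a sketch of what such an ignoreLogs filter could look like: the option does not exist today, and the matching rules below (substring match for strings, test() for regular expressions) are an assumption of mine, as are the names IgnorePattern and shouldIgnoreLog.

```typescript
// Hypothetical pattern type for an ignoreLogs option.
type IgnorePattern = string | RegExp;

// Returns true when the log message matches any pattern and should be dropped.
// Strings match as substrings; regular expressions are tested against the message.
function shouldIgnoreLog(message: string, ignoreLogs: IgnorePattern[]): boolean {
  return ignoreLogs.some((pattern) =>
    typeof pattern === 'string' ? message.includes(pattern) : pattern.test(message),
  );
}
```

For example, shouldIgnoreLog('heartbeat ok', ['heartbeat']) returns true. Since string patterns are plain data, they could in principle be handed to each layer once at init instead of calling back across the bridge per log.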
Each SDK has their cache flow logic, having two queues for flushing (one in JavaScript and another on the native side)
Are you referring to the envelopes cache or the logs cache / logs batcher, @lucas-zimerman?
The Logs Batcher.
Thank you for pointing to that documentation @philipphofmann!
Would you rather that users implement log filtering on each respective layer?
No, definitely not, because it's complicated for the user. My main question is not if we want to do this, but rather when.
A somewhat crazy idea, or perhaps not so crazy, would be to store all logs first to disk. Before sending them, we can call all beforeSendLog hooks on every layer in an order yet to be defined. We can do that on a BG thread, so we don't have a serialization overhead on the main thread.
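Roughly, that idea could look like the sketch below. It is only an illustration under assumptions: the on-disk store is modelled as an in-memory array, the hook order is simply the order of the array passed in, and StoredLog, BeforeSendLogHook, applyHooks and processStoredLogs are hypothetical names.

```typescript
// Hypothetical shape for a log that was persisted before sending.
interface StoredLog {
  level: string;
  message: string;
}

type BeforeSendLogHook = (log: StoredLog) => StoredLog | null;

// Run one log through every layer's hook in order; stop as soon as one drops it.
function applyHooks(log: StoredLog, hooks: BeforeSendLogHook[]): StoredLog | null {
  let current = log;
  for (const hook of hooks) {
    const next = hook(current);
    if (next === null) {
      return null;
    }
    current = next;
  }
  return current;
}

// Process everything read back from storage and keep only the survivors.
function processStoredLogs(logs: StoredLog[], hooks: BeforeSendLogHook[]): StoredLog[] {
  const survivors: StoredLog[] = [];
  for (const log of logs) {
    const result = applyHooks(log, hooks);
    if (result !== null) {
      survivors.push(result);
    }
  }
  return survivors;
}
```

Because this runs over already stored logs, it can happen off the main thread, and the order of the hooks array is where the "order yet to be defined" would be decided.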