Structured Logging & Step Logging
Problem Statement
The problem is that searching through runs currently relies on full-text search to locate specific IDs or failures, or to verify that Activepieces is functioning correctly at the infrastructure level.
This feature is meant for self-hosters, not for the cloud.
Proposed Solution
I am considering leveraging an existing solution in the developer market, which involves exposing logs as structured JSON. This would enable developers to query them freely as they would query JSON.
Data Structure
I propose mapping each step to a single log entry with structured logs. The JSON structure would resemble this:
{
  "status": "SUCCESS",
  "input": {},
  "output": {},
  "step": {
    "name": "ExampleStep"
  },
  "flow": {
    "id": "flow123",
    "versionId": "v1",
    "displayName": "ExampleFlow"
  }
}
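As a sketch, the per-step entry above could be typed and emitted as one newline-delimited JSON line per step (the `StepLogEntry` and `logStep` names are illustrative, not part of the codebase):

```typescript
// Shape of the proposed per-step log entry (field names follow the
// JSON example above; the type and function names are hypothetical).
interface StepLogEntry {
  status: "SUCCESS" | "FAILED";
  input: Record<string, unknown>;
  output: Record<string, unknown>;
  step: { name: string };
  flow: { id: string; versionId: string; displayName: string };
}

// Emit one structured log line per step (newline-delimited JSON).
function logStep(entry: StepLogEntry): string {
  const line = JSON.stringify(entry);
  console.log(line);
  return line;
}

const line = logStep({
  status: "SUCCESS",
  input: {},
  output: {},
  step: { name: "ExampleStep" },
  flow: { id: "flow123", versionId: "v1", displayName: "ExampleFlow" },
});
```

Because each step is a single JSON line, any log backend that understands JSON can filter on fields like `flow.id` or `status` directly.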
Structured Logging
We should apply structured logging across the whole application: each log entry should contain a unique key to query by (for example, fileName#method), a timestamp, and meta information related to the log.
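A minimal sketch of such an application-wide entry, assuming the fileName#method key format from the proposal (`makeLogEntry` is a hypothetical helper; in practice a library like pino would attach these fields automatically):

```typescript
// Hypothetical helper: every application log entry carries a level,
// a unique location key, a message, a timestamp, and optional meta fields.
function makeLogEntry(
  level: "info" | "warn" | "error",
  key: string, // e.g. "flow-runner.ts#executeStep" (assumed key format)
  message: string,
  meta: Record<string, unknown> = {},
): string {
  return JSON.stringify({
    level,
    key,
    message,
    timestamp: new Date().toISOString(),
    ...meta,
  });
}

const entry = makeLogEntry("info", "flow-runner.ts#executeStep", "step started", {
  flowId: "flow123",
});
console.log(entry);
```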
Suggested Approach
For self-hosters, simply send the logs through a pino transport to an external logging service, or print them to the console.
Services like Grafana or Datadog.
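As a hedged sketch, the pino options for a self-hoster setup might look like the following (pino itself is not imported here, and the commented transport package name is an assumption, not a confirmed dependency):

```typescript
// Hypothetical pino options for self-hosters (configuration sketch only).
const pinoOptions = {
  level: "info",
  // pino's default destination is stdout as newline-delimited JSON;
  // the container runtime can ship those lines to Grafana Loki, Datadog, etc.
  // To push directly from the process instead, configure a transport, e.g.:
  // transport: { target: "pino-datadog-transport" }, // hypothetical package name
};
console.log(JSON.stringify(pinoOptions));
```

Relying on stdout keeps the application agnostic: the choice of Grafana vs. Datadog becomes a deployment concern rather than a code change.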
Other Options
Option 2: Provide an in-house solution with a visual logging interface where logs can be searched.
Option 3: Both of the above: a simple in-house solution, plus support for an external logging service for complex setups.
I would love to know what everyone thinks of this feature and which option is your favorite. How does the logging structure look?
Super exciting, thanks! In our case, we're using Datadog to centralise all logs (and build monitors, alerts, etc.), so I strongly favour option 1 (or 3) 😄
Do you confirm this will be (eventually) for all logs, not "just" flow execution logs? Today it seems we have a mix of various formats (JSON, key-value, unstructured), making it hard to exploit the logs. But even if it's only for runs at first, it will be very valuable.
Re. the structure, at first glance it seems OK. The key will be to (gradually) add useful context wherever we can:
- In your example, I would add a `runId` to be able to correlate all logs for a specific run, etc.
- You probably also need a generic `message` attribute, as well as an `error` (or similar) attribute for error codes / stack traces.
But this is probably something we can iterate on
We are going to develop this feature primarily for self-hosters, as it concerns the infrastructure level rather than the end user.
Here’s what we are going to do:
- Change all logging to be structured.
- Ensure all logs contain a timestamp, a unique key for the log location, and the log level.
- Log step-by-step in flow runs so they can be queried.
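The three steps above could be sketched as a single helper (all names here are hypothetical; the `runId` field follows the review feedback earlier in the thread):

```typescript
// Sketch combining the three points: structured JSON entries that always
// carry a timestamp, a unique location key, and a level, with flow-run
// context merged in so individual steps can be queried later.
type Level = "info" | "warn" | "error";

function logEntry(
  level: Level,
  key: string, // unique log location, e.g. "flow-worker.ts#executeStep"
  message: string,
  context: Record<string, unknown> = {},
): string {
  const line = JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    key,
    message,
    ...context,
  });
  console.log(line);
  return line;
}

// Step-by-step logging inside a flow run:
const stepLine = logEntry("info", "flow-worker.ts#executeStep", "step finished", {
  status: "SUCCESS",
  step: { name: "ExampleStep" },
  flow: { id: "flow123", versionId: "v1", displayName: "ExampleFlow" },
  runId: "run456", // correlation id suggested in the review feedback
});
```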
For integrations, we assume you can ship the container logs to any third-party service, such as Datadog, for searching.