
Structured Logging & Step Logging

abuaboud opened this issue 9 months ago · 2 comments

Problem Statement

The problem is that it is currently hard to search through runs with full-text search to locate specific IDs or failures, or to verify that activepieces is functioning correctly at the infrastructure level.

This feature is meant for self-hosters, not for the cloud.

Proposed Solution

I am considering leveraging existing solutions in the developer market by exposing logs as structured JSON. This would enable developers to query them freely with any JSON-aware tooling.

Data Structure

I propose mapping each step to a single log entry with structured logs. The JSON structure would resemble this:

{
  "status": "SUCCESS",
  "input": {},
  "output": {},
  "step": {
    "name": "ExampleStep"
  },
  "flow": {
    "id": "flow123",
    "versionId": "v1",
    "displayName": "ExampleFlow"
  }
}
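
For illustration, here is a minimal sketch of how a worker could emit one such entry per step, assuming a pino logger (the interface and function names are hypothetical, not an existing activepieces API):

import pino from 'pino'

const logger = pino()

// Shape of one per-step log entry, mirroring the JSON above.
interface StepLogEntry {
  status: 'SUCCESS' | 'FAILED'
  input: Record<string, unknown>
  output: Record<string, unknown>
  step: { name: string }
  flow: { id: string; versionId: string; displayName: string }
}

// Emits a single structured JSON line for an executed step.
function logStepResult(entry: StepLogEntry): void {
  logger.info(entry, 'step executed')
}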

Structured Logging

We should apply structured logging across the whole application, where each log contains a unique key to query by (for example, fileName#method), a timestamp, and meta information related to the log.
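
For example, a single application log line with these fields could be emitted like this (a sketch assuming pino; the key and meta field names are illustrative):

import pino from 'pino'

const logger = pino({
  timestamp: pino.stdTimeFunctions.isoTime, // ISO-8601 timestamp on every line
  base: { service: 'activepieces' },        // meta attached to every log
})

// 'key' follows the fileName#method convention described above.
logger.info(
  { key: 'flow-worker#executeFlow', projectId: 'proj789' },
  'picked up flow run from queue',
)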

Suggested Approach (Option 1)

For self-hosters, simply send the logs through a pino transport to an external logging service, or print them to the console.

Examples of such services: Grafana, Datadog.
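
As a sketch of the transport wiring, assuming the community pino-loki transport for Grafana Loki is installed (the options shown and the LOKI_HOST variable are illustrative):

import pino from 'pino'

// Hypothetical environment variable pointing at the logging backend.
const lokiHost = process.env.LOKI_HOST

// Route logs through a transport when an external backend is
// configured; otherwise print JSON lines to stdout so the container
// runtime can ship them.
const logger = pino(
  lokiHost
    ? { transport: { target: 'pino-loki', options: { host: lokiHost } } }
    : {},
)

logger.info({ key: 'app#bootstrap' }, 'activepieces started')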

Other Options

Option 2: Provide an in-house solution with a visual logging interface where logs can be searched.

Option 3: Both of the above, where the simple case is handled in-house and the complex case uses an external logging service.

I would love to know what everyone thinks of this feature and which option is your favorite. How does the logging structure look?

abuaboud · May 09 '24 19:05

Super exciting, thanks! In our case, we're using Datadog to centralise all logs (and build monitors, alerts, etc.), so I strongly favour option 1 (or 3) 😄

Do you confirm this will be (eventually) for all logs, not "just" flow execution logs? Today it seems we have a mix of formats (JSON, key-value, unstructured), which makes it hard to exploit the logs. But even if it's only for runs at first, it will be very valuable.

Re. the structure, at first glance it seems OK. The key will be to (gradually) add useful context wherever we can:

  • In your example, I would add a runId to be able to correlate all logs for a specific run, etc.
  • You probably also need a generic message attribute, as well as an error (or similar) for error codes / stack traces (see the sketch below).

But this is probably something we can iterate on.
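
For illustration, an extended entry with those additions might look like this (all added field names are just suggestions):

// Sketch of the extended entry; message, error and runId are the
// suggested additions, and every name here is illustrative.
const entry = {
  status: 'FAILED',
  message: 'Step threw an error',               // generic human-readable message
  error: { code: 'STEP_FAILED', stack: '...' }, // error code / stack trace
  runId: 'run456',                              // correlates all logs of one run
  input: {},
  output: {},
  step: { name: 'ExampleStep' },
  flow: { id: 'flow123', versionId: 'v1', displayName: 'ExampleFlow' },
}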

AdamSelene · May 13 '24 19:05

We are going to develop this feature primarily for self-hosters, as it provides observability at the infrastructure level rather than for the end user.

Here’s what we are going to do:

  • Change all logging to be structured.
  • Ensure all logs contain a timestamp, a unique key for the log location, and the log level.
  • Log step-by-step in flow runs so they can be queried.

For integrations, we assume you can connect the container logs to any third-party service for searching, such as Datadog.

abuaboud · May 28 '24 14:05