Extensible federation lifecycle observability

Open dahlia opened this issue 5 months ago • 4 comments

Summary

This proposal suggests adding a FederationObserver interface to provide extensible hooks into the federation lifecycle, with the primary use case being a debug dashboard (as discussed in #319) but designed to support other observability and extensibility needs.

Motivation

While reviewing PR #319 (Real-time ActivityPub debug dashboard), we identified that the current approach requires integration hooks in the core federation system to capture real ActivityPub traffic. Rather than adding debugger-specific code to the core @fedify/fedify package, we can introduce a general observer pattern that:

  1. Enables the debug dashboard without coupling debugger code to the core package
  2. Provides extensibility for other use cases such as logging, metrics, analytics, and security auditing
  3. Maintains separation of concerns by keeping the core package focused on federation functionality
  4. Optimizes bundle size by allowing optional feature packages

Proposed API

Core Interface (in @fedify/fedify)

interface FederationObserver<TContextData> {
  onInboundActivity?(context: Context<TContextData>, activity: Activity): void | Promise<void>;
  onOutboundActivity?(context: Context<TContextData>, activity: Activity): void | Promise<void>;
}

interface FederationOptions<TContextData> {
  // ... existing options
  observers?: FederationObserver<TContextData>[];
}

Debug Implementation (in separate @fedify/debugger package)

export class DebugObserver<TContextData> implements FederationObserver<TContextData> {
  // In-memory store for captured activities (the ActivityStore from #319)
  private store = new ActivityStore();

  constructor(private options: { path?: string } = {}) {}

  onInboundActivity(context: Context<TContextData>, activity: Activity) {
    this.store.addActivity({
      direction: 'inbound',
      activity,
      timestamp: new Date()
    });
  }

  onOutboundActivity(context: Context<TContextData>, activity: Activity) {
    this.store.addActivity({
      direction: 'outbound',
      activity,
      timestamp: new Date()
    });
  }

  // Additional methods for serving debug dashboard UI
}

Usage

import { createFederation, MemoryKvStore } from '@fedify/fedify';
import { DebugObserver } from '@fedify/debugger';

const debugObserver = new DebugObserver({ path: '/__debugger__' });
const federation = createFederation({
  kv: new MemoryKvStore(),
  observers: [debugObserver],
});

Integration Points

The observers would be called at strategic points in the federation middleware:

  1. Inbound activities: In handleInbox after activity parsing but before listener execution
  2. Outbound activities: In sendActivity before activity transformation and delivery
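
As a rough illustration of what those call sites might do, here is a minimal sketch of a dispatch helper. The name notifyObservers and the simplified Context and Activity types are assumptions made for this example, not part of Fedify. It awaits every observer and collects failures instead of rethrowing them, so one misbehaving observer could not interrupt activity processing:

```typescript
// Minimal stand-ins for Fedify's Context and Activity types (assumptions for this sketch).
interface Context<TContextData> { data: TContextData }
interface Activity { id: string; type: string }

interface FederationObserver<TContextData> {
  onInboundActivity?(context: Context<TContextData>, activity: Activity): void | Promise<void>;
  onOutboundActivity?(context: Context<TContextData>, activity: Activity): void | Promise<void>;
}

type Hook = "onInboundActivity" | "onOutboundActivity";

// Hypothetical helper: invokes one hook on every registered observer.
// Synchronous throws and rejected promises are both captured and returned,
// so the caller can log them without aborting inbox or outbox handling.
async function notifyObservers<TContextData>(
  observers: readonly FederationObserver<TContextData>[],
  hook: Hook,
  context: Context<TContextData>,
  activity: Activity,
): Promise<unknown[]> {
  const results = await Promise.allSettled(
    observers.map(async (observer) => {
      // Observers that do not implement the hook are skipped.
      await observer[hook]?.(context, activity);
    }),
  );
  return results
    .filter((r): r is PromiseRejectedResult => r.status === "rejected")
    .map((r) => r.reason);
}
```

Whether observer failures should be swallowed, logged, or propagated is an open design question; the sketch only shows one possible choice.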

Benefits

  1. Decoupled design: Debug functionality lives in a separate package
  2. Extensible: Can support logging, metrics, filtering, security scanning, etc.
  3. Async support: Unlike the current ActivityTransformer, observers can perform async operations
  4. Bundle optimization: Production builds don't include debug code unless imported
  5. Multiple observers: Can register multiple observers for different purposes

Future Extensions

While starting minimal for the debug use case, the interface could be extended with additional hooks:

interface FederationObserver<TContextData> {
  // Current proposal
  onInboundActivity?(context: Context<TContextData>, activity: Activity): void | Promise<void>;
  onOutboundActivity?(context: Context<TContextData>, activity: Activity): void | Promise<void>;
  
  // Potential future additions
  onActorRequest?(context: Context<TContextData>, actor: Actor): void | Promise<void>;
  onCollectionRequest?(context: Context<TContextData>, collection: Collection): void | Promise<void>;
  onWebFingerRequest?(context: Context<TContextData>, resource: string): void | Promise<void>;
}

Relationship to PR #319

This proposal would enable the debug dashboard from #319 to be implemented as:

  1. A @fedify/debugger package that implements FederationObserver
  2. Integration with the federation router to serve debug UI at configurable paths
  3. Real-time activity capture without modifying core federation logic

The excellent work in #319 (ActivityStore, WebSocket updates, terminal interface, etc.) would be reused in this new architecture.

Implementation Plan

  1. Add FederationObserver interface to core package
  2. Add observers option to FederationOptions
  3. Integrate observer calls in federation middleware
  4. Create @fedify/debugger package using work from #319
  5. Update documentation and examples

dahlia avatar Jul 23 '25 08:07 dahlia

For the future additions, I'd also suggest hooks for when Fedify makes an outbound request for an actor, a document, or a WebFinger resource.

ThisIsMissEm avatar Jul 23 '25 21:07 ThisIsMissEm

You could probably also invert the control pattern here, by doing something like:

const federation = createFederation({
  kv: new MemoryKvStore(),
});

const debug = new DebugObserver({ path: '/__debugger__' });
debug.observe(federation);

Under the hood, the various components would just emit specific events using the standard event emitter pattern.
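
A minimal sketch of how that inversion could look. ObservableFederation, its on/emit methods, and the "inbound"/"outbound" event names are all hypothetical stand-ins for this example; Fedify does not currently expose such an API:

```typescript
// Stand-in for Fedify's Activity type (an assumption for this sketch).
interface Activity { id: string; type: string }

type Direction = "inbound" | "outbound";
type ActivityListener = (activity: Activity) => void;

// Hypothetical stand-in for a federation object that emits lifecycle events.
class ObservableFederation {
  private listeners = new Map<Direction, ActivityListener[]>();

  on(direction: Direction, listener: ActivityListener): void {
    const list = this.listeners.get(direction) ?? [];
    list.push(listener);
    this.listeners.set(direction, list);
  }

  // Would be called internally wherever activities are received or sent.
  emit(direction: Direction, activity: Activity): void {
    for (const listener of this.listeners.get(direction) ?? []) {
      listener(activity);
    }
  }
}

class DebugObserver {
  readonly captured: { direction: Direction; activity: Activity; timestamp: Date }[] = [];

  // The observer subscribes itself; the federation needs no knowledge of it.
  observe(federation: ObservableFederation): void {
    for (const direction of ["inbound", "outbound"] as const) {
      federation.on(direction, (activity) => {
        this.captured.push({ direction, activity, timestamp: new Date() });
      });
    }
  }
}
```

With this shape, createFederation would not need an observers option at all; the trade-off is that the federation object has to expose a subscription method.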

ThisIsMissEm avatar Jul 23 '25 21:07 ThisIsMissEm

I'll handle this in #319

notJoon avatar Jul 26 '25 08:07 notJoon

Alternative approach: leverage OpenTelemetry infrastructure

I've been thinking about this proposal further, and I realized that Fedify already has extensive OpenTelemetry instrumentation throughout the codebase. This makes me wonder if we could leverage OpenTelemetry spans and events instead of introducing a new FederationObserver interface.

Current OpenTelemetry coverage

Looking at the current code, key federation operations are already instrumented with OpenTelemetry spans. The inbox handler creates an activitypub.inbox span (handler.ts:577), and we're already recording activity IDs, types, and recipients as span attributes (handler.ts:795-797). Similar instrumentation exists for outbound delivery in send.ts and collection handling with item counts.

For example, the current inbox handling already does this:

tracer.startActiveSpan("activitypub.inbox", async (span) => {
  span.setAttribute("activitypub.activity.id", activity.id.href);
  span.setAttribute("activitypub.activity.type", getTypeId(activity).href);
  span.setAttribute("fedify.inbox.recipient", recipient);
  // ...
});

Alternative approach: OpenTelemetry-based observability

Instead of creating a new observer interface, we could enhance the existing OpenTelemetry instrumentation in two ways.

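First, we could add richer data to existing spans using span events. Attribute values, on spans and span events alike, are limited to primitives and arrays of primitives, but a span event gives us a named, timestamped record to which a payload serialized as a JSON string can be attached. We could record full activity payloads and verification results like this: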

// In handleInbox
span.addEvent("activitypub.activity.received", {
  "activitypub.activity.json": JSON.stringify(activity),
  "activitypub.activity.verified": verified,
  "http_signatures.verified": signatureVerified,
  "http_signatures.key_id": keyId?.href
});

// In sendActivity  
span.addEvent("activitypub.activity.sent", {
  "activitypub.activity.json": JSON.stringify(activity),
  "activitypub.inboxes": recipients.length,
  "activitypub.inbox.url": inbox.href
});

Second, we could add instrumentation to areas that currently lack it, such as WebFinger lookups, object fetching via lookupObject, HTTP signature verification details, and document loader operations. For instance:

// In vocab/lookup.ts
export async function lookupObject(url: URL, options) {
  return tracer.startActiveSpan("activitypub.lookup.object", async (span) => {
    span.setAttribute("activitypub.object.url", url.href);
    try {
      const object = await /* ... */;
      span.addEvent("activitypub.object.fetched", {
        "activitypub.object.type": getTypeId(object).href,
        "activitypub.object.json": JSON.stringify(object)
      });
      return object;
    } catch (error) {
      span.recordException(error as Error);
      throw error;
    } finally {
      // Spans created with startActiveSpan() must be ended explicitly
      span.end();
    }
  });
}

Debug dashboard implementation

For the debug dashboard use case from PR #319, we could implement @fedify/debugger as a custom SpanExporter. This exporter would receive all the spans generated by Fedify's federation code and extract ActivityPub-specific information from them:

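import type { SpanExporter, ReadableSpan } from '@opentelemetry/sdk-trace-base';
import { ExportResultCode, type ExportResult } from '@opentelemetry/core';

export class FedifyDebugExporter implements SpanExporter {
  private activityStore = new ActivityStore();

  // `path` is where the debug UI would be served, e.g. '/__debugger__'
  constructor(private options: { path?: string } = {}) {}

  export(spans: ReadableSpan[], resultCallback: (result: ExportResult) => void): void {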
    for (const span of spans) {
      if (span.name === 'activitypub.inbox' || span.name === 'activitypub.outbox') {
        const activityEvent = span.events.find(e => 
          e.name === 'activitypub.activity.received' || e.name === 'activitypub.activity.sent'
        );
        
        if (activityEvent) {
          this.activityStore.add({
            direction: span.name.includes('inbox') ? 'inbound' : 'outbound',
            activity: JSON.parse(activityEvent.attributes?.['activitypub.activity.json'] as string),
            timestamp: new Date(span.startTime[0] * 1000 + span.startTime[1] / 1000000),
            verified: activityEvent.attributes?.['activitypub.activity.verified'] as boolean,
          });
        }
      }
    }
    resultCallback({ code: ExportResultCode.SUCCESS });
  }
  
  async forceFlush(): Promise<void> {
    // Flush any pending data
  }
  
  async shutdown(): Promise<void> {
    // Clean up resources
  }
  
  handleRequest(request: Request): Response {
    // Serve debug UI with WebSocket updates
  }
}
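
The timestamp arithmetic in export() relies on OpenTelemetry's HrTime representation, a [seconds, nanoseconds] pair. A small standalone sketch of that conversion, where the helper name hrTimeToDate is made up for this example (the @opentelemetry/core package also ships an hrTimeToMilliseconds utility that could serve the same purpose):

```typescript
// OpenTelemetry represents span times as HrTime: [seconds, nanoseconds].
type HrTime = [number, number];

// Converts an HrTime pair to a Date. Sub-millisecond precision is dropped,
// since Date only has millisecond resolution.
function hrTimeToDate([seconds, nanos]: HrTime): Date {
  return new Date(seconds * 1_000 + Math.floor(nanos / 1_000_000));
}

// 1,700,000,000 s plus 500,000,000 ns (half a second) after the epoch.
const example = hrTimeToDate([1_700_000_000, 500_000_000]);
console.log(example.getTime()); // 1700000000500
```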

Integration would be straightforward since createFederation already accepts a tracerProvider option:

import { createFederation, MemoryKvStore } from '@fedify/fedify';
import { FedifyDebugExporter } from '@fedify/debugger';
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';

const debugExporter = new FedifyDebugExporter({ path: '/__debugger__' });
// Span processors are passed to the constructor; addSpanProcessor() was
// removed in version 2.0 of the OpenTelemetry JS SDK.
const tracerProvider = new NodeTracerProvider({
  spanProcessors: [new SimpleSpanProcessor(debugExporter)],
});

const federation = createFederation({
  kv: new MemoryKvStore(),
  tracerProvider,
});

Benefits of this approach

This approach has several advantages over introducing a new observer interface. Most of the infrastructure already exists: by my rough estimate, about 90% of what we need is already in place, so we'd be extending existing code rather than creating something new. OpenTelemetry is a CNCF standard with a massive ecosystem, meaning this works with existing tools like Jaeger, Zipkin, Grafana Tempo, and Datadog without Fedify-specific integration work.

The OpenTelemetry model already solves many of the design questions that would come up with a custom observer interface. Error handling is built-in through span.recordException(), execution semantics are well-defined by the OpenTelemetry specification, async operations are natively supported, and performance concerns are addressed through built-in sampling and batching. Multiple observers are handled through multiple SpanProcessors, and context propagation for distributed tracing comes for free.

From an API design perspective, we don't need to introduce any new interfaces to the core @fedify/fedify package. The tracerProvider option already exists, and everything else happens through standard OpenTelemetry APIs.

Addressing other use cases

Beyond the debug dashboard, this approach naturally extends to other observability scenarios. For logging, metrics, and analytics, developers can use standard OpenTelemetry exporters like OTLP or Prometheus exporters without writing custom code. Security auditing can be implemented by exporting to a SIEM system via an OpenTelemetry collector, with full context about signatures and verification results available in the spans.

Future extensibility is straightforward: we just add spans to new federation operations as we develop them, and the community can build custom exporters for their specific needs.
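
To give a feel for how little a custom consumer would need, here is a minimal sketch that tallies activity types from exported spans. SpanLike is a simplified stand-in for OpenTelemetry's ReadableSpan and tallyActivityTypes is a made-up helper; only the activitypub.activity.type attribute name comes from the existing instrumentation shown earlier:

```typescript
// Simplified stand-in for ReadableSpan (an assumption for this sketch):
// only the fields the tally needs.
interface SpanLike {
  name: string;
  attributes: Record<string, string | number | boolean | undefined>;
}

// Counts spans per ActivityPub activity type, skipping spans that do not
// carry the "activitypub.activity.type" attribute.
function tallyActivityTypes(spans: readonly SpanLike[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const span of spans) {
    const type = span.attributes["activitypub.activity.type"];
    if (typeof type !== "string") continue;
    counts.set(type, (counts.get(type) ?? 0) + 1);
  }
  return counts;
}
```

A real exporter would call something like this from its export() method and publish the counts wherever they are needed.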

Potential concerns

One concern might be that attributes only support primitive values, but we can work around this by serializing complex data like activity payloads to JSON strings and attaching them via span.addEvent(). Another concern might be that we need an ActivityPub-specific UI rather than generic tracing visualizations, but that's exactly what FedifyDebugExporter provides by transforming spans into an ActivityPub-focused dashboard.

Some instrumentation is currently missing for operations like WebFinger and object lookups, but we should probably add this anyway for general observability purposes, regardless of the debug dashboard.

Recommendation

I'm now leaning toward enhancing the OpenTelemetry instrumentation as the primary observability mechanism and implementing @fedify/debugger as a SpanExporter to achieve the debug dashboard goals. This would let us skip the FederationObserver interface entirely unless we discover something that OpenTelemetry can't handle, which seems unlikely.

This gives us a more powerful, standards-based solution with less code to maintain and better ecosystem integration. Does this make sense, or am I overlooking something important that would still make FederationObserver necessary?

dahlia avatar Nov 22 '25 12:11 dahlia