Interoperability smoke test suite
Summary
As a server framework, Fedify's core value lies in its ability to correctly interoperate with other ActivityPub implementations in the Fediverse. Currently, we rely on unit tests and manual testing, but we lack an automated, systematic way to verify E2E interoperability, similar to Node.js's CITGM (canary in the gold mine).
This issue proposes creating a new CI workflow dedicated to running smoke tests against live instances of major ActivityPub servers (e.g., Mastodon, Misskey) to ensure our federation logic is robust and compatible.
Proposed solution
The plan involves three main components:
- CI workflow: A GitHub Actions workflow that uses `docker-compose` to spin up services for:
  - One or more target ActivityPub servers (e.g., a Mastodon instance).
  - Our “Fedify test harness” application.
- Fedify test harness: Since Fedify is a library, we will create a minimal, lightweight Fedify application within this repository (e.g., under `test/smoke-harness/`). Its sole purpose is to serve as an endpoint for these tests.
- CI orchestrator: The main test script (e.g., a Deno script) that orchestrates the E2E test (a startup sketch follows this list).
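For illustration, bringing the stack up from the orchestrator could be as simple as the sketch below. The compose file path `test/smoke-harness/docker-compose.yaml` is an assumption; `docker compose up --wait` blocks until the services report healthy.

```typescript
// Sketch: start the smoke test stack from the Deno orchestrator.
// The compose file path is an assumption — adjust to wherever it lives.
const compose = new Deno.Command("docker", {
  args: [
    "compose",
    "-f", "test/smoke-harness/docker-compose.yaml",
    "up", "-d", "--wait", // --wait blocks until health checks pass
  ],
});
const { success, stderr } = await compose.output();
if (!success) {
  throw new Error(`failed to start stack: ${new TextDecoder().decode(stderr)}`);
}
```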
Implementation details
This E2E test cannot exercise Fedify in isolation. It must verify that actions are correctly sent, received, and interpreted by both sides.
1. Fedify test harness app
- It will be a minimal `fedify` app with basic handlers (e.g., for `Actor`, `Inbox`, `Outbox`).
- It will use a simple data store (e.g., in-memory or Deno KV).
- Crucially, it will expose internal “backdoor” APIs for the test orchestrator (e.g., `POST /_test/follow`, `POST /_test/create-note`, `GET /_test/get-latest-inbox-item`); a sketch of such a harness follows.
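As a rough illustration, the harness could be wired up as below. This is a minimal sketch against Fedify's `createFederation` API; exact option and method names should be checked against the Fedify version in use, the `harness` username and `/_test/*` routes are assumptions from this proposal, and real federation would additionally need key pairs for HTTP Signatures.

```typescript
// test/smoke-harness/main.ts (hypothetical path) — minimal sketch, not a full harness.
import { Create, createFederation, MemoryKvStore, Person } from "@fedify/fedify";

const federation = createFederation<void>({
  kv: new MemoryKvStore(), // simple in-memory store; Deno KV would also work
});

// Minimal actor so remote servers can resolve the harness user via WebFinger.
federation.setActorDispatcher("/users/{identifier}", (ctx, identifier) => {
  if (identifier !== "harness") return null;
  return new Person({
    id: ctx.getActorUri(identifier),
    preferredUsername: identifier,
    inbox: ctx.getInboxUri(identifier),
  });
});

// Record every incoming Create so the orchestrator can assert on it later.
const inboxLog: unknown[] = [];
federation
  .setInboxListeners("/users/{identifier}/inbox", "/inbox")
  .on(Create, async (_ctx, create) => {
    inboxLog.push(await create.toJsonLd());
  });

Deno.serve((request) => {
  const url = new URL(request.url);
  // Backdoor endpoint used only by the CI orchestrator.
  if (url.pathname === "/_test/get-latest-inbox-item") {
    return Response.json(inboxLog.at(-1) ?? null);
  }
  // Everything else (actor, inbox, WebFinger) is handled by Fedify.
  return federation.fetch(request, { contextData: undefined });
});
```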
2. CI orchestrator and verification
The orchestrator script will manage the entire test flow by communicating with both our test harness and the target server's API.
Example scenario: Fedify → Mastodon (Create(Note))
- Setup: The orchestrator uses Mastodon's API (`tootctl` or REST) to create a test user (`@mastodon-user`) and get an API token.
- Action: The orchestrator calls our harness's backdoor API: `POST /_test/create-note?content=...`
- Federation: Our harness app, using `fedify`, sends a `Create` activity to `@mastodon-user`'s inbox.
- Verification: The orchestrator uses the Mastodon API token to poll `@mastodon-user`'s home timeline (`GET /api/v1/timelines/home`).
- Assert: The test passes if the new note from our Fedify harness appears on the Mastodon user's timeline within a timeout (see the sketch below).
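Concretely, the action-through-assert steps could look like the following sketch. `HARNESS_URL`, `MASTODON_URL`, and `MASTODON_TOKEN` are hypothetical environment variables the workflow would inject; `GET /api/v1/timelines/home` is Mastodon's actual home timeline endpoint.

```typescript
// Sketch: Fedify → Mastodon verification loop in the Deno orchestrator.
const harness = Deno.env.get("HARNESS_URL")!; // hypothetical env vars
const mastodon = Deno.env.get("MASTODON_URL")!;
const token = Deno.env.get("MASTODON_TOKEN")!;

// 1. Trigger the note through the harness's backdoor API.
const content = `smoke-test ${crypto.randomUUID()}`;
await fetch(
  `${harness}/_test/create-note?content=${encodeURIComponent(content)}`,
  { method: "POST" },
);

// 2. Poll the Mastodon user's home timeline until the note federates or we time out.
const deadline = Date.now() + 60_000;
while (Date.now() < deadline) {
  const response = await fetch(`${mastodon}/api/v1/timelines/home`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  const statuses: { content: string }[] = await response.json();
  if (statuses.some((status) => status.content.includes(content))) {
    console.log("note federated to Mastodon");
    Deno.exit(0);
  }
  await new Promise((resolve) => setTimeout(resolve, 2_000));
}
throw new Error("note did not appear on the Mastodon timeline in time");
```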
Example scenario: Mastodon → Fedify (reply)
- Action: The orchestrator uses the Mastodon API to post a reply to the note from the previous test.
- Federation: The Mastodon server sends a `Create` (reply) activity to our Fedify harness's inbox.
- Verification: The orchestrator calls our harness's backdoor API: `GET /_test/get-latest-inbox-item`.
- Assert: The test passes if the harness returns the reply activity from Mastodon (sketched below).
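The reverse direction reuses the same polling pattern, this time against the harness backdoor; the reply itself would be posted through Mastodon's `POST /api/v1/statuses` endpoint with `in_reply_to_id`. As before, `HARNESS_URL` is a hypothetical environment variable.

```typescript
// Sketch: wait for Mastodon's Create(reply) to reach the harness inbox.
const harness = Deno.env.get("HARNESS_URL")!;

const deadline = Date.now() + 60_000;
let reply: Record<string, unknown> | null = null;
while (Date.now() < deadline && reply === null) {
  const response = await fetch(`${harness}/_test/get-latest-inbox-item`);
  const item = await response.json();
  // A stricter assertion could also match the object's `inReplyTo` against our note.
  if (item !== null && item.type === "Create") reply = item;
  else await new Promise((resolve) => setTimeout(resolve, 2_000));
}
if (reply === null) {
  throw new Error("reply from Mastodon never reached the harness inbox");
}
```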
CI strategy
- These tests will be too long-running and resource-intensive to run on every PR.
- They should be configured to run on pushes to:
  - `main`
  - `next`
  - Maintenance branches (e.g., `*.*-maintenance`)
- We will also add a `workflow_dispatch` trigger to allow them to be run manually on a specific PR branch when necessary (e.g., when federation code is changed).
Phased rollout (target implementations)
We will add implementations gradually.
- Phase 1 (Core microblogging):
  - Mastodon (De-facto standard)
  - Misskey (Major alternative with different characteristics)
- Phase 2 (Major stacks & types):
  - Akkoma/Pleroma (Elixir-based)
  - Pixelfed (Media-focused, `Image`/`Video` objects)
- Phase 3 (Service diversity):
  - PeerTube (Video / `Group` actor for channels)
  - Lemmy / Kbin (Community / `Group` interaction)
  - WriteFreely (`Article` objects)
Acceptance criteria (for this task)
- A CI workflow is created.
- A minimal Fedify test harness app is built within the repo.
- The workflow successfully runs E2E tests (e.g., `Follow`, `Create(Note)`, reply) against Mastodon.
- The workflow is configured to run on pushes to `main`, `next`, and `*.*-maintenance` branches, and on `workflow_dispatch`.
Have you seen https://pasture.funfedi.dev/, via https://nlnet.nl/project/FediverseTestFramework/ ?
@nikclayton Thanks for the pointer! It looks really useful for this—especially the pre-configured Docker containers for Mastodon, Misskey, and other fediverse applications, which would save us significant setup time. The actor verification tools could also help validate our test harness implementation, and the support tables might guide us on which ActivityPub object variations to prioritize testing. I'll definitely look into integrating Pasture's components into our smoke test suite design.
As the author of https://pasture.funfedi.dev/, I have some useful comments and mea culpas:
- The pasture is currently at the state of “works reliably on my machine.” This means I can update the containers https://containers.funfedi.dev/ on a regular basis, and the thing stays working.
- My impression from my fellow developers is that the compose files might not work with certain other setups (Fedora + Podman, Apple machines with ARM chips). I consider these hard issues to solve; I do not have the test hardware or infrastructure lying around to fix them.
- I suggest starting integration tests with Mitra. It has relatively sane error messages, is lightweight to run, and does not require patches.
- I'm happy to cooperate if there are tasks for me to do on the infrastructure.
On the other side of the coin: I've submitted a proposal to NLnet that would take the data from the support tables https://funfedi.dev/support_tables/ and use them to determine a rule set for what a parser from ActivityPub to an application format should support. If this gets done, E2E tests become pretty much obsolete; following this rule set will be enough.
@HelgeKrueger Thanks for the detailed insights and the offer to collaborate! I appreciate the transparency about the current state.
I'm actually running Fedora + Podman myself, so I might encounter some of those setup issues you mentioned; if I do, I'd be happy to contribute fixes back to the project. Starting with Mitra sounds like a great suggestion given its sane error messages and lightweight nature.
Your NLnet proposal for deriving parser rule sets from the support tables sounds really promising; it would indeed make interoperability testing much more systematic. I'll definitely reach out when we get to the implementation phase of this smoke test suite.