Project Proposal: Audit Logging SIG

Open mlenkeit opened this issue 1 year ago • 20 comments

This PR contains a project proposal for an Audit Logging SIG as discussed on Slack.

We are aware that the project proposal still has several TBDs, especially around staffing and timeline, that need to be resolved before the SIG can start working.

We will approach other vendors directly with this proposal to identify additional contributors. Of course, anyone who comes across this proposal here on GitHub is invited to contribute.

While we do have some ideas about a potential timeline for semantic conventions, OTEL SDK/API and collector adjustments respectively, we would like to align this with other contributors first before publishing.

Any feedback from the community on the proposed scope of the SIG is highly appreciated!

Open topics

The following items reference topics from the PR discussion that are still open:

  • immutability, tamper-proof logs, signing - in scope?
    • https://github.com/open-telemetry/community/pull/2409#issuecomment-2438469592
    • https://github.com/open-telemetry/community/pull/2409#issuecomment-2485820262
    • https://github.com/open-telemetry/community/pull/2409#issuecomment-2486409314

mlenkeit avatar Oct 24 '24 10:10 mlenkeit

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: mlenkeit / name: Maximilian Lenkeit (65ae32e7b31d2628f345ad2f56396df0c6f7821a, 5094fb10c91150f7ec56bb3578f27fb59926551d, 3876a315eaad74539bc52110f8a430a985acdb55, 776b821f8aebcffc8ce41e882375dce0a1e6d430, d7e265f5eec18374dc3015ba0dd830d0b72e9601, a5ef343bafcd9141c3f41f10625fb8e336da1cbb, 75f2c579dfc4b897a9eb91fd2e1575c987e62e2c, 087865c139a17be07204a5784279cfe2c7d4e01c, 2ec002d8de4ce7a74759ba80310837523edeb85b, 9337b7f9df3bd9bf3835dbf3ef53f81a0392f1a7, 70cbac40fad98ce67683de7fbf960c12c3beaecb, 711dc46f2cd64bafdfa3982c5ba832c616486ba3, f81c2f44c5e5142169f44d542298189c1f7ccf54, 0adb8e5933af2929e0390668f637089111a318e3, a6b34f14bb4582e9bddd4eeec3212be35569e8aa, 066501b92566ce770eb79f5633d2c1e5166aa9a5, 6bc9a5e1617d515da371aeb5b6a196876b588ccd, 8b38626a6c272862e9c4c3204799b2a5939f9d86, 6dd519d9861a088c816bb8a1d10642555c5a6090, e03dcd8ab270573957684254a7d130786783e8fc, 41329f3dba6f4d2dd14817ecec12733c047201a1, 405ddb540f0628577d6be167eeec28827b8d6681, 86bd77a9f06fa7c990265faa6da2932da6d96890)
  • :white_check_mark: login: hilmarf / name: Hilmar Falkenberg (2f9813f4e0e62f645bd1f62d45203f3442189a0e, 03d1de1d49bec649b889ebca5e463f1b27d5f6f1)

Are there any requirements around signing logs / detecting tampering? I've heard that mentioned before in the context of audit logs, but I don't know how common of a requirement it is

mtwo avatar Oct 25 '24 18:10 mtwo

@mtwo as far as I know, immutability of audit logs is a common requirement, although not all audit logging systems/use cases that I've seen address it with technical measures; some rely on organizational measures instead. However, given the flexibility of OTel processing queues (i.e. different topologies of collectors), having a technical solution in OTel would be favorable.

@reyang what is your opinion on this?

mlenkeit avatar Nov 19 '24 14:11 mlenkeit

@mlenkeit I think this can be achieved as long as OpenTelemetry is designed to allow additive changes; it doesn't have to be there from the start. I personally haven't seen people signing logs, but I've seen lots of cases where an immutable data path is used.

reyang avatar Nov 19 '24 18:11 reyang

I think the key parts are encryption at rest (e.g. when a buffer writes to disk) and encryption in transit. Where it gets tricky is when we have to separate audit log data from meta/transport data. In some cases this might lead to duplication. E.g., the K8s cluster name that triggered the audit event could be part of the audit log message, in which case it must be immutable. But it is also an OTLP attribute and as such could be changed. Or to phrase it differently: is immutability required for the whole signal or just the message?
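
To make the question concrete, here is a minimal sketch, not something OTel provides today. It assumes a hypothetical shared HMAC key and a made-up record layout (`body` plus `attributes`), and compares signing only the message with signing the whole record when an intermediate hop rewrites an attribute. All names in it are illustrative.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only; real key management is out of scope.
SECRET_KEY = b"demo-key-not-for-production"


def sign(payload: dict) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(SECRET_KEY, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(payload: dict, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(payload), signature)


# Assumed record layout: the cluster name travels as an attribute,
# separate from the audit message itself.
record = {
    "body": "user alice deleted secret db-password",
    "attributes": {"k8s.cluster.name": "prod-eu-1"},
}

# Option A: protect only the message. Option B: protect the whole record.
body_only_sig = sign({"body": record["body"]})
whole_record_sig = sign(record)

# Simulate an intermediate hop rewriting the attribute.
record["attributes"]["k8s.cluster.name"] = "prod-us-2"

body_still_valid = verify({"body": record["body"]}, body_only_sig)
whole_still_valid = verify(record, whole_record_sig)
```

With body-only signing the attribute change goes unnoticed, while whole-record signing flags it but also forbids any legitimate enrichment along the path. That trade-off is the one the question above is about.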

renewelches avatar Nov 21 '24 20:11 renewelches

Releasing for review as per @reyang's (offline) suggestion. I'm aware that there are open TBDs that we still need to fill.

mlenkeit avatar Nov 25 '24 17:11 mlenkeit

any updates / progress on this?

svrnm avatar Jan 13 '25 10:01 svrnm

@svrnm we presented the proposal in the spec SIG on Dec 3, 2024 and sparked a lively discussion. Together with @reyang, we decided to collect some more early feedback from different SIGs especially on the aspect of delivery guarantees. Over the next three weeks, I'm presenting the topic in a few language-specific SIGs and in the Collector SIG. I'll share updates here afterwards.

mlenkeit avatar Jan 16 '25 13:01 mlenkeit

Status update:

  1. we've presented the proposal in the Collector, Java and JS SIGs to collect more general feedback; this was rather positive
  2. we are in the process of filling the staffing gaps in the SIG proposal
    • we're trying to fill the open engineering positions from the interested vendors
    • for maintainers/approvers, we'll reach out to community members directly starting with those who engaged in the discussions from 1)

mlenkeit avatar Feb 24 '25 09:02 mlenkeit

We're proposing @hilmarf as the project lead; SIG proposal has been updated accordingly. Still working on getting names for additional engineers and maintainers/approvers.

mlenkeit avatar Mar 27 '25 12:03 mlenkeit

Following the discussions at KubeCon London, we've reconsidered our plan and are now following a phased approach:

  • In Phase 1 (in progress), we are building an end-to-end prototype to refine the challenges and requirements for audit logging in OTel and to showcase potential solutions. This is time-boxed until the end of September 2025. We are set up to run this without a formal OTel project sign-off. We consider this purely a proof of concept, i.e. we don't expect that the OTel modifications from the PoC will be accepted as-is and we are prepared to discard them if necessary.
  • In Phase 2, we intend to contribute functional extensions upstream to OTel. We will work towards getting this project proposal signed off and either join existing SIGs or form a separate one. The results from Phase 1 should help us in the discussions with the maintainers by making our proposed OTel extensions/changes more tangible.
  • In Phase 3, we plan to work on semantic conventions for audit logging.

We have just started Phase 1 with @hilmarf and additional contributors from our side. While there isn't much yet, all our prototype efforts will be available at apeirora/audit-log-poc-for-otel.

Towards the end of Phase 1, we'll reach out to the respective SIGs to demonstrate the results of the prototype and get additional support for Phase 2 and this project proposal.

We'll update the project proposal with more specific deliverables for Phase 2 as we gain more insights during Phase 1.

mlenkeit avatar May 20 '25 14:05 mlenkeit

For those interested, I'm building a new startup in this space, specifically LLM observability for audit and compliance. Read more here: https://traceprompt-web.pages.dev/

paulmbw avatar May 22 '25 15:05 paulmbw

@mlenkeit can you provide an update on this proposal: would you like to proceed with it, and where do things stand on your end?

svrnm avatar Sep 29 '25 09:09 svrnm

Hi @svrnm,

Yes — we would like to proceed.

We created a small proof-of-concept repository: https://github.com/apeirora/audit-log-poc-for-otel containing tests and initial findings.

Status and focus

  • @mlenkeit added an overview of possible deployment topologies (see PR/asset). We plan to focus on Type 6 and Type 7a because they support persistence and help avoid lost logs.
  • Client-side proposals:
    • A named ContextKey to keep optional error callbacks traceable (PR).
    • A LogRecordProcessor that uses an AuditLogStore interface for local persistence until the collector confirms receipt (PR).
  • Receiver-side work:
    • PRs ( https://github.com/apeirora/opentelemetry-collector-contrib/pull/2 and https://github.com/apeirora/opentelemetry-collector/pull/3 ) address persistence on the receiver, which is responsible for confirming delivery to clients.
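
As an illustration of the client-side idea in the list above, here is a minimal sketch of persist-until-acknowledged delivery. It does not use the OTel SDK and is not the code from the linked PRs: the `AuditLogStore` methods, the `PersistingProcessor` class, and the boolean "acknowledged" return value of the export callback are all assumptions made for this example.

```python
from typing import Callable, Dict, Protocol


class AuditLogStore(Protocol):
    """Assumed shape of a local store; the real interface may differ."""

    def put(self, record_id: str, record: dict) -> None: ...
    def pending(self) -> Dict[str, dict]: ...
    def remove(self, record_id: str) -> None: ...


class InMemoryAuditLogStore:
    """Stand-in for a durable store (e.g. a file or database)."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def put(self, record_id: str, record: dict) -> None:
        self._records[record_id] = record

    def pending(self) -> Dict[str, dict]:
        # Return a copy so callers can remove entries while iterating.
        return dict(self._records)

    def remove(self, record_id: str) -> None:
        self._records.pop(record_id, None)


class PersistingProcessor:
    """Stores each record first and deletes it only after a confirmed export."""

    def __init__(self, store: AuditLogStore, export: Callable[[dict], bool]) -> None:
        self._store = store
        self._export = export  # returns True once the receiver confirmed receipt

    def on_emit(self, record_id: str, record: dict) -> None:
        self._store.put(record_id, record)  # persist before any network attempt
        self._try_export(record_id, record)

    def retry_pending(self) -> None:
        for record_id, record in self._store.pending().items():
            self._try_export(record_id, record)

    def _try_export(self, record_id: str, record: dict) -> None:
        try:
            acknowledged = self._export(record)
        except Exception:
            # Treat any export error as "not confirmed" and keep the record.
            acknowledged = False
        if acknowledged:
            self._store.remove(record_id)


# Usage: an exporter that fails on the first call and succeeds afterwards.
calls = []


def flaky_export(record: dict) -> bool:
    calls.append(record)
    return len(calls) > 1


store = InMemoryAuditLogStore()
processor = PersistingProcessor(store, flaky_export)
processor.on_emit("evt-1", {"body": "login failed"})
pending_after_failure = list(store.pending())
processor.retry_pending()
pending_after_retry = list(store.pending())
```

The record stays in the store after the failed attempt and is removed only once the export callback reports success, so a crash or outage between the two steps cannot silently drop it. A retry can deliver the same record twice, so the receiver would need to tolerate duplicates.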

We’re still iterating and would like to kick-start a SIG. Contributions, feedback, and reviews are very welcome — please comment on the linked PRs or open issues/PRs in the repository.

Thanks!

hilmarf avatar Sep 29 '25 11:09 hilmarf

Thanks for sharing, @hilmarf, I'll need some time to take a closer look!

svrnm avatar Oct 07 '25 11:10 svrnm

Hi @mlenkeit, @hilmarf,

at the last two meetings of the Governance Committee (recording 1, recording 2) we talked about open project proposals and how they align with the current goals and priorities of the project. While "audit logging" is in scope of our project, we are currently not able to accept this proposal and the formation of a new SIG, which would require resources to be assigned from our end as well. We are currently trying to focus on wrapping up and stabilizing current commitments before accepting new projects. We are working on a blog post about this; you can read the draft here: https://github.com/open-telemetry/opentelemetry.io/pull/8208

From our point of view, the best way forward is for you to run this project standalone, in close contact with us, and to come back and join the efforts once things have evolved.

We hope you understand where we are coming from and apologies for saying "no" to your valuable idea. We will keep this issue open for a little longer to answer any questions you might have.

Thanks, Severin

svrnm avatar Oct 30 '25 14:10 svrnm

Hi @svrnm,

are you aware of another workstream which focuses more on the reliable delivery?

We've noticed the work around the 'new exporter helper', and it seems that others are also interested in guaranteed delivery of logs (for auditing) and metrics (for billing).

See: https://github.com/open-telemetry/opentelemetry-collector/issues/8122#issuecomment-3474821727

Maybe we can join forces there?

Thanks, Hilmar

hilmarf avatar Nov 03 '25 13:11 hilmarf

First of all, apologies for the delay, I was on PTO and then at KubeCon last week.

are you aware of another workstream which focuses more on the reliable delivery?

Not to my personal knowledge, but if there is some work being done in the collector SIG, @open-telemetry/collector-approvers would know best.

Not creating the SIG right now does not mean that there is no way to accomplish certain goals and requirements within the project that satisfy (some of) your needs. If there are workstreams in the collector that support audit logging requirements, then yes, joining forces is a great starting point!

svrnm avatar Nov 18 '25 07:11 svrnm

@hilmarf Speaking as a member of the Collector SIG, we would be happy to join forces with you on this front, especially now. I see you upvoted https://github.com/open-telemetry/opentelemetry-collector/issues/8122#issuecomment-3474821727; it would be interesting to discuss more about how you think we can address this.

I am happy to chat more via DM and we can prepare something that you can present to the wider Collector SIG in one of our SIG meetings.

mx-psi avatar Nov 20 '25 11:11 mx-psi

@hilmarf @mlenkeit did you have a chance to chat with @mx-psi and the Collector SIG? I think getting the audit work started there is a great way of gaining some traction, since from what I understand a lot of the required work needs to be done in the collector.

svrnm avatar Dec 04 '25 10:12 svrnm