
Extract the framing functionality to its own package

Open luxas opened this issue 4 years ago • 0 comments

This is the first step in essentially making a "v2" rewrite of libgitops.

Progress Plan

The rewrite starts from the elementary operation of reading/writing YAML/JSON frames, which is needed everywhere else. After this, I'll take a stab at making the serializer package more robust, observable, and unit-tested; in short, more production-grade. Once the serializer is production-grade, I'll move on to making the storage system implement the controller-runtime Client interface, so that we can have an object-oriented approach when reading/writing objects to Git and other backends. The serializer and storage improvements are already prototyped in #46, and will be ported from there in their own PRs, so that they are easier to review.

Why make/have a framing library?

Every time someone interacts with a Kubernetes configuration object, they are touching either JSON or YAML. If that configuration is declarative, it is stored somewhere in encoded form. If that storage is Git, for GitOps purposes, the configuration is stored in files exactly as the user wishes. The user can put many YAML objects in the same file by separating them into different YAML documents, or keep only one object per file; the formatting can be unusual; and so on. Parsing all this by hand is brittle and too complex to do at each call site where this functionality is needed.

JSON is a self-framing format, which means that no extra separator like --- is needed, but it is not straightforward for a reader of the file stream to know where a JSON object ends. Other than declarative configuration, one place where it is common to output JSON objects one after another is logging, where the stream looks like { ... }{ ... }{ ... }. A machine reading that log needs some way to frame the individual objects.

Solution

At a high level, the frame package contains a Reader and a Writer, which perform framing in the read and write direction, respectively. They provide an easy-to-use interface for interacting with the underlying io.Reader or io.Writer stream (which can be a file, an HTTP request, a log, etc.).

The content package provides a means to automatically trace read/write requests and carry metadata about the source or the target byte stream.

The tracing package has higher-level utilities for tracing the frame/sanitation/content streams, or the library more generally.

The sanitize package allows for keeping YAML comments automatically, formatting JSON/YAML prettily, keeping sequence indentation as-is, and so on, with the goal of minimizing the textual diff when writing back to Git.

Features and Design Goals

  • Crucial production-grade features such as limiting the byte count and the number of frames are included out of the box.
  • Thread-safe access to the underlying resource is built in.
  • Frames are automatically sanitized, and empty frames are skipped.
  • Traces are automatically created when reading/writing a frame and when closing the stream, and are exported using OpenTelemetry to e.g. Jaeger. Errors during read/write/close, along with some metadata, are automatically propagated to Jaeger as well.
  • context.Context is used everywhere.
  • Using the functional Option pattern one can add more options and features over time without breaking the API surface.
  • If the maximum number of frames is just one, so that no framing is needed, any content type (think HCL, TOML or similar) is supported. In that case, all of the other features listed here are included "for free".
  • Sanitation: Comments can be copied over automatically, YAML/JSON can be pretty-printed in a standard way, and YAML indentation of lists can be automatically kept or force-formatted.

This PR also contains pkg/tracing, which contains some higher-level tools for working with OpenTelemetry. For example, FuncTracer allows a function that is passed in to be automatically instrumented with a trace span, and then have its error reported to Jaeger or another backend automatically. There is also a constructor of a TracerProvider that follows the builder pattern, so that setting up tracing for your application is as seamless as possible. When this PR is merged, I'll also make a converter shim which automatically converts traces into logs to stdout or some other sink, without the app developer having to write any extra logging code.

luxas, Jul 07 '21 15:07