
Curator -- a build and package automation tool

=================================================
``curator`` -- Artifact and Repository Management
=================================================

Overview
--------

Curator is a tool that we use at MongoDB to generate our package repositories (e.g. repo.mongodb.org and repo.mongodb.com). Additionally, curator provides tooling to support the automated publication and management of build artifacts as part of our continuous integration system.

Components
----------

Please refer to the ``--help`` output for curator and its sub-commands. The following sections provide overviews of these components and their use.

S3 Tools
~~~~~~~~

Curator includes a basic set of AWS S3 operations on the
command line. Although the get, put, and delete operations are fairly
basic, the ``sync-to`` and ``sync-from`` operations provide parallel
directory tree sync operations between S3 buckets and the local file
system. These operations compare objects using MD5 checksums, perform
multi-threaded file uploads, and retry failed operations with
exponential backoff, for efficient and robust transfers.
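
The checksum comparison and retry behavior described above can be sketched as follows. This is a minimal illustration in Python, not curator's actual implementation; the function names, the default attempt count, and the base delay are all hypothetical.

```python
import hashlib
import time


def md5_hex(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(local_path, remote_md5):
    """Transfer a file only when its checksum differs from the remote one."""
    return md5_hex(local_path) != remote_md5


def retry_with_backoff(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(); after the n-th failure wait base_delay * 2**n seconds.

    The attempt count and base delay are assumed values for illustration.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```

Skipping files whose checksums already match keeps repeated syncs cheap, and doubling the delay between attempts gives a transient failure time to clear before the next try.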

Repobuilder
~~~~~~~~~~~

The repobuilder is a helper package that submits Barque jobs to build RPM and DEB package repositories in S3. These jobs sync files from an existing repository, add packages from the local filesystem to the repository, sign packages (for RPM), regenerate package metadata, sign package metadata, generate HTML pages for web-based display, and sync the changed files to the remote repository.

The current implementation of the repobuilder process depends on external repository generation tools (e.g. createrepo and the apt tools). Additionally, the repobuilder currently depends on MongoDB's internal signing service.

Artifacts
~~~~~~~~~

The artifacts functionality uses the release metadata feeds
(e.g. ``https://downloads.mongodb.org/full.json``) to fetch and
extract release build artifacts for local use. It is particularly
useful for fetching artifacts and for maintaining local caches of
MongoDB builds across multiple releases. Set the
``CURATOR_ARTIFACTS_DIRECTORY`` environment variable or pass the
``--path`` option, and then use the ``curator artifacts download``
command to download files.

The ``artifacts`` command also includes two exploration subcommands
for discovering available builds: use ``list-map`` for a list of the
available edition, target, and architecture combinations, and
``list-all`` for a list of available targets and architectures. Both
list operations are specific to a single version.
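
To illustrate the difference between the two listings, the sketch below derives both from a small feed-shaped dictionary. The sample data, the field names (``versions``, ``downloads``, ``edition``, ``target``, ``arch``), and the helper functions are assumptions made for this example; it does not reproduce curator's code or its output format.

```python
# Hypothetical sample shaped like a release metadata feed (assumed layout).
SAMPLE_FEED = {
    "versions": [
        {
            "version": "1.0.0",
            "downloads": [
                {"edition": "base", "target": "linux_x86_64", "arch": "x86_64"},
                {"edition": "enterprise", "target": "ubuntu2204", "arch": "arm64"},
                {"edition": "enterprise", "target": "ubuntu2204", "arch": "x86_64"},
                {"edition": "base", "target": "ubuntu2204", "arch": "x86_64"},
            ],
        }
    ]
}


def downloads_for(feed, version):
    """Return the download entries for exactly one version."""
    for release in feed.get("versions", []):
        if release.get("version") == version:
            return release.get("downloads", [])
    raise KeyError("version not found in feed: " + version)


def list_map(feed, version):
    """Every (edition, target, arch) combination available for a version."""
    return sorted(
        {(d.get("edition", ""), d.get("target", ""), d.get("arch", ""))
         for d in downloads_for(feed, version)}
    )


def list_all(feed, version):
    """Every (target, arch) pair available for a version, ignoring edition."""
    return sorted(
        {(d.get("target", ""), d.get("arch", ""))
         for d in downloads_for(feed, version)}
    )
```

In this sample, ``list_map(SAMPLE_FEED, "1.0.0")`` yields four combinations while ``list_all(SAMPLE_FEED, "1.0.0")`` yields three, because two editions share the same target and architecture.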

Combine the artifacts tool with the prune tool to avoid unbounded
cache growth.

Prune
~~~~~

Prune is based on the `lru <https://github.com/evergreen-ci/lru>`_
library. It takes a file system tree and removes files, oldest first,
until the total size of the files is less than a specified maximum
size. Prune uses modification time for age, in an attempt to have
consistent behavior independent of operating system and file system.

There are two modes of operation: a recursive mode, which removes
objects from the tree recursively but skips directory objects, and a
directory mode, which does not collect objects recursively but tracks
the total size of the contents of each top-level directory, computed
recursively.
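
The recursive mode can be sketched as follows. This is a simplified Python illustration of the idea, not the ``lru`` library's implementation; the ``prune`` function and its parameters are hypothetical, and directory mode is not covered.

```python
import os


def prune(root, max_size):
    """Delete the oldest files under root until their total size is
    less than max_size bytes. Returns the removed paths, oldest first.
    Directories themselves are never removed.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            # Modification time stands in for age.
            entries.append((info.st_mtime, path, info.st_size))

    total = sum(size for _mtime, _path, size in entries)
    removed = []
    for _mtime, path, size in sorted(entries):  # oldest first
        if total < max_size:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

For example, ``prune("/tmp/build-cache", 10 * 1024 ** 3)`` would trim a hypothetical cache directory to under 10 GiB.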

Development
-----------

Design Principles and Goals
  • All operations in the continuous integration environment should be easily reproducible outside of the environment. In practice, curator exists to build repositories inside of Evergreen tasks; however, it is possible to run all stages of the repository building process by hand. The manual abilities are useful and required for publishing package revisions and repairing corrupt repositories.

  • Leverage third-party libraries and tools where possible. For example, the cache pruning and artifact management functionality is entirely derived from third-party repositories maintained separately from curator.

  • Major functionality is implemented and executed in terms of `amboy <https://github.com/mongodb/amboy>`_ jobs and queues. Not only does this provide a framework for task execution, but it also leaves the door open to providing curator functionality as a highly available service with minimal modification.

APIs and Documentation


See the `godoc API documentation <http://godoc.org/github.com/mongodb/curator>`_
for more information about curator interfaces and internals.

Issues
~~~~~~

Please file all issues in the `EVG project
<https://jira.mongodb.org/browse/EVG>`_ in the `MongoDB Jira
<https://jira.mongodb.org/>`_ instance.