running-elasticsearch-fun-profit

A book about running Elasticsearch

WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x

Operating Elasticsearch

for Fun and Profit


Fred de Villamil


Read online


Code of Conduct

  • Behave like normal, friendly, welcoming human beings or get the hell out.
  • Any reference to a non-scientific, unverifiable element is irrelevant.

TOC

  • Getting Started with Elasticsearch

    • Prerequisites
    • Elasticsearch basic concepts
      • REST APIs
      • Open Source
      • Java
      • Distributed
      • Scalable
      • Fault tolerant
    • What's an Elasticsearch cluster?
      • Master node
      • Ingest nodes
      • Data Nodes
      • Tribe Nodes
      • A Minimal, Fault Tolerant Elasticsearch Cluster
    • What's an Elasticsearch index
    • Deploying your first Elasticsearch cluster
      • Deploying Elasticsearch on Debian
      • Deploying Elasticsearch on RHEL / CentOS
    • First step using Elasticsearch
    • Elasticsearch Configuration
    • Elasticsearch Plugins
  • Elasticsearch and the Java Virtual Machine

    • Supported JVM and operating systems / distributions
      • Operating system matrix
      • Java Virtual Machine matrix
    • Memory management
    • Garbage collection
      • Concurrent Mark & Sweep Garbage Collector
      • Garbage First Garbage Collector
  • A few things you need to know about Lucene

    • Lucene segments
    • Lucene deletes and updates
  • Designing the Perfect Elasticsearch Cluster

    • Elasticsearch is elastic, for real
    • Design for failure
    • Hardware
      • CPU
      • Memory
      • Network
      • Storage
    • Software
      • The Linux (or FreeBSD) kernel
      • The Java Virtual Machine
      • The filesystem
    • Designing your indices
      • Sharding
      • Replication
    • Optimising allocation
    • Troubleshooting and scaling
      • CPU
      • Memory
  • Design for Event Logging

    • Design of an event logging infrastructure cluster
      • Throughput: how many events per second (eps) are you going to collect?
      • Retention: how long do you want to keep your data, hot and cold?
      • Size: what is the average size of a collected event?
      • Fault tolerance: can you afford losing your indexed data?
      • Queries
    • Which hardware do I need?
    • How to design my indices?
    • What about some tuning?
  • Operating Daily

    • Elasticsearch most common operations
      • Mass index deletion with pattern
      • Mass optimize, indexes with the most deleted docs first
      • Restart a cluster using rack awareness
      • Optimize your cluster restart
      • Remove data nodes from a cluster the safe way
    • Get useful information about your cluster
      • Nodes information
      • Monitor your search queues
      • Indices information
      • Shard allocation information
      • Recovery information
      • Segments information (can be extremely verbose)
      • Cluster stats
      • Nodes stats
      • Index stats
      • Index mapping
      • Index settings
      • Cluster dynamic settings
      • All the cluster settings (can be extremely verbose)
  • Monitoring Elasticsearch

    • Tools
    • Monitoring at the host level
    • Monitoring at the node level
    • Monitoring at the cluster level
    • Monitoring at the index level
  • How we reindexed 36 billion documents in 5 days within the same Elasticsearch cluster

    • The "Blackhole" cluster
    • Elasticsearch configuration
    • Tuning the Java virtual machine
      • Blackhole Initial indexing
    • Blackhole initial migration
    • Blackhole reindexing
      • The reindexing process
      • Logstash configuration
      • Reindexing Elasticsearch configuration
      • Introducing Yoko and Moulinette
    • Reindexing in 5 days
    • Conclusion
  • Use Case: Migrating a Cluster Across the Ocean Without Downtime

  • Use Case: An Advanced Elasticsearch Architecture for High-volume Reindexing

    • A glimpse at our infrastructure
    • Using Elasticsearch for fun and profit
    • Conclusion
  • Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours with 0 Downtime and a Rollback Strategy

    • Elasticsearch @Synthesio, November 2017
    • The Blackhole Cluster
    • Migration Strategies: Cluster restart VS Reindex API VS Logstash VS the Fun Way
      • The Cluster Restart Strategy
      • The Reindex API Strategy
      • The Logstash Strategy
      • The Fun Way
    • Migrating Blackhole for Real
      • Expanding Blackhole
      • Splitting Blackhole in 2
    • Conclusion

Styling

This is the Markdown styling used in this book. If you plan to contribute, please use it.

Chapter title

# This is a chapter title

Chapter part

---

## A chapter part title is preceded by a horizontal line

Chapter subpart

### A level 1 subpart
#### A level 2 subpart

Images

![An image should have an alt text](use/a/relative.link)

Code:

An `inline code block` goes like this

API calls go the curl way

curl -X POST "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query" : {
        "match_all" : {}
    },
    "stats" : ["group1", "group2"]
}
'

YAML code is expanded for better readability

---
some:
  value:
    goes: "like this"

Links

[An internal link](has/a/relative.path)
[An external link](https://has.an.absolute/path)

Lists

Unordered lists:

Only one line break between a paragraph and

* An
* unordered
* list
	* with
	* subitems

Ordered lists:

1. An
2. Ordered
3. List
	1. With
	2. subitems
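
Checking a contribution against the styling

A few of the conventions above can be checked mechanically before sending a contribution. The following is a minimal sketch, not a tool shipped with this book: the function name `check_style` and its messages are hypothetical, and it only covers three rules (a `##` title preceded by `---`, images with an alt text, images with a relative link). It skips fenced code blocks so that a YAML `---` or a shell comment is not mistaken for Markdown.

```python
import re

FENCE = "`" * 3  # marker that opens or closes a fenced code block
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)]*)\)")
ABSOLUTE = re.compile(r"[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def check_style(markdown_text):
    """Return a list of (line_number, message) for the rules checked here."""
    problems = []
    lines = markdown_text.splitlines()
    in_fence = False
    for number, line in enumerate(lines, start=1):
        # Toggle on fence markers and ignore everything inside a fence.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        # Rule: a chapter part title (##) is preceded by a horizontal line.
        if line.startswith("## "):
            previous = [l.strip() for l in lines[:number - 1] if l.strip()]
            if not previous or previous[-1] != "---":
                problems.append((number, "chapter part title not preceded by ---"))
        for alt, target in IMAGE.findall(line):
            # Rule: an image should have an alt text.
            if not alt.strip():
                problems.append((number, "image without alt text"))
            # Rule: an image uses a relative link.
            if ABSOLUTE.match(target):
                problems.append((number, "image link is not relative"))
    return problems
```

Calling `check_style(open("some-chapter.md").read())` on a chapter file (the file name is only an example) returns an empty list when none of the three rules is broken.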