WIP, COVERS ELASTICSEARCH 5.5.x, UPDATING TO ES 6.5.x

Operating Elasticsearch

for Fun and Profit

Fred de Villamil

Read online

Code of Conduct

Behave like normal, friendly, welcoming human beings or get the hell out.
Any reference to a non-scientific, non-verifiable element is irrelevant.

Getting Started with Elasticsearch
- Prerequisites
- Elasticsearch basic concepts
  - REST APIs
  - Open Source
  - Java
  - Distributed
  - Scalable
  - Fault tolerant
- What's an Elasticsearch cluster?
  - Master node
  - Ingest nodes
  - Data Nodes
  - Tribe Nodes
  - A Minimal, Fault Tolerant Elasticsearch Cluster
- What's an Elasticsearch index
- Deploying your first Elasticsearch cluster
  - Deploying Elasticsearch on Debian
  - Deploying Elasticsearch on RHEL / CentOS
- First step using Elasticsearch
- Elasticsearch Configuration
- Elasticsearch Plugins
Elasticsearch and the Java Virtual Machine
- Supported JVM and operating systems / distributions
  - Operating system matrix
  - Java Virtual Machine matrix
- Memory management
- Garbage collection
  - Concurrent Mark & Sweep Garbage Collector
  - Garbage First Garbage Collector
A few things you need to know about Lucene
- Lucene segments
- Lucene deletes and updates
Designing the Perfect Elasticsearch Cluster
- Elasticsearch is elastic, for real
- Design for failure
- Hardware
  - CPU
  - Memory
  - Network
  - Storage
- Software
  - The Linux (or FreeBSD) kernel
  - The Java Virtual Machine
  - The filesystem
- Designing your indices
  - Sharding
  - Replication
- Optimising allocation
- Troubleshooting and scaling
  - CPU
  - Memory
Design for Event Logging
- Design of an event logging infrastructure cluster
  - Throughput: how many events per second (eps) are you going to collect?
  - Retention: how long do you want to keep your data, hot and cold?
  - Size: what is the average size of a collected event?
  - Fault tolerance: can you afford losing your indexed data?
  - Queries
- Which hardware do I need?
- How to design my indices?
- What about some tuning?
Operating Daily
- Elasticsearch most common operations
  - Mass index deletion with pattern
  - Mass optimize, indexes with the most deleted docs first
  - Restart a cluster using rack awareness
  - Optimize your cluster restart
  - Remove data nodes from a cluster the safe way
- Get useful information about your cluster
  - Nodes information
  - Monitor your search queues
  - Indices information
  - Shard allocation information
  - Recovery information
  - Segments information (can be extremely verbose)
  - Cluster stats
  - Nodes stats
  - Index stats
  - Index mapping
  - Index settings
  - Cluster dynamic settings
  - All the cluster settings (can be extremely verbose)
Monitoring Elasticsearch
- Tools
- Monitoring at the host level
- Monitoring at the node level
- Monitoring at the cluster level
- Monitoring at the index level
How we reindexed 36 billion documents in 5 days within the same Elasticsearch cluster
- The "Blackhole" cluster
- Elasticsearch configuration
- Tuning the Java virtual machine
  - Blackhole Initial indexing
- Blackhole initial migration
- Blackhole reindexing
  - The reindexing process
  - Logstash configuration
  - Reindexing Elasticsearch configuration
  - Introducing Yoko and Moulinette
- Reindexing in 5 days
- Conclusion
Use Case: Migrating a Cluster Across the Ocean Without Downtime
Use Case: An Advanced Elasticsearch Architecture for High-volume Reindexing
- A glimpse at our infrastructure
- Using Elasticsearch for fun and profit
- Conclusion
Migrating a 130TB Cluster from Elasticsearch 2 to 5 in 20 Hours with 0 Downtime and a Rollback Strategy
- Elasticsearch @Synthesio, November 2017
- The Blackhole Cluster
- Migration Strategies: Cluster restart VS Reindex API VS Logstash VS the Fun Way
  - The Cluster Restart Strategy
  - The Reindex API Strategy
  - The Logstash Strategy
  - The Fun Way
- Migrating Blackhole for Real
  - Expanding Blackhole
  - Splitting Blackhole in 2
- Conclusion

Styling

This is the Markdown styling used in this book. If you plan to contribute, please use it.

Chapter title

# This is a chapter title

Chapter part

---

## A chapter part title is preceded by a horizontal line

Chapter subpart

### A level 1 subpart
#### A level 2 subpart

Images

![An image should have an alt text](use/a/relative.link)

Code:

An `inline code block` goes like this

API calls go the Curl way

curl -X POST "localhost:9200/_search" -H 'Content-Type: application/json' -d'
{
    "query" : {
        "match_all" : {}
    },
    "stats" : ["group1", "group2"]
}
'

YAML code is expanded for better readability

---
some:
  value:
    goes: "like this"

Links

[An internal link](has/a/relative.path)
[An external link](https://has.an.absolute/path)

Lists

Unordered lists:

Only one line break between a paragraph and

* An
* unordered
* list
	* with
	* subitems

Ordered lists:

1. An
2. Ordered
3. List
	1. With
	2. subitems
