[Meta] Logstash to Logstash [Bootstrap]
User Needs:
- As a user with distributed Logstash needs, I want a supported, straightforward, and stable way to send events from one Logstash pipeline to another Logstash pipeline that exists in either a separate process on the same host or in a different data center, optionally secured with TLS/SSL.
- As a maintainer of Logstash-to-Logstash pipelines, I do not want to be concerned with how the events are emitted, transmitted, or encoded, so long as the upstream Logstash has a TCP/HTTP route to the downstream Logstash.
See also: https://github.com/elastic/logstash/issues/14477
Task: API Boundary Bootstrap
Introduce a Logstash "integration" plugin that contains an output plugin and an input plugin whose minimal configuration allows transmission of events from one Logstash pipeline to another using a `host`, a `port`, and optional standard SSL settings.
For example, transmitting events over plaintext would be as simple as defining an output in the upstream pipeline targeting a `host` and `port`:
output { logstash { host => "ls.mycorpnet.internal" port => "9820" } }
-- upstream pipeline on `ls.mycorpnet.external`
And an accompanying input in the downstream pipeline that binds to the same port:
input { logstash { port => "9820" } }
-- downstream pipeline on `ls.mycorpnet.internal`
Details: Security
SSL/TLS communications must be configured using standardized SSL settings.
Details: Open-to-extend
The configuration of these plugins is intended to be an API boundary so that:
- pipeline authors can use them without needing to consider the implementation details of protocol and/or encoding
- plugin/logstash maintainers can evolve the implementation details without impacting in-the-wild configurations, including adding protocols and/or encoding mechanisms, along with negotiation to ensure the "best available" protocol and/or encoding is used when the sending and receiving Logstash pipelines have differing versions of the plugin.
In the initial release of this plugin set, batches of events should be transmitted using newline-delimited JSON over HTTP(s) using `POST /events` requests, but these are internal implementation details. These details are not exposed to the user via plugin configuration options.
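As an illustration of the wire format described above, here is a minimal Python sketch of how a batch could be encoded as newline-delimited JSON and wrapped in a `POST /events` request. It is not the plugin's implementation: the function names `encode_batch` and `build_request` are hypothetical, the trailing newline after the last event is an assumption, and the host and port are borrowed from the example configuration.

```python
import json
import urllib.request


def encode_batch(events):
    """Serialize a list of event dicts as NDJSON: one compact JSON object per line."""
    lines = (json.dumps(event, separators=(",", ":")) + "\n" for event in events)
    return "".join(lines).encode("utf-8")


def build_request(host, port, events, scheme="http"):
    """Build (but do not send) a POST /events request carrying the encoded batch."""
    return urllib.request.Request(
        url=f"{scheme}://{host}:{port}/events",
        data=encode_batch(events),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# Hypothetical two-event batch aimed at the downstream host from the example.
batch = [{"message": "hello"}, {"message": "world"}]
request = build_request("ls.mycorpnet.internal", 9820, batch)
```

Because the encoding lives entirely behind the `host`/`port` configuration, a later release could swap it for another format without touching pipeline definitions.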
The behavior of the initial release of a `logstash` input plugin is explicitly undefined when receiving bare TCP or TLS TCP connections, when receiving HTTP(s) requests with verbs other than `POST` or to paths other than `/events`, or with content-types other than `application/x-ndjson`.
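To make the boundary of defined behavior concrete, the following Python sketch checks whether an HTTP request falls inside the single shape that the initial release defines. The function name `is_defined_request` is hypothetical, and ignoring media-type parameters such as `; charset=utf-8` is an assumption of this sketch rather than something the issue specifies.

```python
def is_defined_request(method, path, content_type):
    """Return True only for POST /events with an application/x-ndjson body."""
    # Assumption: parameters after ";" do not change the media type check.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return (
        method == "POST"
        and path == "/events"
        and media_type == "application/x-ndjson"
    )
```

Anything for which this returns `False` (and any connection that is not HTTP at all) is outside what the initial release promises to handle.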
Bootstrap Requirements
- an output supports a single `host`/`port` pair
- http(s) with standard-form SSL identity and trust configuration (with workable subset)
- events sent via http(s) using `POST /events` and a `Content-Type: application/x-ndjson` header
- events received have zero implicit enrichment from the input plugin
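The "zero implicit enrichment" requirement can be pictured from the receiving side: the events that come out of the request body are exactly the events that went in. Below is a minimal Python sketch, assuming a UTF-8 NDJSON body; `decode_batch` is a hypothetical name and not part of the plugin.

```python
import json


def decode_batch(body):
    """Parse an NDJSON request body into event dicts, adding nothing to them."""
    text = body.decode("utf-8")
    # Split on "\n" only; blank lines (such as a trailing newline) are skipped.
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

Note that the function attaches no host, timestamp, or header fields of its own; each decoded event carries only what the sender serialized.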
Initial Extensions
- Startup Safety: an output that cannot connect to a downstream input should prevent the pipeline from starting (possibly after some reasonable delay)
- Shutdown Safety: an output that is being shut down and is unable to deliver an in-flight batch in a timely manner should NACK the batch so that pipelines with a persistent queue (PQ) can re-attempt delivery.
- Load-balancing: an output should be configurable with `hosts => ["12.34.56.78:1234", "12.34.56.79:1234"]` or `hosts => ["12.34.56.78", "12.34.56.79"] port => 1234` (alternatives to `host => "12.34.56.78" port => 1234`) and should distribute events to the downstream hosts (naive whole-batch round-robin is ok at first, but then we could make it better)
- SSL: fill gaps in identity and trust configuration.
- Metadata: add support for propagating `@metadata`
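The load-balancing item above accepts two spellings of the same target list and asks for naive whole-batch round-robin. A minimal Python sketch of both ideas follows. The names `normalize_hosts` and `assign_batches` are hypothetical, IPv6 literals are not handled, and raising an error when an entry has no port is an assumption of this sketch.

```python
import itertools


def normalize_hosts(hosts, port=None):
    """Turn host entries into (host, port) pairs, using `port` when an entry has none."""
    targets = []
    for entry in hosts:
        host, sep, entry_port = entry.partition(":")
        if sep:
            targets.append((host, int(entry_port)))
        elif port is not None:
            targets.append((host, int(port)))
        else:
            raise ValueError(f"no port given for {entry!r}")
    return targets


def assign_batches(batches, targets):
    """Pair each whole batch with the next target in round-robin order."""
    return list(zip(batches, itertools.cycle(targets)))
```

Both `normalize_hosts(["12.34.56.78:1234", "12.34.56.79:1234"])` and `normalize_hosts(["12.34.56.78", "12.34.56.79"], port=1234)` yield the same two targets, and `assign_batches` then alternates whole batches between them without splitting any batch.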
Implementation Plan
### Bootstrap
- [x] https://github.com/logstash-plugins/logstash-integration-logstash/pull/2
- [ ] https://github.com/logstash-plugins/logstash-integration-logstash/issues/6
- [ ] https://github.com/logstash-plugins/logstash-integration-logstash/issues/4
- [ ] https://github.com/logstash-plugins/logstash-integration-logstash/issues/12
- [ ] https://github.com/logstash-plugins/logstash-integration-logstash/issues/1
- [ ] https://github.com/elastic/logstash/issues/15170
- [x] Release the `logstash-integration-logstash` plugin
- [ ] https://github.com/elastic/logstash/issues/15335
### Advanced Features
- [ ] https://github.com/logstash-plugins/logstash-integration-logstash/issues/11
- [ ] https://github.com/logstash-plugins/logstash-integration-logstash/issues/5
- [ ] https://github.com/elastic/logstash/issues/15498
### Testing
- [ ] https://github.com/elastic/logstash/issues/15496
- [ ] https://github.com/elastic/ingest-dev/issues/2574
Related docs issues:
- elastic/logstash#15169
- elastic/logstash#15213
A time-to-live (TTL) option may be a useful output setting when the downstream Logstash instances sit behind a load balancer that terminates connections every X minutes. When that happens, it can cause duplicates in Elasticsearch when Logstash retries the terminated request.
Closing this issue as the work is now complete. If you would like to file an enhancement request on the matter, please open a new issue.