# ElasticSearch Sink: Add support for multiple ElasticSearch hosts
### Current Vector Version
vector 0.10.0
### Use-cases
In order to use Vector with ElasticSearch in a high availability (HA) production environment, we would like to be able to specify multiple ElasticSearch hosts.
See example of how Filebeat supports this: https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html#hosts-option
### Attempted Solutions
Currently, it appears to be possible to specify only a single ElasticSearch host, which makes the host being pointed at a single point of failure.
### Proposal
Add the ability to specify multiple ElasticSearch hosts (see Filebeat for an example).
Ref https://github.com/fluent/fluent-bit/issues/1340
@lmanners Thanks for this. Agree we'd like to do this and I've added it to our product backlog.
We could decide on the approach to sending to specific servers: round robin, failover, etc. We should check what Filebeat does here.
This seems related to https://github.com/timberio/vector/issues/4156 . I imagine we'll end up solving this in a general way that could be applied to most sinks.
I think you want to have a look at the sniffing mechanism most Elastic components use: https://www.elastic.co/de/blog/elasticsearch-sniffing-best-practices-what-when-why-how
This is how Filebeat, Fluentd, or Logstash find all the nodes in a cluster and then use round robin to distribute the load.
:+1: Also missing this feature
The absence of this feature prevents the use of Vector in production.
> The absence of this feature prevents the use of Vector in production.

We are currently using a load balancer with multiple Elasticsearch hosts behind it. But it will not survive a complete region outage, including downtime of that particular balancer.
Is this still being considered? I see it keeps being removed from milestones so I'm just wondering.
The milestone switching was due to some old automation we had with Zendesk. It hasn't been prioritized, but we are still considering it (as well as broader client-side load balancing in Vector). Adding a 👍 reaction to the issue helps us track demand.
As @fpytloun mentioned, one possible workaround to this Vector limitation is to use load balancing. What he referred to sounds more like a reverse proxy, which becomes its own SPOF. But if you set up BGP anycasting in a creative way you can get around that. Still pretty cumbersome to have to do that when Elasticsearch is intended to be highly available without complicated multi-tier load balancing setups.
Building on top of what has been said.
## Design
Add a feature for distributing requests between multiple hosts/endpoints.
The following options would be added to the `elasticsearch` sink config:
```toml
# Additional endpoints. Optional.
distribution.endpoints = ["...", "...", ...]

# Delay between trying to reactivate inactive endpoints. Optional, default 5 sec.
distribution.reactivate_delay = 5
```
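For illustration, these options might deserialize into a config struct roughly like the following (a minimal sketch; `DistributionConfig` and its fields are hypothetical names, not Vector's actual config types):

```rust
use serde::Deserialize;

/// Hypothetical representation of the proposed `distribution` option set.
#[derive(Debug, Deserialize)]
struct DistributionConfig {
    /// Additional endpoints to distribute requests across.
    #[serde(default)]
    endpoints: Vec<String>,
    /// Delay, in seconds, between attempts to reactivate inactive endpoints.
    #[serde(default = "default_reactivate_delay")]
    reactivate_delay: u64,
}

fn default_reactivate_delay() -> u64 {
    5
}
```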
## Implementation
The basic idea is to add a tower layer which would choose which service/host/endpoint a request is sent to.
Adding this layer would be a step towards #4156, in that it would be applicable to other HTTP-based sinks.
tower has a layer/middleware for this. It uses the Power of Two Random Choices algorithm to spread load across endpoints. This would also be extendable with the sniffing feature. Use of this middleware would require providing a load factor to it.
For this implementation, a simple flat constant would be used as the load factor. This would make the algorithm distribute the load randomly but uniformly between active endpoints. It could then be extended to track in-flight requests per endpoint and use that as the load factor, or further enhanced by reusing statistics from ARC, which would add more active load balancing to this sink.
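To illustrate what a flat load factor means in tower terms, here is a minimal sketch (assuming tower's `Service` and `load::Load` traits; `ConstantLoad` is a hypothetical name, not existing Vector code). Since every endpoint reports the same metric, the Power of Two Random Choices pick degenerates into a uniform random choice between active endpoints:

```rust
use std::task::{Context, Poll};
use tower::load::Load;
use tower::Service;

/// Hypothetical wrapper around a single endpoint's service that reports a
/// constant load to the balancer.
struct ConstantLoad<S>(S);

impl<S> Load for ConstantLoad<S> {
    type Metric = u8;

    fn load(&self) -> Self::Metric {
        // Flat constant: every endpoint looks equally loaded.
        0
    }
}

impl<S, Req> Service<Req> for ConstantLoad<S>
where
    S: Service<Req>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.0.poll_ready(cx)
    }

    fn call(&mut self, req: Req) -> Self::Future {
        self.0.call(req)
    }
}
```

Swapping the constant for an in-flight request count (tower ships a `PendingRequests` load tracker for this) would be the natural next step mentioned above.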
The positioning of this layer significantly determines how it interacts with the rate limit, retry, and ARC layers/features.
For context, this is the tower stack:
- Rate limit
- ARC
- Retry
- Timeout
- Service
### Position above all
The layer would wrap the whole stack in the sink. This would mean:
- Being above the retry layer, it can't retry with a different endpoint by default.
- Being above ARC, the layer can interpret pending services as being temporarily inactive, hence ARC is an indirect source of the load factor.
- Being above the rate limiter, the limit would be applied on a per-endpoint basis.
This would be a local change to the Elasticsearch sink, but much of it would be reusable for other HTTP-based sinks.
An extension of this is to add an additional retry layer on top to retry between different endpoints. This would effectively add a failover feature to this sink, and could be configured by adding retry options to the distribution option set.
Another extension is to add an additional rate limiter on top to apply a limit to the sink as a whole, which seems necessary since without it there is no way to constrain the sink overall. This could be configured by adding rate limit options to the distribution option set.
It would be highly advisable to turn on ARC for this. Maybe we could even turn it on by default.
This placement would also mean that every option outside the distribution option set applies per endpoint, making it clearer what does what. It would also mean that turning distribution on doesn't require changing any other options.
### Alternative - Position below retry layer
The layer would be added immediately below the retry layer. This would mean:
- Being below the retry layer, it can retry between different endpoints.
- Being below the ARC layer allows ARC to scale concurrency up or down as the number of underlying active endpoints changes.
- Being below the rate limiter, the limit is applied on a per-sink basis.
There are issues with this position:
- It's maybe optimistic to expect that ARC, which was built with a single connection in mind, will be able to handle multiple connections disguised as a single one.
- Using ARC as a load factor, or as an indicator of active/inactive state, is not an option.
- This would be a broader change, since it requires extending structures common to other sinks.
- The definition of options such as `request.rate_limit_num` would change depending on whether distribution is on or off.
### Alternative - Custom layer
The use of the tower middleware would force us to adopt the Power of Two Random Choices algorithm and its other design choices. An alternative would be to build a custom layer, similar to that middleware, with a simpler algorithm like round robin.
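A minimal sketch of such a custom layer, assuming tower's `Service` trait (`RoundRobin` is an illustrative name, not existing Vector code):

```rust
use std::task::{Context, Poll};
use tower::Service;

/// Dispatches each request to the next endpoint in round-robin order.
struct RoundRobin<S> {
    endpoints: Vec<S>,
    next: usize,
}

impl<S> RoundRobin<S> {
    fn new(endpoints: Vec<S>) -> Self {
        assert!(!endpoints.is_empty(), "at least one endpoint is required");
        Self { endpoints, next: 0 }
    }
}

impl<S, Req> Service<Req> for RoundRobin<S>
where
    S: Service<Req>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        // Only the endpoint that will receive the next call is polled; a real
        // implementation would also skip endpoints marked inactive.
        self.endpoints[self.next].poll_ready(cx)
    }

    fn call(&mut self, req: Req) -> Self::Future {
        let current = self.next;
        self.next = (self.next + 1) % self.endpoints.len();
        self.endpoints[current].call(req)
    }
}
```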
cc. @jszwedko
Thanks for this write-up, @ktff !
I think the ability for it to retry across Elasticsearch instances will be important (though maybe @lmanners or @sim0nx can weigh in, as users).
Given that, the second alternative makes sense to me. I think ARC would behave similarly to how it would with a single load-balanced endpoint, so this is probably ok (i.e. as if Vector were pointed at a load balancer that balanced across multiple ES instances).
With regards to balancing, using that middleware seems attractive to me. I'd be satisfied with round-robin if we do need to diverge for any reason, though.
I'd like to get @bruceg's thoughts here too. I'll ask him to take a look this week.
Retry across Elasticsearch instances is doable in both cases. ~~In the main case an additional retry layer can be added on top of this, while in the alternative, to reuse the existing retry layer, we would need to change responses/errors in this layer to signal the retry layer that it's ok to retry.~~ EDIT: In both cases new retry logic will be needed.
> I think the ability for it to retry across Elasticsearch instances will be important (though maybe @lmanners or @sim0nx can weigh in, as users).
Yes, this is definitely important.
> I think ARC would behave similarly to how it would with a single load-balanced endpoint, so this is probably ok (i.e. as if Vector were pointed at a load balancer that balanced across multiple ES instances).
That's a good point. In that case I also think the alternative makes more sense.
@bruceg it would be good to get your thoughts on this.
I think this situation demonstrates a problem with the layering model as it stands:
- We want to be able to retry on different hosts, so the distribution layer needs to go below retry.
- We want the distribution layer to have concurrency managed by ARC, so the ARC layer needs to go below distribution.
- We want retries to increase the effective RTT and back-off concurrency, so the retry layer needs to go below ARC.
Obviously, we can't do all three at the same time; the three constraints form a cycle.
I think I agree with @jszwedko's analysis in general. As a layer, this would fit best below retry. ARC is intended to manage concurrency in general, and does considerable averaging of RTTs before using its statistics. If the collection of ES instances has at least similar RTT characteristics, ARC will be able to manage them as though they were one instance. Tweaking the EWMA parameters may help bring more chaotic distributions into coherence as well. If the distribution layer needs RTT statistics, we could possibly split out a shared statistics-gathering layer that could be used by both.
> If the collection of ES instances has at least similar RTT characteristics, ARC will be able to manage them as though they were one instance.
That sounds good enough. I assume that would work out to something like:
- If all ES instances are in the same data center, ARC should work.
- If all ES instances are in the same region, ARC will probably work; some tweaking of EWMA parameters may be needed.
- Otherwise, ARC probably won't work without heavy tweaking, and even then it may not work.
Considering that, we may want to make ARC opt-in when distribution is enabled, or, to avoid the extra magic, add a visible mention of this behavior to the distribution documentation.
Related to splitting out the statistics-gathering layer: if split in the right way, it could perhaps be put below the distribution layer, which would also be a step towards resolving this circular dependency of layers. I'll see during implementation if something like that is doable.
> Related to splitting out the statistics-gathering layer: if split in the right way, it could perhaps be put below the distribution layer, which would also be a step towards resolving this circular dependency of layers. I'll see during implementation if something like that is doable.
That is exactly what I was envisioning, yes.
Related to the ARC segment in https://github.com/vectordotdev/vector/pull/13236#pullrequestreview-1055604956, it has effectively become a requirement to put ARC below distribution so that each endpoint is managed by its own ARC. To do that, the dependency circle needs to be broken; unfortunately, there are two features that are practically mutually exclusive under such a design:
- Failover
- ARC sensitivity to retries:
  > We want retries to increase the effective RTT and back-off concurrency

They conflict in that for ARC to be sensitive to retries, the retries need to happen against the same endpoint, while failover wants retries to happen against different endpoints.
Between the two, ARC sensitivity to retries seems less important and could maybe be managed in some other way within ARC, so I'm for prioritizing failover. That would leave us with two designs:
- Always fail over on retries.
- Retry some number of times on the same endpoint; if that fails, then fail over.
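To make the difference between the two designs concrete, here is an illustrative sketch of the single decision they differ on (`FailoverMode` and `should_switch_endpoint` are hypothetical names, not a proposed API):

```rust
/// Hypothetical knob distinguishing the two failover designs.
enum FailoverMode {
    /// Design 1: every retry goes to a different endpoint.
    Always,
    /// Design 2: retry up to `n` times on the same endpoint, then fail over.
    AfterSameEndpointRetries { n: u32 },
}

/// Decide whether attempt number `attempt` (0 is the initial try, 1 is the
/// first retry, and so on) should switch to a different endpoint.
fn should_switch_endpoint(mode: &FailoverMode, attempt: u32) -> bool {
    match mode {
        FailoverMode::Always => attempt > 0,
        FailoverMode::AfterSameEndpointRetries { n } => attempt > *n,
    }
}
```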
cc. @bruceg @tobz
Hello, guys. Are there any plans to add a multiple-sinks feature to Vector? It's a blocker for our production use: an LB would be a SPOF, and it'd be nice if Vector could balance records by itself.

> Hello, guys. Are there any plans to add a multiple-sinks feature to Vector? It's a blocker for our production use: an LB would be a SPOF, and it'd be nice if Vector could balance records by itself.
There is an open PR for this: https://github.com/vectordotdev/vector/pull/14088. The load balancing will be rudimentary, but will hopefully suffice for simple use-cases.
Regarding the SPOF concern for an LB, it is common to use DNS to provide redundancy by having N load balancers behind the same A record. We generally recommend this over having Vector do the load balancing, since load balancers will always have more advanced load balancing features.
@jszwedko ah, got it. Can you please point out how Vector works with a DNS LB? Does it respect DNS TTLs? For example, can I use something like Consul DNS names with a low TTL as a sink address? Will Vector resolve the DNS name on each request to the sink?

> @jszwedko ah, got it. Can you please point out how Vector works with a DNS LB? Does it respect DNS TTLs? For example, can I use something like Consul DNS names with a low TTL as a sink address? Will Vector resolve the DNS name on each request to the sink?
From what I can see, the crate we use for HTTP requests, hyper, actually re-resolves DNS for each new HTTP request via `getaddrinfo`. If you enable hyper debug logging (`VECTOR_LOG=hyper=debug`) you should see debug messages when it re-resolves. The OS may do some caching based on TTLs (e.g. via systemd-resolved).