logstash-filter-verifier
Tests rearranging themselves
Hi,
We're intermittently hitting an issue where the LFV outputs rearrange themselves between runs, which causes our GitLab CI to fail. When I say LFV outputs, I'm talking about when we use a clone{} filter. On a rerun the outputs sometimes come back in the right order and the tests pass.
An example using pseudo logic:

Test input: Foo
Expected first output: Bar
Expected second output: Baz (Baz is the Bar output, transformed to be entity-centric)

Actual first output from LFV: Baz
Actual second output: Bar

So the test fails: instead of Bar being the first result returned, we get Baz, and vice versa.
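Concretely, our test case file looks roughly like this (heavily simplified; the field names and values are illustrative, not our actual config). The two objects under `expected` are what swap places between runs:

```json
{
  "input": ["Foo"],
  "expected": [
    { "message": "Bar" },
    { "message": "Baz", "type": "entity" }
  ]
}
```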
Is there an end-user way of ordering these events that we should be following? Or is there a way we could adjust the code to sort the test outputs deterministically? Let me know if I need to explain the issue differently :)
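For illustration, this is the kind of sorting we have in mind: order the decoded events by a discriminating field before comparing them against the expectations. This is just a sketch with a hypothetical `type` field, not LFV's actual code:

```go
package main

import (
	"fmt"
	"sort"
)

// sortEvents orders decoded Logstash events deterministically by the
// string value of a discriminating field (hypothetical "type" here),
// so the comparison no longer depends on arrival order.
func sortEvents(events []map[string]interface{}, field string) {
	sort.SliceStable(events, func(i, j int) bool {
		a, _ := events[i][field].(string)
		b, _ := events[j][field].(string)
		return a < b
	})
}

func main() {
	events := []map[string]interface{}{
		{"type": "baz", "message": "entity-centric copy"},
		{"type": "bar", "message": "original"},
	}
	sortEvents(events, "type")
	fmt.Println(events[0]["type"], events[1]["type"]) // bar baz
}
```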
Cheers, Aaron
The first thing that comes to my mind is `pipeline.workers`. Have you tried setting this value explicitly to 1?
And `pipeline.ordered` as well. See: https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html
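For reference, both settings can be pinned in `logstash.yml` instead of on the command line (a sketch; adapt to your deployment):

```yaml
# logstash.yml: single worker plus ordered processing,
# so events leave the pipeline in the order they entered it
pipeline.workers: 1
pipeline.ordered: "true"
```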
Will take a look! Thank you :)
Hi,
I've tried running LS 7.9 and LFV using `--logstash-arg=--pipeline.workers --logstash-arg 1 --logstash-arg=--pipeline.ordered --logstash-arg true` and I still get the same issue. Are there any problems with the above?
I am not sure there is a solution for this problem if `pipeline.workers` and `pipeline.ordered` do not help. LFV processes the events in the order they are returned from Logstash.
Can you elaborate on why you need the `clone` filter?
We need the `clone` filter to, for example:
Take one event and, if it contains a certain event type, clone it and then run filters on the clone to transform it, ready for insertion into a different index. The original event is preserved and goes into the original index; both are equally useful documents.
Another example is running some Elasticsearch filtering on a cloned event (as well as pruning fields, etc.) before inserting it into an entity-centric index. That index can hold, for example, the last state of each user we've seen in our database, which we can then aggregate on.
We of course need to test that this cloning and processing works, and continues to work, when changes are made.
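A minimal sketch of the pattern described above (field names, clone names, and index routing are illustrative, not our actual config). The `clone` filter tags the copy via the clone name, which later conditionals use to route only the copy through the entity-centric transformations:

```
filter {
  if [event_type] == "user_update" {
    # Duplicate the event; the copy is identified by the clone name "entity"
    clone {
      clones => ["entity"]
    }
  }
  if [type] == "entity" {
    # Transform only the clone for the entity-centric index
    prune {
      whitelist_names => ["user_id", "status", "@timestamp", "type"]
    }
    mutate {
      add_field => { "[@metadata][target_index]" => "users-latest" }
    }
  }
}
```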
I am still seeing this with the `clone` and `split` filters, where entire test suites have to be rewritten on every config change. This is very difficult to maintain with a large test suite.
I've tried to reproduce a minimal test case that illustrates the problem but I've been unable to do so with a small config that I am able to share.
I believe this issue is a duplicate of #150