eliottree icon indicating copy to clipboard operation
eliottree copied to clipboard

Tasks reported out of order if some events are deleted / missing

Open epmoyer opened this issue 6 years ago • 4 comments

Jonathan,

Fantastic tool! I have been making great use of it!

I have an application which logs debug events when a debug option is enabled. I've been experimenting with the idea of always logging the debug information, and filtering the debug events out of the log when I don't want/need to see them. I found that if I delete an event it can cause eliot-tree to render the associated task out of (chronological) order (presumably because the parser is waiting for the missing event to show up in the log data, and then eventually gives up?)

Here's an example:

from eliot import start_action, to_file, Message
import subprocess

to_file(open("normal.log", "w"))
with start_action(action_type="first_action"):
    Message.log(message_type="some_message", level='debug')
    Message.log(message_type="some_message", level='info')

with start_action(action_type="second_action"):
    pass

# Create a version of the log file in which the 'debug' event has been deleted
with open('normal.log') as file:
    with open('filtered.log', 'w') as filtered_file:
        for line in file:
            if 'debug' not in line:
                filtered_file.write(line)

subprocess.call('eliot-tree --color never normal.log > normal.tree', shell=True)
subprocess.call('eliot-tree --color never filtered.log > filtered.tree', shell=True)

Which results in:

normal.tree

ceed1fcb-51fc-42a6-8219-e3157e98b402
└── first_action/1 ⇒ started
    ├── timestamp: 2017-11-04 02:31:24.041994
    ├── some_message/2
    │   ├── level: debug
    │   └── timestamp: 2017-11-04 02:31:24.042184
    ├── some_message/3
    │   ├── level: info
    │   └── timestamp: 2017-11-04 02:31:24.042294
    └── first_action/4 ⇒ succeeded
        └── timestamp: 2017-11-04 02:31:24.042397

47d6d5a3-4f7c-4276-8bae-014a9a2e2875
└── second_action/1 ⇒ started
    ├── timestamp: 2017-11-04 02:31:24.042547
    └── second_action/2 ⇒ succeeded
        └── timestamp: 2017-11-04 02:31:24.042654

filtered.tree

47d6d5a3-4f7c-4276-8bae-014a9a2e2875
└── second_action/1 ⇒ started
    ├── timestamp: 2017-11-04 02:31:24.042547
    └── second_action/2 ⇒ succeeded
        └── timestamp: 2017-11-04 02:31:24.042654

ceed1fcb-51fc-42a6-8219-e3157e98b402
└── first_action/1 ⇒ started
    ├── timestamp: 2017-11-04 02:31:24.041994
    ├── some_message/3
    │   ├── level: info
    │   └── timestamp: 2017-11-04 02:31:24.042294
    └── first_action/4 ⇒ succeeded
        └── timestamp: 2017-11-04 02:31:24.042397

Perhaps I am crazy to be trying to filter events, but it seems so close to working that I thought I'd ask what you thought.

Cheers!

epmoyer avatar Nov 04 '17 02:11 epmoyer

Thanks, Eric! I'm glad to hear it's been useful to you.

So your reasoning is mostly correct. It's Eliot's builtin parser and it's waiting for the next event in the level to occur sequentially, which won't happen because it doesn't exist in the data.

The good news is that eliottree has a way to filter events via the --select argument, which takes a JMESPath expression to apply to each logged task. So for example this should achieve what you want:

$ eliot-tree --select "level != 'debug'" normal.log

There's an open issue on ScatterHQ/eliot#289 about improving the documentation on this topic, if you want to follow along or suggest something that would have helped you I'm sure it would be appreciated.

jonathanj avatar Nov 05 '17 07:11 jonathanj

Winner! Thanks!

When I originally read the description of the --select syntax in the README I thought it was an inclusive filter only (I didn't realize I could use "!=" to make an exclusive statement) and I incorrectly thought "--select" could only be used to filter in/out a whole task. I think that's because the language "If any child task is selected the entire top-level task is selected." made me think a whole task was either in or out. I'll take a look at the open documentation issue and help out.

Thanks again for the quick response and for the excellent tool!

epmoyer avatar Nov 05 '17 18:11 epmoyer

Jonathan,

I've been running some experiments in hopes of writing a filtering cookbook page for the Eliot documentation.

Based on my observations (sample code below) I think the following statements are true. Can you confirm?

  1. Using --select to filter out events (with a != expression) does not guarantee that tasks will be rendered in order. [ See Case 1 and Case 5 in the example below; I find that the results are un-deterministic. Sometimes second_action is rendered before first_action ]
  2. The help says "If any child task is selected the entire top-level task is selected." In practice what I see is that A) the top level task's uuid is rendered B) the selected event(s) within that task are rendered. [ See Case 2 below. A single event is selected, and only that event and the associated top level task's uuid are rendered ]
  3. The help says that --select "...can be specified multiple times to mimic logical AND.". Since JMESPath supports the && operator, I find that ANDs can be implemented in a single --select [ See Case 4 and 5 below ]. I couldn't find a way to get multiple --select parameters to work [ See cases 'Multi Select' 0 through 2 below; passing multiple selects neither seemed to behave like OR nor like AND]

Thanks!

from eliot import start_action, to_file, Message
import subprocess

to_file(open("example.log", "w"))
with start_action(action_type="first_action"):
    Message.log(message_type="some_message", level='in_test', data=list(range(3)))
    Message.log(message_type="some_message", level='in_production', data='spam')

try:
    with start_action(action_type="second_action"):
        Message.log(message_type="some_message", level='in_test', data=list(range(3, 5)))
        Message.log(message_type="some_message", level='in_production', data='eggs')
        raise RuntimeError("Broken pram")
except:
    pass

select_expressions = [
    None,
    '"level != \'in_test\'"',
    '"action_status == \'failed\'"',
    '"task_level[0] == \\`3\\`"',
    '"level == \'in_test\' && data[0] == \\`3\\`"',
    '"level != \'in_test\' && level != \'in_production\'"',
    ]

print('')
for case, select_expression in enumerate(select_expressions):
    command = 'eliot-tree'
    if select_expression:
        command += ' --select ' + select_expression 
    command += ' example.log'
    print('Case: {}'.format(case))
    print(command)
    print('=' * len(command))
    subprocess.call(command, shell=True)

select_expressions = [
    ('"level != \'in_test\'"', '"level != \'in_production\'"'),
    ('"level == \'in_test\'"', '"level == \'in_production\'"'),
    ('"level == \'in_test\'"', '"level == \'in_test\'"'),
    ]
    
for case, (select_expression1, select_expression2) in enumerate(select_expressions):
    print('Case: Multi Select {}'.format(case))
    command = 'eliot-tree --select {} --select {} example.log'.format(select_expression1, select_expression2)
    print(command)
    print('=' * len(command))
    subprocess.call(command, shell=True)

Output from the above script:

~/Documents/Personal/Programming/Python/eliot_test $python ./test_filter.py

Case: 0
eliot-tree example.log
======================
023238aa-f4c9-4396-b295-099ef73999c4
└── first_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:27:22.293421
    ├── some_message/2
    │   ├── data:
    │   │   ├── 0: 0
    │   │   ├── 1: 1
    │   │   └── 2: 2
    │   ├── level: in_test
    │   └── timestamp: 2017-11-06 02:27:22.293635
    ├── some_message/3
    │   ├── data: spam
    │   ├── level: in_production
    │   └── timestamp: 2017-11-06 02:27:22.293756
    └── first_action/4 ⇒ succeeded
        └── timestamp: 2017-11-06 02:27:22.293869

dcf437c3-565e-4167-9d0a-a6b702c7fd56
└── second_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:27:22.294030
    ├── some_message/2
    │   ├── data:
    │   │   ├── 0: 3
    │   │   └── 1: 4
    │   ├── level: in_test
    │   └── timestamp: 2017-11-06 02:27:22.294143
    ├── some_message/3
    │   ├── data: eggs
    │   ├── level: in_production
    │   └── timestamp: 2017-11-06 02:27:22.294250
    └── second_action/4 ⇒ failed
        ├── exception: builtins.RuntimeError
        ├── reason: Broken pram
        └── timestamp: 2017-11-06 02:27:22.294369

Case: 1
eliot-tree --select "level != 'in_test'" example.log
====================================================
dcf437c3-565e-4167-9d0a-a6b702c7fd56
└── second_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:27:22.294030
    ├── some_message/3
    │   ├── data: eggs
    │   ├── level: in_production
    │   └── timestamp: 2017-11-06 02:27:22.294250
    └── second_action/4 ⇒ failed
        ├── exception: builtins.RuntimeError
        ├── reason: Broken pram
        └── timestamp: 2017-11-06 02:27:22.294369

023238aa-f4c9-4396-b295-099ef73999c4
└── first_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:27:22.293421
    ├── some_message/3
    │   ├── data: spam
    │   ├── level: in_production
    │   └── timestamp: 2017-11-06 02:27:22.293756
    └── first_action/4 ⇒ succeeded
        └── timestamp: 2017-11-06 02:27:22.293869

Case: 2
eliot-tree --select "action_status == 'failed'" example.log
===========================================================
dcf437c3-565e-4167-9d0a-a6b702c7fd56
└── <unnamed>
    └── second_action/4 ⇒ failed
        ├── exception: builtins.RuntimeError
        ├── reason: Broken pram
        └── timestamp: 2017-11-06 02:27:22.294369

Case: 3
eliot-tree --select "task_level[0] == \`3\`" example.log
========================================================
023238aa-f4c9-4396-b295-099ef73999c4
└── <unnamed>
    └── some_message/3
        ├── data: spam
        ├── level: in_production
        └── timestamp: 2017-11-06 02:27:22.293756

dcf437c3-565e-4167-9d0a-a6b702c7fd56
└── <unnamed>
    └── some_message/3
        ├── data: eggs
        ├── level: in_production
        └── timestamp: 2017-11-06 02:27:22.294250

Case: 4
eliot-tree --select "level == 'in_test' && data[0] == \`3\`" example.log
========================================================================
dcf437c3-565e-4167-9d0a-a6b702c7fd56
└── <unnamed>
    └── some_message/2
        ├── data:
        │   ├── 0: 3
        │   └── 1: 4
        ├── level: in_test
        └── timestamp: 2017-11-06 02:27:22.294143

Case: 5
eliot-tree --select "level != 'in_test' && level != 'in_production'" example.log
================================================================================
dcf437c3-565e-4167-9d0a-a6b702c7fd56
└── second_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:27:22.294030
    └── second_action/4 ⇒ failed
        ├── exception: builtins.RuntimeError
        ├── reason: Broken pram
        └── timestamp: 2017-11-06 02:27:22.294369

023238aa-f4c9-4396-b295-099ef73999c4
└── first_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:27:22.293421
    └── first_action/4 ⇒ succeeded
        └── timestamp: 2017-11-06 02:27:22.293869

Case: Multi Select 0
eliot-tree --select "level != 'in_test'" --select "level != 'in_production'" example.log
========================================================================================
abe430d8-d09b-4ee4-ac42-b62a52c7e966
└── first_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:36:40.335650
    ├── some_message/2
    │   ├── data:
    │   │   ├── 0: 0
    │   │   ├── 1: 1
    │   │   └── 2: 2
    │   ├── level: in_test
    │   └── timestamp: 2017-11-06 02:36:40.336180
    ├── some_message/3
    │   ├── data: spam
    │   ├── level: in_production
    │   └── timestamp: 2017-11-06 02:36:40.336401
    └── first_action/4 ⇒ succeeded
        └── timestamp: 2017-11-06 02:36:40.336613

c6e955f9-654a-4d2f-b9ab-cd65da4ee34c
└── second_action/1 ⇒ started
    ├── timestamp: 2017-11-06 02:36:40.336910
    ├── some_message/2
    │   ├── data:
    │   │   ├── 0: 3
    │   │   └── 1: 4
    │   ├── level: in_test
    │   └── timestamp: 2017-11-06 02:36:40.337120
    ├── some_message/3
    │   ├── data: eggs
    │   ├── level: in_production
    │   └── timestamp: 2017-11-06 02:36:40.337317
    └── second_action/4 ⇒ failed
        ├── exception: builtins.RuntimeError
        ├── reason: Broken pram
        └── timestamp: 2017-11-06 02:36:40.337536

Case: Multi Select 1
eliot-tree --select "level == 'in_test'" --select "level == 'in_production'" example.log
========================================================================================
Case: Multi Select 2
eliot-tree --select "level == 'in_test'" --select "level == 'in_test'" example.log
==================================================================================
~/Documents/Personal/Programming/Python/eliot_test $

epmoyer avatar Nov 06 '17 02:11 epmoyer

Eric, some comprehensive documentation for this feature would be great! In fact #37 has been open for a while. Documentation like this seems like it ought to be in eliottree rather than Eliot, however Eliot has a well structured documentation resource already. I'm not personally too phased about this since I'd appreciate the documentation either way, it might be worth running it past Itamar on the Eliot issue. Nevertheless, I appreciate your effort and enthusiasm, thank you!

Recently-ish I ditched my own tree parser implementation in favour of the stricter one in Eliot, it's not impossible that subtle changes went unnoticed. Let me address some of your points:

  1. The way the filtering works now, it throws away task dictionaries from the input before building the tree, so when Eliot parses them they're not there. What happens in Eliot's parser is that any tree with any incomplete child tasks is itself considered incomplete. These are attached to the end of the output, since they won't ever appear in the parser.add result. The order of these is not well-specified. There are probably two separate considerations here:

    • Investigate the possibility of filtering the parsed tree to avoid intentionally creating incomplete trees.
    • Outputting a header like "Incomplete tasks" for truly incomplete tasks.
  2. I'm reasonably sure that this is a regression and that it used to work the way the documentation describes. Do you have a preferred mode of operation? I can see a benefit for each case:

    • Filter subtasks from the tree in order to cut down noise. (Current behaviour.)
    • Select whole tasks where any sub task matches the expression. (Previous behaviour.)
  3. The && behaviour in jmespath is fairly new. It was introduced in 0.9.0 and eliottree depends on 0.7.1 or greater. Having said that, it definitely looks like multiple selects behaviour is broken. Looking at the code it would appear that they're not being properly combined.

To summarise my thoughts:

  • Filter/select tasks after parsing all the original input, instead of before.
  • Clearly separate the listing of complete and incomplete tasks, possibly also sorting the incomplete tasks.
  • Implementing --filter and --select as mutually exclusive options is a good change, since there is a use for both modes of operation
  • --select is confusing is because I conflated the concept of filtering and selecting in a few parts of the implementation.
  • Bump the jmespath dependency and stop supporting multiple --select arguments.

I'd be keen to hear your thoughts.

jonathanj avatar Nov 06 '17 07:11 jonathanj