EXISTS doesn't want to match

Open amsimms opened this issue 5 years ago • 4 comments

I have a rule with this structure:

@Rule(
    # NOT(Excluded(patient_id=MATCH.patient_id)),
    EXISTS(
        FeverGreaterThanFourDays(
            patient_id=MATCH.patient_id,
            appointment_id=MATCH.appointment_id,
            weight=MATCH.fever_weight,
            present=W(),
        ),
        ConjunctivalInjection(
            patient_id=MATCH.patient_id,
            appointment_id=MATCH.appointment_id,
            weight=MATCH.conjunctival_weight,
            present=W(),
        ),
        # etc...
    )
)
def _(patient_id, appointment_id, **kwargs):
    # my function.
    pass

These are some sample facts:

FactList([(0, InitialFact()),
          (1, Patient(age=5, patient_id=5223)),
          (2, Patient(age=45, patient_id=1223)),
          (3,
           Objective(patient_id=5223, appointment_id=2, code='42631002', code_system='SNOMEDCT')),
          (4,
           Objective(patient_id=5223, appointment_id=2, code='82014009', code_system='SNOMEDCT')),
          (5,
           Objective(patient_id=5223, appointment_id=2, code='99999984', code_system='SNOMEDCT'))])

If I replace EXISTS with OR, I get matches almost as documented: the rule fires once for each fact that matches. That is not quite what the docs say, though. They say the rule should fire for each combination of matching facts, so I'd expect it to fire on individual facts, then pairs, then triples, and so on. I suspect this is just an error in the docs.

This is the result using "OR":

Code system match: SNOMEDCT
Calculating initial score for 5223 (appointment 2) on available data: dict_items([('rash_weight', 0)])
Code system match: SNOMEDCT
Calculating initial score for 5223 (appointment 2) on available data: dict_items([('peripheral_weight', 1)])
Code system match: SNOMEDCT
Calculating initial score for 5223 (appointment 2) on available data: dict_items([('strawberry_tongue_weight', 1)])
Patient 1223 does not meet age criteria 45 > 10

Resulting facts are:
FactList([(0, InitialFact()),
          (1, Patient(age=5, patient_id=5223)),
          (2, Patient(age=45, patient_id=1223)),
          (3,
           Objective(patient_id=5223, appointment_id=2, code='42631002', code_system='SNOMEDCT')),
          (4,
           Objective(patient_id=5223, appointment_id=2, code='82014009', code_system='SNOMEDCT')),
          (5,
           Objective(patient_id=5223, appointment_id=2, code='99999984', code_system='SNOMEDCT')),
          (6,
           PolymorphousRash(patient_id=5223, appointment_id=2, weight=0, present=False)),
          (7, InitialRiskScore(patient_id=5223, appointment_id=2, score=0)),
          (8,
           PeripheralEdema(patient_id=5223, appointment_id=2, weight=1, present=True)),
          (9, InitialRiskScore(patient_id=5223, appointment_id=2, score=1)),
          (10,
           StrawberryTongue(patient_id=5223, appointment_id=2, weight=1, present=True)),
          (11, Excluded(patient_id=1223))])

However, with the same set of facts that match with OR, changing OR to EXISTS stops the rule from matching at all. My expectation is that the rule will fire once and that the associated function will receive the weight parameters of all matching facts. Instead, none of the facts match:

Code system match: SNOMEDCT
Code system match: SNOMEDCT
Code system match: SNOMEDCT
Patient 1223 does not meet age criteria 45 > 10

Resulting facts are:
FactList([(0, InitialFact()),
          (1, Patient(age=5, patient_id=5223)),
          (2, Patient(age=45, patient_id=1223)),
          (3,
           Objective(patient_id=5223, appointment_id=2, code='42631002', code_system='SNOMEDCT')),
          (4,
           Objective(patient_id=5223, appointment_id=2, code='82014009', code_system='SNOMEDCT')),
          (5,
           Objective(patient_id=5223, appointment_id=2, code='99999984', code_system='SNOMEDCT')),
          (6,
           PolymorphousRash(patient_id=5223, appointment_id=2, weight=0, present=False)),
          (7,
           PeripheralEdema(patient_id=5223, appointment_id=2, weight=1, present=True)),
          (8,
           StrawberryTongue(patient_id=5223, appointment_id=2, weight=1, present=True)),
          (9, Excluded(patient_id=1223))])

It seems like EXISTS is critical for "aggregating" information from multiple facts. If this is not the intended behavior of EXISTS, is there something else I should use? I need to be able to reason on the largest collection of facts in a set that match.

Let me know if there is a different approach I should use.

amsimms, Sep 24 '19 18:09

Hi,

I am a little confused by the provided code. The patterns inside the EXISTS CE of your rule (FeverGreaterThanFourDays, ConjunctivalInjection) don't seem to match the FactLists you provided.

Some things that seem odd to me:

  • Your rule lacks the self parameter.
  • You are trying to capture patient_id and appointment_id as parameters. This is not possible with EXISTS, because the rule fires only once if there are facts that satisfy the patterns (even if more than one combination does).
  • Replacing EXISTS with OR for testing is not a good idea. A better replacement would be AND, because all of the patterns inside EXISTS must match for the rule to fire.
  • EXISTS is not for "aggregating" information; it is for checking whether a set of patterns matches at least once.
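
The difference between AND and EXISTS described in the list above can be sketched in plain Python. This is only a rough model of the activation counts, not experta code: the helper names (`combinations_matching`, `and_activations`, `exists_activations`) and the sample fact labels are hypothetical, and variable bindings via MATCH are ignored.

```python
from itertools import product


def combinations_matching(facts_by_pattern):
    """All ways to pick one matching fact per pattern.

    The result is empty as soon as any pattern has no matching fact.
    """
    return list(product(*facts_by_pattern))


def and_activations(facts_by_pattern):
    # Model of AND: one activation per valid combination of matching facts.
    return len(combinations_matching(facts_by_pattern))


def exists_activations(facts_by_pattern):
    # Model of EXISTS: a single activation if at least one combination exists.
    return 1 if combinations_matching(facts_by_pattern) else 0


# Hypothetical sample data: two fever facts, one conjunctival injection fact.
fevers = ["fever#1", "fever#2"]
injections = ["injection#1"]

print(and_activations([fevers, injections]))     # 2
print(exists_activations([fevers, injections]))  # 1
print(exists_activations([fevers, []]))          # 0
```

Because the EXISTS model collapses every combination into one activation, there is no single combination whose values could be handed to the rule function, which is why capturing patient_id there does not work.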

I prepared this example to illustrate the use of EXISTS.

If you can provide a more concise example of what you are trying to achieve it would be very helpful.

nilp0inter, Oct 18 '19 22:10

Here is a more detailed use case. I'm looking at producing a preliminary and a final risk score. The preliminary risk score is calculated from available data, not complete data, and should be produced using as many of the defined criteria as are available.

A final score, in contrast, must have all defined criteria set to either present or absent.

For the sake of argument, consider a score function that includes four observation criteria: o1, o2, o3, o4. Each observation can be modeled as a fact with an attribute "present" that is True if the observed criterion is present, or False if it is not. If a given observation, e.g. fever, has not been evaluated, no observation fact for fever is declared.

Both a preliminary score and a final score can be calculated when all four facts (o1, o2, o3, o4) are declared in the knowledge base, as in this example:

  • o1 - present
  • o2 - present
  • o3 - absent
  • o4 - present

Final score = 3, and preliminary score = 3.

Then consider this case, where only two observation facts are declared in the KB:

  • o1 - present
  • o4 - present

Here a final score cannot be calculated, because not all four observations have been explicitly evaluated as either present or absent. However, there is enough information to calculate a preliminary score of 2, because o1 and o4 have been set to present (i.e. True).
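
Outside the rule engine, the arithmetic I have in mind looks roughly like the plain-Python sketch below. The names `CRITERIA`, `preliminary_score`, and `final_score` are made up for illustration, and it assumes every present criterion counts as 1 (no weights).

```python
from typing import Dict, Optional

# The four observation criteria from the example above.
CRITERIA = ("o1", "o2", "o3", "o4")


def preliminary_score(observations: Dict[str, bool]) -> int:
    """Count criteria recorded as present; unevaluated criteria are ignored."""
    return sum(1 for name in CRITERIA if observations.get(name) is True)


def final_score(observations: Dict[str, bool]) -> Optional[int]:
    """Return a score only when every criterion has been evaluated."""
    if any(name not in observations for name in CRITERIA):
        return None  # at least one observation was never declared
    return preliminary_score(observations)


# All four observations declared: both scores are available.
complete = {"o1": True, "o2": True, "o3": False, "o4": True}
# Only o1 and o4 declared: preliminary score only.
partial = {"o1": True, "o4": True}

print(preliminary_score(complete), final_score(complete))  # 3 3
print(preliminary_score(partial), final_score(partial))    # 2 None
```

A missing key stands for "not evaluated", which is different from a key set to False ("evaluated and absent").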

DECLARE seems like the right way to define a rule to calculate the preliminary score. It would be written as:

DECLARE(o1, o2, o3, o4)

and should match as many facts as possible, binding attributes appropriately.

AND(o1, o2, o3, o4)

is what I am using for the final score.

Let me know if this clears things up.

Thanks!

amsimms, Oct 24 '19 04:10

I think I understand your needs. I prepared this proof of concept about the MAYBE operator.

Please, tell me if this helps.

from experta import *


def MAYBE(*patterns):
    return AND(
        *[OR(p, NOT(p)) for p in patterns]
    )


class FeverGreaterThanFourDays(Fact):
    pass


class ConjunctivalInjection(Fact):
    pass


class PolymorphousRash(Fact):
    pass


class PeripheralEdema(Fact):
    pass


class KE(KnowledgeEngine):
    @Rule(
        MAYBE(
            AS.f1 << FeverGreaterThanFourDays(
                patient_id=MATCH.patient_id,
                appointment_id=MATCH.appointment_id,
                weight=MATCH.fever_weight,
                present=W(),
            ),
            AS.f2 << ConjunctivalInjection(
                patient_id=MATCH.patient_id,
                appointment_id=MATCH.appointment_id,
                weight=MATCH.conjunctival_weight,
                present=W(),
            ),
            AS.f3 << PolymorphousRash(
                patient_id=MATCH.patient_id,
                appointment_id=MATCH.appointment_id,
                weight=MATCH.polymorphous_weight,
                present=W(),
            ),
            AS.f4 << PeripheralEdema(
                patient_id=MATCH.patient_id,
                appointment_id=MATCH.appointment_id,
                weight=MATCH.peripheral_weight,
                present=W(),
            )
        )
    )
    def something(self, f1=None, f2=None, f3=None, f4=None):
        print("Score:", len(list(filter(None, [f1, f2, f3, f4]))))

k = KE()
k.reset()
k.declare(ConjunctivalInjection(patient_id=1, appointment_id=2, weight=15, present=True))
k.declare(FeverGreaterThanFourDays(patient_id=1, appointment_id=2, weight=12, present=True))
k.declare(PolymorphousRash(patient_id=1, appointment_id=2, weight=0, present=False))
# k.declare(PeripheralEdema(patient_id=1, appointment_id=2, weight=1, present=True))
k.run()
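
As a rough mental model of what MAYBE expands to, here is a plain-Python sketch. It does not use experta and is not a description of its internals; it assumes at most one fact per type and ignores variable bindings. The helper names `maybe_branches` and `satisfied_branches` are hypothetical.

```python
from itertools import product

PATTERNS = [
    "FeverGreaterThanFourDays",
    "ConjunctivalInjection",
    "PolymorphousRash",
    "PeripheralEdema",
]


def maybe_branches(patterns):
    """Every present/absent choice implied by AND(OR(p, NOT(p)), ...)."""
    return [
        dict(zip(patterns, choice))
        for choice in product((True, False), repeat=len(patterns))
    ]


def satisfied_branches(patterns, declared):
    """Branches whose present/absent choices agree with the declared fact types."""
    return [
        branch
        for branch in maybe_branches(patterns)
        if all((name in declared) == wanted for name, wanted in branch.items())
    ]


# Same fact types as declared in the example above (PeripheralEdema is left out).
declared = {"ConjunctivalInjection", "FeverGreaterThanFourDays", "PolymorphousRash"}
branches = satisfied_branches(PATTERNS, declared)
bound = [name for name, wanted in branches[0].items() if wanted]

print(len(maybe_branches(PATTERNS)))  # 16
print(len(branches), len(bound))      # 1 3
```

Under this model, four patterns give 16 branches, and exactly one of them agrees with any given set of declared fact types, so the rule function would be called once with the bound facts and the remaining parameters left at None.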

nilp0inter, Nov 17 '19 00:11

Hi, I have a problem with MATCH too. When EXISTS uses MATCH.some_value, the result is always True, and that is not correct. Here is a simple example to reproduce it:

from experta import *
class FactOne(Fact):
    name = Field(str)

class FactTwo(Fact):
    name = Field(str)

class MyEngine(KnowledgeEngine):

    @DefFacts()
    def load(self):
        yield FactOne(name = 'aaaaaaaaa')
        yield FactTwo(name = 'bbbbbbbbb')

    @Rule(
        FactOne(name = MATCH.name),
        EXISTS(
            FactTwo(name = MATCH.name)
        )
    )
    def my_rule(self, name):
        print('my_ryle is activated, name: {0}'.format(name))


e = MyEngine()

Then I get

>>> e.reset()
>>> print(e.facts)
<f-0>: InitialFact()
<f-1>: FactOne(name='aaaaaaaaa')
<f-2>: FactTwo(name='bbbbbbbbb')

but also

>>> print(e.agenda)
0: my_rule {InitialFact(), FactOne(name='aaaaaaaaa')}
>>> e.run()
my_ryle is activated, name: aaaaaaaaa

This is not the expected behavior (at least it shouldn't be): no FactTwo has the name 'aaaaaaaaa', so the rule should not be on the agenda.
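
To spell out what I would expect instead, here is a plain-Python sketch of the intended semantics. It is not experta code, and `expected_activations` is a made-up helper name.

```python
def expected_activations(fact_one_names, fact_two_names):
    """Names for which the rule should fire.

    A FactOne name qualifies only if some FactTwo carries the same name.
    """
    available = set(fact_two_names)
    return [name for name in fact_one_names if name in available]


# The facts from the example above: the names differ, so nothing should fire.
print(expected_activations(["aaaaaaaaa"], ["bbbbbbbbb"]))  # []
# With a shared name, a single activation would be expected.
print(expected_activations(["aaaaaaaaa"], ["aaaaaaaaa"]))  # ['aaaaaaaaa']
```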

lazarkrstic, Nov 27 '20 17:11