pygraphistry icon indicating copy to clipboard operation
pygraphistry copied to clipboard

[FEA] predicates dont translate from json to python

Open mj3cheun opened this issue 1 year ago • 6 comments

import graphistry
from graphistry import n, e_forward, is_in

chain_operations = graphistry.Chain([n(name="is_carib_bank_origin"),e_forward(hops=1, edge_match={"originator_bank_country": is_in(options=["Cayman Islands", "Bermuda", "Virgin Islands British", "British Virgin Islands", "Bahamas", "Panama", "Barbados"])}),
])

graphistry.Chain.from_json(chain_operations.to_json())

this doesnt work

mj3cheun avatar Apr 26 '24 22:04 mj3cheun

Is it the to_json or from_json side that's problematic?

lmeyerov avatar Apr 26 '24 23:04 lmeyerov

from_json, to_json is fine

mj3cheun avatar Apr 27 '24 04:04 mj3cheun

Yep, and I think this matters for the Graphistry endpoint especially (cc @aucahuasi )

to_json() looks good

chain_operations = graphistry.Chain([
    n(name="is_carib_bank_origin"),
    e_forward(hops=1, edge_match={"originator_bank_country": is_in(options=["Cayman Islands", "Bermuda", "Virgin Islands British", "British Virgin Islands", "Bahamas", "Panama", "Barbados"])}),
])
chain_operations.to_json()

=>

{"type": "Chain",
 "chain": [{"type": "Node", "filter_dict": {}, "name": "is_carib_bank_origin"},
  {"type": "Edge",
   "hops": 1,
   "to_fixed_point": false,
   "direction": "forward",
   "edge_match": {"originator_bank_country": {"type": "IsIn",
     "options": ["Cayman Islands",
      "Bermuda",
      "Virgin Islands British",
      "British Virgin Islands",
      "Bahamas",
      "Panama",
      "Barbados"]}}}]}

from_json() mishandles predicates

from graphistry.compute.chain import Chain
Chain.from_json( chain_operations.to_json() )

=>

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-12-9b8808bc1c18>](https://localhost:8080/#) in <cell line: 3>()
      1 from graphistry.compute.chain import Chain
      2 
----> 3 Chain.from_json( chain_operations.to_json() )

6 frames
[/usr/local/lib/python3.10/dist-packages/graphistry/compute/chain.py](https://localhost:8080/#) in from_json(cls, d)
     35         assert 'chain' in d
     36         assert isinstance(d['chain'], list)
---> 37         out = cls([ASTObject_from_json(op) for op in d['chain']])
     38         out.validate()
     39         return out

[/usr/local/lib/python3.10/dist-packages/graphistry/compute/chain.py](https://localhost:8080/#) in <listcomp>(.0)
     35         assert 'chain' in d
     36         assert isinstance(d['chain'], list)
---> 37         out = cls([ASTObject_from_json(op) for op in d['chain']])
     38         out.validate()
     39         return out

[/usr/local/lib/python3.10/dist-packages/graphistry/compute/ast.py](https://localhost:8080/#) in from_json(o)
    471         out = ASTNode.from_json(o)
    472     elif o['type'] == 'Edge':
--> 473         out = ASTEdge.from_json(o)
    474     else:
    475         raise ValueError(f'Unknown type {o["type"]}')

[/usr/local/lib/python3.10/dist-packages/graphistry/compute/ast.py](https://localhost:8080/#) in from_json(cls, d)
    296         out = ASTEdge(
    297             direction=d['direction'] if 'direction' in d else None,
--> 298             edge_match=maybe_filter_dict_from_json(d, 'edge_match'),
    299             hops=d['hops'] if 'hops' in d else None,
    300             to_fixed_point=d['to_fixed_point'] if 'to_fixed_point' in d else None,

[/usr/local/lib/python3.10/dist-packages/graphistry/compute/ast.py](https://localhost:8080/#) in maybe_filter_dict_from_json(d, key)
    100         return None
    101     if key in d and isinstance(d[key], dict):
--> 102         return {
    103             k: ASTPredicate.from_json(v) if isinstance(v, dict) else v
    104             for k, v in d[key].items()

[/usr/local/lib/python3.10/dist-packages/graphistry/compute/ast.py](https://localhost:8080/#) in <dictcomp>(.0)
    101     if key in d and isinstance(d[key], dict):
    102         return {
--> 103             k: ASTPredicate.from_json(v) if isinstance(v, dict) else v
    104             for k, v in d[key].items()
    105         }

[/usr/local/lib/python3.10/dist-packages/graphistry/compute/ASTSerializable.py](https://localhost:8080/#) in from_json(cls, d)
     38         """
     39         constructor_args = {k: v for k, v in d.items() if k not in cls.reserved_fields}
---> 40         return cls(**constructor_args)

TypeError: ASTPredicate() takes no arguments

lmeyerov avatar Apr 27 '24 17:04 lmeyerov

Relevant:

  • Deserializers call maybe_filter_dict_from_json() for entity attribute handling, which calls ASTPredicate.from_json(v) if isinstance(v, dict) else v: https://github.com/graphistry/pygraphistry/blob/4cc316b570e10edadce391ce490228dff7361689/graphistry/compute/ast.py#L98
  • ASTPredicate.from_json(v) is not implemented, so inherits ASTSerializable.from_json(v):
    • https://github.com/graphistry/pygraphistry/blob/4cc316b570e10edadce391ce490228dff7361689/graphistry/compute/predicates/ASTPredicate.py#L14
    • https://github.com/graphistry/pygraphistry/blob/4cc316b570e10edadce391ce490228dff7361689/graphistry/compute/ASTSerializable.py#L33

The method is:

class ASTSerializable(ABC):

    ...

    @classmethod
    def from_json(cls, d: Dict[str, JSONVal]) -> 'ASTSerializable':
        """
        Given c.to_json(), hydrate back c

        Corresponding c.__class__.__init__ must accept all non-reserved instance fields
        """
        constructor_args = {k: v for k, v in d.items() if k not in cls.reserved_fields}
        return cls(**constructor_args)

So maybe maybe_filter_dict_from_json needs to call from_json(IsIn, the_json) in the above, vs just from_json(the_json)

AFAICT we don't collect the set of dispatchable Predicate types anywhere (IsIn, GT, ...). The most we do is export the names. Maybe we should add compose/predicates/serialization.py, or something more in compute/ast.py, to handle these, and then maybe_filter_dict_from_json checks on those, or a more generic from_json somewhere does?

lmeyerov avatar Apr 27 '24 18:04 lmeyerov

Also, looks like our tests never exercised the need to find the cls object, instead baked in or didn't do the predicate case:

  • https://github.com/graphistry/pygraphistry/blob/master/graphistry/tests/compute/predicates/test_categorical.py
  • https://github.com/graphistry/pygraphistry/blob/master/graphistry/tests/compute/predicates/test_from_json.py
  • https://github.com/graphistry/pygraphistry/blob/master/graphistry/tests/compute/test_ast.py

lmeyerov avatar Apr 27 '24 18:04 lmeyerov

So maybe:

  • predicates/* export its ASTPredicate members, PREDICATES
  • deserialization called somewhere downstream of maybe_filter_dict_from_json should dispatch using these
  • validators should flag for using unknown predicates

lmeyerov avatar Apr 27 '24 18:04 lmeyerov