pyorient icon indicating copy to clipboard operation
pyorient copied to clipboard

Contains on embedded data types may be broken, and it's unclear how it's supposed to work

Open obi1kenobi opened this issue 10 years ago • 9 comments

There don't seem to be any tests that use the .contains method on an embedded data property (EmbeddedSet etc.) and I can't figure out how to use it without getting weird errors. Consider the following code:

BaseNode = declarative_node()

class Person(BaseNode):
    element_type = 'Person'
    element_plural = 'Person'

    name = String(nullable=False)
    aliases = EmbeddedSet(nullable=False)

class ModelsTest(unittest.TestCase):
    def setUp(self):
        self.graph = get_test_orientdb_graph()
        self.graph.create_all(BaseNode.registry)

    def test_get_person_by_name(self):
        person1 = self.graph.Person.create(name='Hello World',
                                           aliases=set(['Hello T. World', 'World']))
        name = 'Hello World'
        result = self.graph.query(Person).filter(
            (Person.name == name) | Person.aliases.contains(name)).all()
        self.assertSetEqual(set(result), set([person1]))

This fails with the following trace:

.../pyorient/ogm/query.py:145: in prepare
    wheres = rid_clause + self.build_wheres(params)
.../pyorient/ogm/query.py:411: in build_wheres
    exp_where = [self.filter_string(filter_exp)] if filter_exp else []
.../pyorient/ogm/query.py:339: in filter_string
    , self.filter_string(right))
.../pyorient/ogm/query.py:320: in filter_string
    left_str, self.filter_string(right))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <pyorient.ogm.query.Query object>, expression_root = 'Hello World'

    def filter_string(self, expression_root):
>       op = expression_root.operator
E       AttributeError: 'str' object has no attribute 'operator'

Specifically, this line seems to be the problem, and I don't understand why it recursively calls filter_string on the right-hand side. Can the RHS be an expression? I would have expected it to always be a variable, because I don't see how one of these data structures could be tested for containing the result of an expression.

obi1kenobi avatar Nov 06 '15 19:11 obi1kenobi

It was written following: http://orientdb.com/docs/2.0/orientdb.wiki/SQL-Where.html

... where 'contains' takes a conditional expression.

I've never used - and therefore don't really understand - the semantics of embedded types in OrientDB (nor, for that matter, linked class types), so I'm not sure how much I'll be able to help, except to naively guess that maybe 'contains' should only be callable on LinkedClassProperty types and to allow a different 'in' syntax for embedded data?

TropicalPenguin avatar Nov 07 '15 08:11 TropicalPenguin

Looks like it's impossible to overload python's 'in' keyword: __contains__ return value is always coerced to bool, and overloading __bool__ to return a non-bool will trigger a TypeError :(

TropicalPenguin avatar Nov 07 '15 09:11 TropicalPenguin

From the description of CONTAINS in the same document you cited: "Condition can be a single item: in this case the behaviour is like the IN operator"

This definitely means the pyorient behavior is a bug. I also don't see an IN operator in pyorient's operators file.

Any thoughts on how best to proceed?

obi1kenobi avatar Nov 07 '15 17:11 obi1kenobi

Hm, for some reason I didn't notice that bit.

This latest commit (https://github.com/OuterPlaypen/pyorient/commit/69475faab91c0bb747d42e64771629b3ae8f3191) is my proposal: since it behaves like IN, and IN looks like it was featured in earlier versions than CONTAINS, we might as well use it - and as a happy coincidence this gets around needing to define some ugly reverse-polish style in_(<value>,<collection>) notation.

TropicalPenguin avatar Nov 07 '15 20:11 TropicalPenguin

I don't have strong opinions on how this is implemented under the hood, so long as it works -- and it would be nice to have some test coverage to make sure it actually stays working in the future.

That said, it looks a little strange that a 'contains' is being rewritten with 'in' on the wire, when 'contains' was added before the 1.0 release. I don't think pyorient supports any version that early, and the code may be simpler if both paths use 'contains'.

obi1kenobi avatar Nov 07 '15 20:11 obi1kenobi

Scratch that, please use in instead of the supposedly equivalent contains -- just checked on OrientDB 2.1.3 and they do not behave identically. in returns the correct result while contains returns no elements, when querying a FULLTEXT indexed field, even when it's the only predicate clause.

obi1kenobi avatar Nov 07 '15 20:11 obi1kenobi

Another update: contains(X) does not work when X is a primitive, but contains X does.

obi1kenobi avatar Nov 08 '15 20:11 obi1kenobi

Can i close?

Ostico avatar Apr 10 '16 23:04 Ostico

I think this is still at least a documentation issue -- it's unclear to me whether the contains method in pyorient is allowed to take a collection type or not, and whether that will be correctly handled by the code. So I'd say it's still an open issue, but it's of course your decision in the end.

obi1kenobi avatar Apr 11 '16 15:04 obi1kenobi