janusgraph
textContains with label constraint doesn't work
When I use g.V().has('name', textContains('a')) or g.V().hasLabel('company') on their own, both work well, but if I combine the two conditions I always get no results. For example, g.V().hasLabel('company').has('name', textContains('a')) returns a count of zero, although all of my vertex labels are company. Could someone help me out, please? This is really important for me.
Are you using partitioned labels? In that case this looks like a duplicate of #1842 where you also already commented.
Actually, I am using the docker-compose-cql-es.yml configuration, the same as here: https://github.com/JanusGraph/janusgraph-docker
My question was whether you have created the vertex label company like this:
mgmt.makeVertexLabel('company').partition().make()
as #1842 mentions a problem with vertex label constraints if the label is partitioned.
Apart from that, is the problem specific to text predicates or does it also occur if you use a simple has step? So something like this:
g.V().has('company', 'name', 'test-company')
@FlorianHockmann I have created the vertex label company using Gremlin.Net like this:
var company = g.AddV("company").Property("name", "IBM").Property("code", "010101").Next();
Combining hasLabel with a has filter works as expected, and it also works if I use textContainsPrefix('I'), but when it comes to textContains, it just doesn't work.
textContains('I') doesn't work and textContains('IB') doesn't work; only textContains('IBM') finds one result, with the name IBM, even though there are companies named 'IBMan' and 'IBManufacture' in my database.
Hi @ChenZhaobin That's intended behavior. Have a look at the docs. It says:
textContains: is true if (at least) one word inside the text string matches the query string
So keep in mind textContains matches full words, not arbitrary substrings. That's why 'IBM' is found but 'IBMan' is not found. If you had an entry like 'IBM Manufacture', it would be found.
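To make the word-versus-substring distinction concrete, here is a minimal Python sketch. It is only an illustration, not JanusGraph's or Elasticsearch's implementation: the helper names tokenize, text_contains and text_contains_prefix are hypothetical, and the tokenizer is an assumed stand-in (lowercase, split on non-word characters) for whatever analyzer the index backend really uses.

```python
import re


def tokenize(text):
    # Assumed stand-in for a standard analyzer:
    # lowercase the text and split it on runs of non-word characters.
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]


def text_contains(text, query):
    # Whole-word semantics: true if at least one token equals the query.
    return query.lower() in tokenize(text)


def text_contains_prefix(text, query):
    # Prefix semantics: true if at least one token starts with the query.
    return any(tok.startswith(query.lower()) for tok in tokenize(text))


names = ["IBM", "IBMan", "IBManufacture", "IBM Manufacture"]

print([n for n in names if text_contains(n, "IBM")])
print([n for n in names if text_contains(n, "IB")])
print([n for n in names if text_contains_prefix(n, "IB")])
```

Under these assumptions, the whole-word check for "IBM" keeps only 'IBM' and 'IBM Manufacture', the whole-word check for "IB" keeps nothing, and the prefix check for "IB" keeps all four names, which mirrors the behavior reported above.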
@rngcntr No, maybe the above was not a good example. My field actually consists of multiple Chinese characters, and every word in it can be analyzed into a string by the IK analyzer, which is used as a plugin in Elasticsearch.
It works as expected when using
g.V().has('name', textContains('one or more chinese character'))
but the result list is empty when using
g.V().hasLabel('company').has('name', textContains('one or more chinese character'))
which combines the indexed field with the label filter.
@rngcntr @FlorianHockmann Finally, I solved this issue with the query below:
g.V().hasLabel('company').filter{it.get().property('name').value().contains('one or more chinese character')}
Nice to see your solution @ChenZhaobin!
But I think the issue should stay open because the use of hasLabel should not impact the functionality of textContains.
@rngcntr Reopened it. I guess this issue is related to a mixed index using ES with an analyzer other than the default one.
This issue has nothing to do with custom analyzers; it is the same as the tickets below: https://github.com/JanusGraph/janusgraph/issues/1788 https://github.com/JanusGraph/janusgraph/issues/1379
@FlorianHockmann @porunov @pluradj Do we have a solution or a plan for this?
The problem is that textContains does not (as the name implies) search for a substring; instead it searches for a word! What does that mean? Well, the value gets tokenized, and then the value is searched for the search term with a space in front of and after it. Completely badly documented. And the worst part: there is no alternative for searching for a substring.
@Zonkodonko: Yes, textContains searches for words. That was however already described above.
How is that poorly documented? The docs state first that:
Text search predicates which match against the individual words inside a text string after it has been tokenized
and then for textContains:
is true if (at least) one word inside the text string matches the query string
(emphasis added)
If you think that the docs could be improved on this, then please open a new issue.
This issue is about the problem that:
the use of hasLabel should not impact the functionality of textContains.
which is also described in the issue description itself.
@FlorianHockmann You are right. Somehow I didn't catch this while reading the documentation. It's just that the wording of the method and of the documentation is really confusing, especially if you have read the Gremlin documentation before and think you know how it is supposed to work. Sorry for that.