cartography
Blind Search
Hi @achantavy, I have finally configured cartography and got access to my cloud's graph. I have seen that you create an index per label, mainly on the id fields.
I am trying to query the graph using a blind search, for example:
MATCH (n) WHERE ANY(x IN keys(n) WHERE n[x] =~ '(?i).*{some expression goes here}.*') RETURN n
Meaning that, given a word/expression, I want to go over all nodes and relationships and their properties and, if I find my pattern, return the relevant subgraph. Is there a best practice for doing this?
> I have finally configured cartography and got access to my cloud's graph. I have seen that you create an index per label, mainly on the id fields.
Awesome!
> Meaning that, given a word/expression, I want to go over all nodes and relationships and their properties and, if I find my pattern, return the relevant subgraph. Is there a best practice for doing this?
I'd recommend against running a query that doesn't specify labels: once the graph is large enough (anything beyond roughly 100k nodes), such queries run extremely slowly. For best performance, query with a node label plus an indexed field. If the field is not indexed you'll probably still be okay, unless your graph has millions of nodes and you have a special use case.
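To make the "label + indexed field" advice concrete, here is a minimal Python sketch that builds such a query. The helper name label_scoped_query and the EC2Instance / id pair are illustrative assumptions, not part of cartography's API. The label and property are validated and interpolated into the query text, while the value stays a $value parameter.

```python
import re

# Conservative pattern for names that are safe to interpolate into Cypher.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def label_scoped_query(label: str, prop: str) -> str:
    """Build a Cypher query that filters a single label on a single property.

    The label and property key are checked against _IDENTIFIER and placed
    directly in the query text; the value to match stays a $value parameter.
    """
    for name in (label, prop):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"MATCH (n:{label}) WHERE n.{prop} = $value RETURN n"


# Hypothetical label and property; substitute ones that exist in your graph.
print(label_scoped_query("EC2Instance", "id"))
```

The returned string can be passed to whichever Neo4j client you use, with the search value supplied as the value parameter. Because the label narrows the scan and an indexed property allows an index lookup, this avoids walking every node the way the label-free query does.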
I totally understand your point @achantavy, but what if I don't know the property? I am looking for some expression/pattern, and my user doesn't really know which label and property to search. What would be the best practice in such a case, from your experience?
Since your query example is a regular expression, I assume you want to match against strings only.
This could be a use case for Neo4j full-text search (FTS). I have not touched Neo4j FTS in some time, but it looks to be better supported now and easier to set up and use. Take a look at the documentation: https://neo4j.com/docs/cypher-manual/current/indexes-for-full-text-search/
IIRC, the biggest issue I had with Neo4j FTS is that it created a very large index that bloated the database size substantially. I do not know the performance implications of adding/removing nodes that are covered by an FTS index. It could be that a CREATE or DELETE executes more slowly when the affected nodes are part of an FTS index.
@trodery thanks for your answer. Did you manage to find a best practice for using it? Can you please share an example?
@steve-solun I don't currently have a use case for FTS, so I haven't messed with it. I'm only going on my previous experience with Neo4j FTS, which is a few years old at this point, and the current documentation.
The first entry in this bit of documentation should get you going on the indexing part: https://neo4j.com/docs/cypher-manual/current/indexes-for-full-text-search/#administration-indexes-fulltext-search-create-and-configure
Examples from the documentation:
Creating an FTS index named titlesAndDescriptions for the title and description properties of both the Movie and Book labels:
CREATE FULLTEXT INDEX titlesAndDescriptions FOR (n:Movie|Book) ON EACH [n.title, n.description]
Querying the FTS index named titlesAndDescriptions for the text matrix:
CALL db.index.fulltext.queryNodes("titlesAndDescriptions", "matrix") YIELD node, score
RETURN node.title, node.description, score
So you'd do something similar: create one or more FTS indexes covering the cartography labels and properties you are interested in. After that, it looks fairly easy to query across those labels and properties using the query example above.
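As a rough sketch of how the documentation example might be adapted to a cartography graph, the Python below builds the index statement for several labels and escapes a user-supplied search term. Everything named here is an assumption for illustration: the index name cartographySearch, the label and property lists, and the helpers fulltext_index_statement and escape_lucene. Check the labels and properties against your own graph before using it. Escaping matters because FTS queries use Lucene syntax, in which characters such as : and / (common in ARNs) are operators.

```python
from typing import Sequence

# Characters that act as operators in Lucene's classic query syntax.
_LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')

# Same call shape as the documentation example, with parameters for the
# index name and the search term.
SEARCH_QUERY = (
    "CALL db.index.fulltext.queryNodes($index, $term) YIELD node, score "
    "RETURN labels(node) AS labels, node.id AS id, score"
)


def fulltext_index_statement(
    name: str, labels: Sequence[str], properties: Sequence[str]
) -> str:
    """Build a CREATE FULLTEXT INDEX statement covering labels x properties.

    The inputs are interpolated as-is, so pass only trusted, hard-coded names.
    """
    label_part = "|".join(labels)
    prop_part = ", ".join(f"n.{p}" for p in properties)
    return f"CREATE FULLTEXT INDEX {name} FOR (n:{label_part}) ON EACH [{prop_part}]"


def escape_lucene(term: str) -> str:
    """Backslash-escape Lucene operators so the term is matched literally."""
    return "".join("\\" + ch if ch in _LUCENE_SPECIAL else ch for ch in term)


# Assumed labels and properties; adjust to what your graph actually contains.
print(fulltext_index_statement("cartographySearch", ["EC2Instance", "S3Bucket"], ["id", "name"]))
print(SEARCH_QUERY)
print(escape_lucene("arn:aws:s3"))
```

The index statement is run once; after that, SEARCH_QUERY can be executed with index set to the index name and term set to the escaped user input. Only nodes with the listed labels and properties are searchable, so this narrows the blind search to an explicit allow-list rather than covering every property in the graph.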