
Blind Search

Open steve-solun opened this issue 2 years ago • 5 comments

Hi @achantavy, I have finally configured cartography and got access to my cloud graph. I see that you create an index per label, mainly on the id fields.

I am trying to query the graph using a blind search, for example:

MATCH (n) WHERE ANY(x IN keys(n) WHERE n[x] =~ '(?i).*{some expression goes here}.*') RETURN n

Meaning that, given a word or expression, I want to go over all nodes and relationships and their properties and, wherever my pattern matches, return the relevant subgraph. Is there a best practice for doing this?
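For reference, a minimal sketch of how the pattern for the query above could be built when the input is a literal word rather than a regular expression. Cypher's `=~` has to match the whole string, so the word is wrapped in `.*` on both sides, and `\Q...\E` quoting keeps characters in the word from being read as regex syntax. The helper name `contains_pattern` and the sample word are hypothetical, and the query still scans every node.

```python
def contains_pattern(word: str) -> str:
    """Return a case-insensitive regex that matches any string containing `word` literally."""
    # \Q...\E quotes the word literally; a literal \E inside the word
    # is split out so it cannot end the quoted section early.
    quoted = "\\Q" + word.replace("\\E", "\\E\\\\E\\Q") + "\\E"
    return "(?i).*" + quoted + ".*"


# Same shape as the blind search above, with the pattern passed as a parameter
# instead of being pasted into the query text.
BLIND_SEARCH = (
    "MATCH (n) "
    "WHERE ANY(k IN keys(n) WHERE n[k] =~ $pattern) "
    "RETURN n"
)

params = {"pattern": contains_pattern("prod-db")}  # "prod-db" is a made-up search word
print(BLIND_SEARCH)
print(params["pattern"])
```

Passing the pattern as `$pattern` keeps user input out of the query text itself.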

steve-solun, Aug 21 '22 15:08

> I have finally configured cartography and got access to my cloud graph. I see that you create an index per label, mainly on the id fields.

Awesome!

> Meaning that, given a word or expression, I want to go over all nodes and relationships and their properties and, wherever my pattern matches, return the relevant subgraph. Is there a best practice for doing this?

I'd recommend against running a query without labels: once the graph is large enough (as few as ~100k nodes), such queries run extremely slowly. For best performance, query with a node label plus an indexed field. If the field is not indexed you'll probably still be okay, unless your graph has millions of nodes and you have a special use case.
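As a rough sketch of the "label + indexed field" shape described above, the helper below builds such a query. The helper name, the `EC2Instance` label, and the `id` property are placeholders for illustration, not a statement about which fields cartography actually indexes.

```python
def labeled_lookup(label: str, prop: str) -> str:
    """Build a Cypher query that matches a single label on a single property."""
    # The label and property key are interpolated into the query text here,
    # so validate them first; the value to match stays a query parameter.
    for name in (label, prop):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return f"MATCH (n:{label}) WHERE n.{prop} = $value RETURN n"


# Hypothetical label and property, chosen only for illustration.
query = labeled_lookup("EC2Instance", "id")
print(query)
```

With a label in the pattern, the planner only has to consider nodes of that label, and it can use an index on the property if one exists.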

achantavy, Aug 25 '22 03:08

I totally understand your point @achantavy, but what if I don't know the property? I am looking for some expression or pattern, and my user doesn't really know which label and property to search. What would be the best practice in that case, from your experience?

steve-solun, Aug 25 '22 08:08

Since your query example is a regular expression, I'm assuming you want to match against strings only.

This could be a use case for Neo4j full-text search (FTS). I have not touched Neo4j FTS in some time, but it looks to be better supported now and easier to set up and use. Take a look at the documentation: https://neo4j.com/docs/cypher-manual/current/indexes-for-full-text-search/

IIRC, the biggest issue I had with Neo4j FTS is that it created a very large index that bloated the database size substantially. I do not know the performance implications of adding or removing nodes that are covered by an FTS index. It could be that a CREATE or DELETE executes more slowly when the affected nodes are part of an FTS index.

trodery, Sep 28 '22 21:09

@trodery thanks for your answer. Did you manage to find a best practice for using it? Can you please share an example?

steve-solun, Oct 03 '22 09:10

@steve-solun I don't currently have a use case for FTS, so I haven't messed with it. I'm only going on my previous experience with Neo4j FTS, which is a few years old at this point, and the current documentation.

The first entry from this bit of documentation should get you going on the indexing part. https://neo4j.com/docs/cypher-manual/current/indexes-for-full-text-search/#administration-indexes-fulltext-search-create-and-configure

Examples from the documentation:

Creating an FTS index named titlesAndDescriptions on the title and description properties of nodes labeled Movie or Book:

CREATE FULLTEXT INDEX titlesAndDescriptions FOR (n:Movie|Book) ON EACH [n.title, n.description]

Querying the FTS index named titlesAndDescriptions for the text "matrix":

CALL db.index.fulltext.queryNodes("titlesAndDescriptions", "matrix") YIELD node, score
RETURN node.title, node.description, score

So you'd do something similar: create one or more indexes covering the cartography labels and properties you are interested in. After that, it looks fairly easy to query across those labels and properties using the query example above.
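A minimal sketch of what that could look like for cartography data, written as Python helpers that only build the Cypher text. The index name `cartographyText`, the labels `EC2Instance` and `S3Bucket`, the properties `id` and `name`, and the helper names are all assumptions for illustration; substitute the labels and string properties you actually want searchable. The statement follows the `CREATE FULLTEXT INDEX` form shown in the documentation example above. The search term is parsed as a Lucene query, so special characters in user input are escaped first.

```python
import re

# Characters with special meaning in Lucene's classic query syntax.
_LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/')
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def escape_lucene(term: str) -> str:
    """Backslash-escape Lucene query syntax characters in user input."""
    return "".join("\\" + ch if ch in _LUCENE_SPECIAL else ch for ch in term)


def create_index_statement(index_name, labels, properties):
    """Build a CREATE FULLTEXT INDEX statement covering several labels and properties."""
    # These names end up in the statement text, so reject anything unusual.
    for name in (index_name, *labels, *properties):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    label_part = "|".join(labels)
    prop_part = ", ".join(f"n.{p}" for p in properties)
    return (
        f"CREATE FULLTEXT INDEX {index_name} IF NOT EXISTS "
        f"FOR (n:{label_part}) ON EACH [{prop_part}]"
    )


# Query text for the index; index name, search term and limit are parameters.
SEARCH_QUERY = (
    "CALL db.index.fulltext.queryNodes($index, $term) "
    "YIELD node, score "
    "RETURN node, score ORDER BY score DESC LIMIT $limit"
)

# Hypothetical index name, labels and properties.
statement = create_index_statement(
    "cartographyText", ["EC2Instance", "S3Bucket"], ["id", "name"]
)
params = {"index": "cartographyText", "term": escape_lucene("prod-db:01"), "limit": 25}
print(statement)
print(SEARCH_QUERY, params)
```

The index statement would be run once, and the search query with `params` per user search. Relationships need their own index and the separate `db.index.fulltext.queryRelationships` procedure, and only string properties are indexed, so this covers the "strings only" case rather than a fully blind search.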

trodery, Oct 03 '22 15:10