nebula
nebula copied to clipboard
feat: fn is_inversed, equivalent to `typeid(e) < 0`
this is handy to check if an edge is scanned from its dst end or not.
What type of PR is this?
- [ ] bug
- [x] feature
- [ ] enhancement
What problem(s) does this PR solve?
Issue(s) number:
Description:
The same edge could be treated as two because we persist one instance of edge in two places, see:
(root@nebula) [basketballplayer]>
MATCH (v1:player{name:"Tim Duncan"})-[e]-(v2:player{name:"Tony Parker"})
RETURN e
+----------------------------------------------------+
| e |
+----------------------------------------------------+
| [:follow "player100"->"player101" @0 {degree: 95}] |
| [:follow "player101"->"player100" @0 {degree: 95}] |
+----------------------------------------------------+
Got 2 rows (time spent 5.819ms/105.497167ms)
With is_inversed, we could detect those edges that are not expected in many cases.
How do you solve it?
Just yet another function returns bool: typeid(e) < 0
.
Special notes for your reviewer, ex. impact of this fix, design document, etc:
Document is needed.
Checklist:
Tests:
- [x] Unit test(positive and negative cases)
- [x] Function test
- [ ] Performance test
- [ ] N/A
Affects:
- [x] Documentation affected (Please add the label if documentation needs to be modified.)
- [ ] Incompatibility (If it breaks the compatibility, please describe it and add the label.)
- [ ] If it's needed to cherry-pick (If cherry-pick to some branches is required, please label the destination version(s).)
- [ ] Performance impacted: Consumes more CPU/Memory
Release notes:
Please confirm whether to be reflected in release notes and how to describe:
ex. Fixed the bug .....
@yixinglu , any chance we could bypass this lint error, please?
.linters/cpp/cpplint.py src/common/function/FunctionManager.cpp 2>&1 | grep error
src/common/function/FunctionManager.cpp:2047: Small and focused functions are preferred: FunctionManager::FunctionManager() has 1521 non-comment lines (error triggered by exceeding 500 lines). [readability/fn_size] [2]
@yixinglu , any chance we could bypass this lint error, please?
.linters/cpp/cpplint.py src/common/function/FunctionManager.cpp 2>&1 | grep error src/common/function/FunctionManager.cpp:2047: Small and focused functions are preferred: FunctionManager::FunctionManager() has 1521 non-comment lines (error triggered by exceeding 500 lines). [readability/fn_size] [2]
Maybe you could bypass it by splitting the function into some small functions, or disabling the 'readability/fn_size' rule in linters/cpp/cpplint.py
The failure now seems to be related to the instability of the test cases themself.
A possible approach is to use both the node id and the edge type id to determine the edge direction. The syntax might look like this:
MATCH (v1:A)-[e]-(v2:B) RETURN v1, is_source_of(v1, e) AS edge_direction, v2
Minor suggestion: It is better to commit the refactoring-related codes to another pr, which is also more review friendly.
IMHO, the edge type id is an implementation level concept and cannot be used to simply indicate query semantics.
A possible approach is to use both the node id and the edge type id to determine the edge direction. The syntax might look like this:
MATCH (v1:A)-[e]-(v2:B) RETURN v1, is_source_of(v1, e) AS edge_direction, v2
Yeah, agreed, mixing thing here is a little bit twisted, the way we treated e
in MATCH (v1:player{name:"Tim Duncan"})-[e]-(v2:player{name:"Tony Parker"}) RETURN e
as two instances, it's kind of implementation exposure twisted with graph semantics, too? So we need such functions as a mitigation on top of that implementation?
is_source_of(v, e)
is better and is a more scrupulous semantic definition, but sometimes we just don't have the v in the pattern to be referred :(.
Maybe we could add both is_source_of(v_or_vid, e) and this function?
Maybe we come out with a name that's more scrupulous than is_inversed?
Minor suggestion: It is better to commit the refactoring-related codes to another pr, which is also more review-friendly.
Agreed! I'll create a separate PR for the refactor to be merged before this change!
is_source_of(v, e)
is better and is a more scrupulous semantic definition, but sometimes we just don't have the v in the pattern to be referred :(. Maybe we could add both is_source_of(v_or_vid, e) and this function? Maybe we come out with a name that's more scrupulous than is_inversed?
I dont think you can judge the edge direction simply by the sign of edge type id.
is_source_of(v, e)
is better and is a more scrupulous semantic definition, but sometimes we just don't have the v in the pattern to be referred :(. Maybe we could add both is_source_of(v_or_vid, e) and this function? Maybe we come out with a name that's more scrupulous than is_inversed?I dont think you can judge the edge direction simply by the sign of edge type id.
This is how I may need this sugar, here
WITH map{`true`: "-[", `false`: "<-["} AS arrow_l,
map{`true`: "]->", `false`: "]-"} AS arrow_r
MATCH (s)-[e:follow*..2]-() WHERE id(s) IN ["player100", "player101"]
WITH id(s) AS subj, [rel in e | [
arrow_l[tostring(typeid(rel) > 0)] +
tostring(rel.degree)+
arrow_r[tostring(typeid(rel) > 0)],
CASE typeid(rel) > 0
WHEN true THEN dst(rel)
WHEN false THEN src(rel)
END
]
] AS rels
WITH
subj,
REDUCE(acc = collect(NULL), l in rels | acc + l) AS flattened_rels
RETURN
subj,
REDUCE(acc = subj,l in flattened_rels|acc + ', ' + l) AS flattened_rels
Where do I need to construct a knowledge sequence from path/edges, without getting the sign of typeid, there were some bad cases.
This is how I may need this sugar, here
WITH map{`true`: "-[", `false`: "<-["} AS arrow_l, map{`true`: "]->", `false`: "]-"} AS arrow_r MATCH (s)-[e:follow*..2]-() WHERE id(s) IN ["player100", "player101"] WITH id(s) AS subj, [rel in e | [ arrow_l[tostring(typeid(rel) > 0)] + tostring(rel.degree)+ arrow_r[tostring(typeid(rel) > 0)], CASE typeid(rel) > 0 WHEN true THEN dst(rel) WHEN false THEN src(rel) END ] ] AS rels WITH subj, REDUCE(acc = collect(NULL), l in rels | acc + l) AS flattened_rels RETURN subj, REDUCE(acc = subj,l in flattened_rels|acc + ', ' + l) AS flattened_rels
Where do I need to construct a knowledge sequence from path/edges, without getting the sign of typeid, there were some bad cases.
How about that(probably be better performance):
MATCH p=(s)-[e:follow*..2]-() WHERE id(s) IN ["player100", "player101"]
RETURN id(s) AS subj, pathToString(p)
std::string pathToString(Path p, std::vector<std::string> propNames={}) {
...
}
This is how I may need this sugar, here
WITH map{`true`: "-[", `false`: "<-["} AS arrow_l, map{`true`: "]->", `false`: "]-"} AS arrow_r MATCH (s)-[e:follow*..2]-() WHERE id(s) IN ["player100", "player101"] WITH id(s) AS subj, [rel in e | [ arrow_l[tostring(typeid(rel) > 0)] + tostring(rel.degree)+ arrow_r[tostring(typeid(rel) > 0)], CASE typeid(rel) > 0 WHEN true THEN dst(rel) WHEN false THEN src(rel) END ] ] AS rels WITH subj, REDUCE(acc = collect(NULL), l in rels | acc + l) AS flattened_rels RETURN subj, REDUCE(acc = subj,l in flattened_rels|acc + ', ' + l) AS flattened_rels
Where do I need to construct a knowledge sequence from path/edges, without getting the sign of typeid, there were some bad cases.
How about that(probably be better performance):
MATCH p=(s)-[e:follow*..2]-() WHERE id(s) IN ["player100", "player101"] RETURN id(s) AS subj, pathToString(p)
std::string pathToString(Path p, std::vector<std::string> propNames={}) { ... }
make sense!
@czpmango I think I need to implement pathToString! But for now, I still consider is_invsersed helpful, it's a self-explanation for our such behavior as again hit by #5779 :-D What do you think?