Gaffer

PageRank implementation

Open t616178 opened this issue 6 years ago • 3 comments

Provide an implementation of the PageRank algorithm using GraphFrames to demonstrate use of the GraphFrames API as a Gaffer operation.
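For readers unfamiliar with the algorithm itself, the following is a minimal sketch of a textbook PageRank power iteration in plain Java. It is not the GraphFrames implementation and does not use the GraphFrames or Gaffer APIs. The class name `PageRankIteration`, the adjacency-array input and the parameter names `resetProbability` and `maxIter` are assumptions made for illustration, and the normalisation (ranks summing to 1) may differ from what a given library returns.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Gaffer or GraphFrames.
class PageRankIteration {

    // outLinks[v] lists the vertices that vertex v links to.
    // Returns one rank per vertex; the ranks sum to 1.
    static double[] pageRank(int[][] outLinks, double resetProbability, int maxIter) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int iter = 0; iter < maxIter; iter++) {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by vertices with no outgoing edges

            for (int v = 0; v < n; v++) {
                if (outLinks[v].length == 0) {
                    dangling += rank[v];
                } else {
                    // Split this vertex's rank evenly across its outgoing edges.
                    double share = rank[v] / outLinks[v].length;
                    for (int target : outLinks[v]) {
                        next[target] += share;
                    }
                }
            }

            for (int v = 0; v < n; v++) {
                // Random jump plus the damped rank received from neighbours;
                // dangling rank is redistributed uniformly.
                next[v] = resetProbability / n
                        + (1.0 - resetProbability) * (next[v] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }
}
```

Calling `PageRankIteration.pageRank(new int[][] {{2}, {2}, {0}}, 0.15, 20)` ranks a three-vertex graph in which vertices 0 and 1 both link to vertex 2, so vertex 2 ends up with a higher rank than vertex 1, which has no inbound edges.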

t616178 avatar Oct 16 '17 15:10 t616178

We should revert to GraphFrames 0.4.0 to see whether the older version of the library resolves the performance issues found in 0.5.0.

t616178 avatar Nov 01 '17 17:11 t616178

We should create a new module within the library module: graph-analytic-library. The PageRank operation should go in there.

The PageRank operation should have a generic input type and a generic output type.

The PageRank operation should not depend on Spark or GraphFrames; instead, an operation handler should be free to implement it in any way it wants. That would make the input and output types unpredictable, so in addition to PageRank we may need a GraphFramesPageRank operation so that users can be sure what the input and output types should be.

Within the spark module we will also need to add a new module: spark-graph-analytic-library. We can then add a GraphFramesPageRank operation there that extends PageRank<GraphFrame, GraphFrame>. The handler for the GraphFrames operation can also go in this module. Perhaps we should implement two handlers: one for PageRank and one for GraphFramesPageRank. The handler could do:

if (!(operation.getInput() instanceof GraphFrame)) {
    // extract a graph frame first.
}

Then a Gaffer system could have multiple versions of PageRank using different technologies all available at the same time via different operations. But, if a user doesn't care how the operation is implemented they could just use the top level PageRank operation.
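The layering described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `GraphFrameStub` is a hypothetical stand-in for the real GraphFrame class, and the `PageRank`, `GraphFramesPageRank` and `GraphFramesPageRankHandler` classes here are simplified placeholders that do not extend Gaffer's actual operation or handler interfaces. No ranking is actually computed; the handler only shows the instanceof dispatch.

```java
// Hypothetical stand-in for a GraphFrame; not the real GraphFrames class.
final class GraphFrameStub {
    final String name;

    GraphFrameStub(String name) {
        this.name = name;
    }
}

// Technology-neutral operation with generic input and output types.
class PageRank<I, O> {
    private final I input;

    PageRank(I input) {
        this.input = input;
    }

    I getInput() {
        return input;
    }
}

// Variant with fixed types, so users know exactly what goes in and comes out.
class GraphFramesPageRank extends PageRank<GraphFrameStub, GraphFrameStub> {
    GraphFramesPageRank(GraphFrameStub input) {
        super(input);
    }
}

// One handler that can serve both operations.
class GraphFramesPageRankHandler {
    GraphFrameStub doOperation(PageRank<?, ?> operation) {
        Object input = operation.getInput();
        GraphFrameStub frame;
        if (!(input instanceof GraphFrameStub)) {
            // Extract a graph frame first (placeholder for a real conversion).
            frame = new GraphFrameStub("extracted:" + input);
        } else {
            frame = (GraphFrameStub) input;
        }
        // Placeholder for running PageRank on the frame.
        return new GraphFrameStub("ranked(" + frame.name + ")");
    }
}
```

With this shape, a caller who wants predictable types builds a `GraphFramesPageRank`, while a caller who does not care about the technology builds a plain `PageRank` and lets the handler convert the input.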

p013570 avatar Dec 05 '17 17:12 p013570

The work for this issue is essentially ready to be merged; the one remaining blocker is a performance problem that still needs to be investigated.

p013570 avatar Jan 31 '18 08:01 p013570