Make remote FeatureGenerators

Open matt-gardner opened this issue 10 years ago • 2 comments

With large graphs (such as Freebase), it can take upwards of 10 minutes just to load the graph from disk and create the graph object. Maybe there are some things I can do in code to make that a bit quicker, but it would still be nice to load the graph only once and reuse a running graph server.

matt-gardner, Jan 05 '16 19:01

As of 1/21/16, the remote graph server actually works, but it's incredibly slow. I did a bunch of work trying to optimize graph loading and the way the graph is stored in memory, so that should help a bit. There is one more optimization I want to try for loading the graph (store a binary file and load that, instead of the GraphChi ASCII version I currently use), but that doesn't help the server any. To make the server idea really feasible, I need to offload more of the computation to the graph, having the graph object compute paths and such. That would be a pretty big refactoring of the code, so I'm not going to do it any time soon.

matt-gardner, Jan 21 '16 21:01
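
For illustration, here is a minimal sketch of the binary-cache idea from the comment above: parse the ASCII edge file once, write the integers out in binary, and read the binary file on later runs. It is written in Python for brevity and is not the project's code. The file layout (one edge per line, three tab-separated integers) and the names `load_ascii_edges`, `save_binary_edges`, `load_binary_edges`, and `load_edges` are all assumptions made for the example.

```python
import os
from array import array


def load_ascii_edges(path):
    """Parse an edge file with three tab-separated integers per line.

    The column layout is an assumption for this sketch, not a claim
    about the actual GraphChi file format.
    """
    edges = array("i")
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            first, second, third = line.split("\t")
            edges.extend((int(first), int(second), int(third)))
    return edges


def save_binary_edges(edges, path):
    """Write the flat integer array to disk in machine-native binary form."""
    with open(path, "wb") as f:
        edges.tofile(f)


def load_binary_edges(path):
    """Read the flat integer array back without any text parsing."""
    edges = array("i")
    count = os.path.getsize(path) // edges.itemsize
    with open(path, "rb") as f:
        edges.fromfile(f, count)
    return edges


def load_edges(ascii_path, binary_path):
    """Use the binary cache when it is at least as new as the ASCII file."""
    if (os.path.exists(binary_path)
            and os.path.getmtime(binary_path) >= os.path.getmtime(ascii_path)):
        return load_binary_edges(binary_path)
    edges = load_ascii_edges(ascii_path)
    save_binary_edges(edges, binary_path)
    return edges
```

The first call to `load_edges` pays the parsing cost and writes the cache; later calls skip parsing. As the comment notes, this only shortens startup and does nothing for per-query cost on a server.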

Actually, to implement what I want to do, I just need to make a FeatureGeneratorServer, instead of a GraphServer. You issue a query about a node pair to the server, it does the graph computation, and responds with a list of features. This is definitely feasible. I just don't have a particular need for it at the moment.

matt-gardner, May 18 '16 20:05
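
The FeatureGeneratorServer idea above could look roughly like the following sketch: the graph is loaded once into a long-running process, a client sends a node pair, and the server does the graph computation and returns a list of features. This is Python using only the standard library, not the project's code. The `Graph` class, `generate_features`, the `?source=...&target=...` query format, the JSON response shape, and the hyphen-joined feature strings are all hypothetical choices made for the example.

```python
import json
from collections import defaultdict
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class Graph:
    """Toy in-memory graph: node -> list of (relation, neighbor)."""

    def __init__(self, triples):
        self.outgoing = defaultdict(list)
        for source, relation, target in triples:
            self.outgoing[source].append((relation, target))


def generate_features(graph, source, target, max_hops=2):
    """Return the relation sequences of paths from source to target.

    Paths of up to max_hops edges are enumerated breadth-first; each
    path that ends at target contributes one feature string.
    """
    features = set()
    frontier = [(source, ())]
    for _ in range(max_hops):
        next_frontier = []
        for node, path in frontier:
            for relation, neighbor in graph.outgoing.get(node, ()):
                new_path = path + (relation,)
                if neighbor == target:
                    features.add("-".join(new_path))
                next_frontier.append((neighbor, new_path))
        frontier = next_frontier
    return sorted(features)


def make_handler(graph):
    """Build a request handler class that closes over the loaded graph."""

    class FeatureHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query)
            try:
                source, target = query["source"][0], query["target"][0]
            except KeyError:
                self.send_error(400, "expected ?source=...&target=...")
                return
            features = generate_features(graph, source, target)
            body = json.dumps({"features": features}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return FeatureHandler


def serve(graph, port=8000):
    """Load once, then answer node-pair queries until interrupted."""
    HTTPServer(("localhost", port), make_handler(graph)).serve_forever()
```

With a graph built from `[("a", "r1", "b"), ("b", "r2", "c"), ("a", "r3", "c")]`, calling `serve(graph)` and requesting `/?source=a&target=c` would return the features for that pair. Keeping the path computation inside the server process means only the small feature list crosses the network, not the graph itself.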