GraphEngine
Trinity.TSL: move [Index] support out of codegen
Currently, the [Index] attribute is interpreted by the TSL codegen, which then generates an inverted index service for each annotated field. To make indexing more flexible, we should remove the index codegen and migrate the indexing capabilities to an ICell-based framework, where indexer modules declare their indexing capabilities and are arranged by the system. One field could even be indexed by multiple indexer modules claiming different capabilities (some good at range searches, some good at full-text search (FTS), etc.).
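To make the proposal concrete, here is a minimal sketch of the capability-claiming idea. It is written in Python for brevity rather than as C# against ICell, and it is not the GraphEngine API. Every name in it (`Capability`, `RangeIndexer`, `TokenIndexer`, `IndexRegistry`, `on_cell_saved`, `resolve`) is hypothetical and only illustrates how one field might be served by several modules.

```python
import bisect
from enum import Flag, auto


class Capability(Flag):
    """Hypothetical capability flags an indexer module can claim."""
    RANGE = auto()
    FULL_TEXT = auto()


class RangeIndexer:
    """Keeps (value, cell_id) pairs sorted so range scans are cheap."""
    capabilities = Capability.RANGE

    def __init__(self):
        self._entries = []  # sorted list of (value, cell_id)

    def index(self, cell_id, value):
        bisect.insort(self._entries, (value, cell_id))

    def query_range(self, low, high):
        # (low,) sorts before every (low, cell_id), so this finds the
        # first entry whose value is >= low.
        start = bisect.bisect_left(self._entries, (low,))
        result = []
        for value, cell_id in self._entries[start:]:
            if value > high:
                break
            result.append(cell_id)
        return result


class TokenIndexer:
    """Whitespace-token inverted index, standing in for a real FTS module."""
    capabilities = Capability.FULL_TEXT

    def __init__(self):
        self._postings = {}  # token -> set of cell ids

    def index(self, cell_id, value):
        for token in str(value).lower().split():
            self._postings.setdefault(token, set()).add(cell_id)

    def query_term(self, term):
        return sorted(self._postings.get(term.lower(), ()))


class IndexRegistry:
    """Maps (cell type, field name) to the modules that index that field."""

    def __init__(self):
        self._by_field = {}

    def register(self, cell_type, field_name, module):
        self._by_field.setdefault((cell_type, field_name), []).append(module)

    def on_cell_saved(self, cell_type, cell_id, fields):
        # Fan each field value out to every module registered for it.
        for field_name, value in fields.items():
            for module in self._by_field.get((cell_type, field_name), []):
                module.index(cell_id, value)

    def resolve(self, cell_type, field_name, needed):
        # Return the modules on this field that claim the needed capability.
        modules = self._by_field.get((cell_type, field_name), [])
        return [m for m in modules if needed in m.capabilities]
```

In this sketch, registering both a `RangeIndexer` and a `TokenIndexer` on the same field lets a query layer ask `resolve(...)` for whichever capability it needs, without the codegen knowing about either module.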
Hey @yatli, does this enhancement manifest itself as a breaking change? I am using indexes on RDF triple [s,p,o] axioms and on object values that reference subject [TBox] and/or predicate [ABox] axioms. I am using "struct" to model and represent object-value resolution vs. object-reference projections where I need to reference the ICell-based framework. Also, having the object reference in a triple-store representation accessible to other indexer modules is great, as differing search and inference strategies can be deployed. What type of metadata, if any, does the GE compute engine generate?
@TaviTruman you can save the generated code and incorporate it into the TSL project (with the NuGet package, we can add custom code to a TSL assembly and access the internal components). I feel this is a better way to approach the index problem, as building the indexer code into the codegen would bring in too many assumptions, which may or may not suit a user's needs.
@yatli Okay that works for me.
@TaviTruman btw we can also adapt the current bigram index into such an "index module". Are you using the raw query interface (feed in a string, get a list of cell ids) or the LINQ interface? The LINQ part could be made more general so that it serves all index modules, and we can then split it out and maintain the bigram index more easily. :)
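For readers unfamiliar with the raw query shape mentioned here (string in, list of cell ids out), the following is a rough Python sketch of a bigram index packaged as a standalone module. It is not Trinity's actual bigram implementation. The class name `BigramIndex` and its methods are hypothetical, and updates and deletions are left out.

```python
class BigramIndex:
    """Substring search over indexed strings using a bigram inverted index."""

    def __init__(self):
        self._postings = {}  # bigram -> set of cell ids containing it
        self._values = {}    # cell id -> lowercased string, kept for verification

    @staticmethod
    def _bigrams(text):
        # All distinct two-character windows of the text.
        return {text[i:i + 2] for i in range(len(text) - 1)}

    def index(self, cell_id, value):
        text = str(value).lower()
        self._values[cell_id] = text
        for gram in self._bigrams(text):
            self._postings.setdefault(gram, set()).add(cell_id)

    def query(self, text):
        """Raw query: feed in a string, get back a sorted list of cell ids."""
        text = text.lower()
        grams = self._bigrams(text)
        if grams:
            # A match must contain every bigram of the query string.
            candidates = set.intersection(
                *(self._postings.get(g, set()) for g in grams)
            )
        else:
            # Queries shorter than two characters have no bigrams,
            # so fall back to checking every indexed value.
            candidates = set(self._values)
        # Sharing all bigrams is necessary but not sufficient, so each
        # candidate is verified with a real substring check.
        return sorted(cid for cid in candidates if text in self._values[cid])
```

A more general query front end (the LINQ part discussed above) would sit on top of modules like this one and is not shown here.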
I'm using the LINQ interface, and yes, you're right regarding the more general use case. I had a thought about that a few weeks ago: it makes sense for reuse, with an expression passed in as the parametric control for indexing. Separating it from the bigram indexing will make room for index-specific optimizations and behavior enhancements, which is a good thing.
Moving this item to Future.