Orleans.Indexing-1.5
Is it possible to extract the functionality into a nuget package?
I was wondering how much of the indexing functionality is in the 'core' Orleans code, and how much could be extracted to a package. It would make this functionality easier to consume if it was an 'optional extra' on top of the core Orleans codebase.
Agreed. Maybe the dependent abstractions could be PR-ed into the main Orleans code?
/cc @mdashti
Was gonna ask for this just now.
I was going to look at it once I finish the new telemetry APIs.
The big challenge now is to understand the code and make it injectable thru facets and not depend on a base class.
@richorama There's only a small amount of change to the Orleans runtime. Almost all of it runs at app level.
@mdashti, can you please summarize these changes to the runtime?
@galvesribeiro, yes, it should be made injectable thru facets. We developed the indexing prototype over a year ago, before the decision was made (last January) to refactor the transaction mechanism using injectable facets.
@philbe Yep, I'll look at it
@richorama, thank you for bringing up this issue. As @ph1ll mentioned, we had this in our mind from the very beginning. These are three clean commits that should be included in the core:
- 72fd56b261bc542fe947c0436e5e915b4522c425
- d4df9507fafdb3cdb9e8cea876059b1ba68c5422
- e4c3479b4c9f8cb438a629611c36dab729504b39
The rest is completely separated into the OrleansIndexing sub-project.
FYI, my GitHub username is @philbe, not @ph1ll. It looks like this got started via an autocomplete error by @galvesribeiro.
Oops, my bad @philbe :)
@Arshia001 @richorama @creyke @galvesribeiro I have started working with a developer to port the indexing system to Orleans 2.0, to replace the base class by facets (i.e., dependency injection), and to obtain consistency between grains and indexes using transactions (in place of or in addition to the current workflow protocol).
@philbe If you are still progressing with Indexing, I can offer some time to help even if it is only writing tests, documentation, sample implementations
@jamiemitchellconsultants CC: @kirkolynyk, who is doing the port to Orleans 2.0. Thanks for the offer, but we're not far enough along for that -- still working on getting the code to run on 2.0. I'll report on the project's status next week.
I'm very interested in this project, let me know if you need any help. @philbe
CC: @TedHartMS (who has taken over responsibility for the port to Orleans 2.0)
@KSemenenko, thanks very much for the offer. @TedHartMS is nearly done porting indexing to Orleans 2.0. After we publish that milestone, we can discuss how best you can help. Also @jamiemitchellconsultants, if you're still interested and available.
Next up, we'll replace the base class (IndexableGrain) by facets and then use transactions to obtain consistency between grains and indexes.
@philbe Yes I am interested in helping.
@philbe any updates?
@TedHartMS has finished porting the implementation to Orleans V2. He's currently doing a final round of tests, after which we'll publish it as a new project on GitHub. (There were too many changes to Orleans to rebase the current Orleans.Indexing on Orleans V2.) Sorry for the delay. It took longer than we expected because Orleans V2 doesn't expose all of the APIs that are needed by Orleans.Indexing, so we've been iterating with the Orleans team to agree on a minimal number of changes that will eventually be needed to the Orleans runtime.
@philbe any updates?
@KSemenenko We have a version that runs on Orleans V2. I expect we'll be publishing it early next week.
@KSemenenko @jamiemitchellconsultants @Arshia001 @richorama @creyke @galvesribeiro @mdashti @
@TedHartMS has published a new version of Orleans.Indexing that works on Orleans V2. It’s a new repo, OrleansV2.Fork.Indexing. Given the amount of churn in Orleans and Orleans.Indexing, it didn’t make sense to go through the effort of rebasing Orleans.Indexing and then merging in the changes.
OrleansV2.Fork.Indexing is a fork of Orleans V2 because it depends on some changes to the Orleans runtime. We think these changes are not controversial and will be accepted as soon as Sergey’s team has time. After they’re in Orleans, we’ll create another repo, OrleansV2.Indexing that uses the Orleans NuGet, and then delete OrleansV2.Fork.Indexing.
There’s still one missing feature to be added: DirectStorageManagedIndex, which uses Document DB (Cosmos DB). @TedHartMS needs to submit a pull request for Orleans.CosmosDB to get it working. Next, we’ll be working on replacing the inheritance-based API (IIndexedGrain) by a facet-based interface, as was done for transactions, and then implementing transactional indexing.
Best news so far this week, thanks!
Thank you @philbe !! it's just amazing news!
Is there any news? when will it be available officially? @philbe
My question as well. I've been using my own (naive) index system based on buckets (also implemented as grains), but I'd love an official solution.
@Arshia001 @KSemenenko I have also been using my own naive index based on buckets implemented as grains... any appetite for sharing code ?
@jamiemitchellconsultants why not... Don't expect too much though, I didn't really put that much thought into the code; I've been hoping for an official release of indexing ever since.
It is a bit of a hatchet job...cutting out application logic... but my index grains are here https://github.com/jamiemitchellconsultants/Index
IndexInterface and IndexImplementation do the work.
Generally for a grain that implements a domain object.... e.g. Customer I have CustomerGrain, CustomerIndex and CustomerManager
There is quite a bit of boilerplate ...mostly generated by T4 templates (not included)
My next step is to serialise Linq so that a client can send a predicate to the grain
I just read through your code. You're splitting the data over many grains, but to get a value you're scanning all buckets. This means that to get a single value back from your index, you have to call all your grains (thus activating them), which essentially means all data is always kept in memory (remember, grains stay in memory after they're activated). Also, by scanning all buckets, you're doing an O(n) search (each bucket performs an O(1) dictionary lookup, but you're limiting the number of entries in each, which means you get n / bucket size operations, which is still O(n)). This may work for a relatively small system, but when the system grows, you'll inevitably find that your indices are getting slower and slower, to the point where they'll crash the entire system. The only real value it provides is horizontally splitting the index data over many silos. An equality search with no further requirements can and should be done in O(1).
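To make the difference concrete, here is a minimal, language-neutral sketch in Python rather than actual Orleans grain code. Everything in it is an assumption made for illustration: the `BucketedIndex` class, its method names, and the `buckets_touched` counter are hypothetical and come from neither repository in this thread. Each dictionary stands in for the state of one bucket grain, and a "touch" stands in for a grain call. The point is that when the bucket is derived from a hash of the key, an equality lookup needs a single bucket, whereas scanning has to visit all of them.

```python
import hashlib
from collections import defaultdict


class BucketedIndex:
    """Toy equality index spread over a fixed number of buckets.

    Hypothetical illustration only: each dict stands in for one bucket
    grain's in-memory state, and buckets_touched counts simulated grain calls.
    """

    def __init__(self, bucket_count: int = 16) -> None:
        self.bucket_count = bucket_count
        self.buckets = [defaultdict(set) for _ in range(bucket_count)]
        self.buckets_touched = 0

    def _bucket_for(self, key: str) -> int:
        # Use a stable hash so the same key always maps to the same bucket,
        # even across processes (Python's built-in hash() is salted per run).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.bucket_count

    def add(self, key: str, grain_id: str) -> None:
        # Non-unique index: one key may map to several grain ids.
        self.buckets[self._bucket_for(key)][key].add(grain_id)

    def lookup_by_scan(self, key: str) -> set:
        # Visits every bucket: the cost grows with the number of buckets.
        result = set()
        for bucket in self.buckets:
            self.buckets_touched += 1
            result |= bucket.get(key, set())
        return result

    def lookup_by_hash(self, key: str) -> set:
        # Visits exactly one bucket, however many buckets exist.
        self.buckets_touched += 1
        bucket = self.buckets[self._bucket_for(key)]
        return set(bucket.get(key, set()))


if __name__ == "__main__":
    index = BucketedIndex(bucket_count=8)
    index.add("alice@example.com", "customer-1")

    index.buckets_touched = 0
    index.lookup_by_scan("alice@example.com")
    print("scan touched", index.buckets_touched, "buckets")

    index.buckets_touched = 0
    index.lookup_by_hash("alice@example.com")
    print("hash touched", index.buckets_touched, "bucket")
```

In a grain-based version, the bucket number would presumably become part of the bucket grain's key, so only that one grain would need to be activated for a lookup. A fixed bucket count is the simplest choice here; growing the index would need some rehashing scheme, which this sketch does not attempt.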
You also don't really need the CustomerIndex, etc. grains. A static class could do the same job, without the overhead of a grain call (there is an overhead associated with grain calls, even if they're local).
I'll get my own code uploaded when I get back from work (that's in ~10 hours), and we can have a look at mine.
@KSemenenko @Arshia001 @jamiemitchellconsultants See my comment here
I know it's been 2 days and ~10 hours, I was having laptop problems... anyway, here's mine. There's unique and non-unique, and there's O(1) buckets.