OrleansDashboard

Display Grain interaction with WebGraphviz

Open KSemenenko opened this issue 2 years ago • 20 comments

I think it would be good to make diagrams of the real interaction between Grains.

And to display it, use a tool like this: http://www.webgraphviz.com

Screenshot 2022-02-11 at 12 56 27

I already have a filter that collects the data, but I can't do the front-end part :( I will prepare a PR.

KSemenenko avatar Feb 11 '22 11:02 KSemenenko

This looks like a great idea. It would be awesome to overlay call counts, exceptions, latency etc..

richorama avatar Feb 11 '22 12:02 richorama

Something like the map you get in Application Insights:

image

richorama avatar Feb 11 '22 12:02 richorama

yes like this

KSemenenko avatar Feb 11 '22 12:02 KSemenenko

This would be amazing

SebastianStehle avatar Feb 11 '22 12:02 SebastianStehle

I made a PR. Following @ReubenBond's advice, I use RequestContext and store the call stack in it. I think there are three parts to the task:

  1. Collect the call history, which I hope I did.
  2. Convert this data to some format for display.
  3. A UI that will display this.
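
As a rough illustration of step 2, here is a minimal sketch that turns collected caller/callee pairs into Graphviz DOT text. `CallEdge` and `DotWriter` are hypothetical names invented for this example, not types from the PR or from OrleansDashboard, and the sketch assumes the collected history can be reduced to (caller, callee, method) triples. It uses records, so it needs C# 9 or later.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical shape for one observed call; not a type from OrleansDashboard.
public sealed record CallEdge(string Caller, string Callee, string Method);

public static class DotWriter
{
    // Renders the distinct edges as a Graphviz digraph, one labelled edge per call.
    public static string ToDot(IEnumerable<CallEdge> edges)
    {
        var sb = new StringBuilder();
        sb.AppendLine("digraph grain_calls {");
        sb.AppendLine("    rankdir=LR;");
        // Records compare by value, so Distinct() drops repeated calls.
        foreach (var e in edges.Distinct())
        {
            sb.AppendLine($"    {Quote(e.Caller)} -> {Quote(e.Callee)} [ label = {Quote(e.Method)} ];");
        }
        sb.AppendLine("}");
        return sb.ToString();
    }

    // Inside a double-quoted DOT ID, only embedded double quotes need escaping.
    private static string Quote(string id) => "\"" + id.Replace("\"", "\\\"") + "\"";
}
```

The resulting string can be pasted into a DOT renderer such as http://www.webgraphviz.com.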

I would be grateful for any comments, as well as advice on how to write the tests properly.

I've seen a lot of test projects, but I don't understand how it all works yet. I will look into it.

KSemenenko avatar Feb 11 '22 15:02 KSemenenko

You can check some results here: http://www.webgraphviz.com

digraph finite_state_machine {
rankdir=LR;
ratio = fill;
node [style=filled];
    IMembershipTable -> IMembershipTable_self [ label = "ReadAll", color="0.650 0.700 0.700" ];
    IMembershipTable -> IMembershipTable_self [ label = "InsertRow", color="0.650 0.700 0.700" ];
    IMembershipTable -> IMembershipTable_self [ label = "UpdateRow", color="0.650 0.700 0.700" ];
    IDeploymentLoadPublisher -> IDeploymentLoadPublisher_self [ label = "UpdateRuntimeStatistics", color="0.650 0.700 0.700" ];
    IClusterTypeManager -> IClusterTypeManager_self [ label = "GetClusterGrainTypeResolver", color="0.650 0.700 0.700" ];
    IClusterTypeManager -> IClusterTypeManager_self [ label = "GetImplicitStreamSubscriberTable", color="0.650 0.700 0.700" ];
    ITestHooks -> ITestHooks_self [ label = "GetApproximateSiloStatuses", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "ExistsAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ISessionGrain [ label = "RegisterAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IWordsListGrain [ label = "CreateListAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "CreateListAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IWordsListGrain [ label = "GetListAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "GetListAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ITrainingGrain [ label = "StartTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "StartTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ITrainingGrain [ label = "UpdateTrainingStatusAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "UpdateTrainingStatusAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "RegisterAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ITrainingGrain [ label = "FinishTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "FinishTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "GetTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ITrainingGrain [ label = "GetTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "CancelTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ITrainingGrain [ label = "CancelTrainingAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "PauseSessionAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "ResumeSessionAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ISessionGrain [ label = "LogoutAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "LogoutAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> ISessionGrain [ label = "LoginAsync", color="0.650 0.700 0.700" ];
    IUserGrain -> IUserGrain_self [ label = "LoginAsync", color="0.650 0.700 0.700" ];
    ISessionGrain -> ISessionGrain_self [ label = "CreateSessionAsync", color="0.650 0.700 0.700" ];
    ISessionGrain -> ISessionGrain_self [ label = "ValidateAsync", color="0.650 0.700 0.700" ];
    ISessionGrain -> ISessionGrain_self [ label = "GetSessionAsync", color="0.650 0.700 0.700" ];
    ISessionGrain -> ISessionGrain_self [ label = "CloseSessionAsync", color="0.650 0.700 0.700" ];
    IWordsListGrain -> IWordsListGrain_self [ label = "CreateListAsync", color="0.650 0.700 0.700" ];
    IWordsListGrain -> IWordsListGrain_self [ label = "GetListAsync", color="0.650 0.700 0.700" ];
    IWordsListGrain -> IWordsListGrain_self [ label = "IncreaseWordListCounterAsync", color="0.650 0.700 0.700" ];
    IWordsListGrain -> IWordsListGrain_self [ label = "IncreaseWordCounterAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> IWordsListGrain [ label = "StartTrainingAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> ITrainingGrain_self [ label = "StartTrainingAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> IWordsListGrain [ label = "UpdateTrainingStatusAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> ITrainingGrain_self [ label = "UpdateTrainingStatusAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> IWordsListGrain [ label = "FinishTrainingAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> ITrainingGrain_self [ label = "FinishTrainingAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> ITrainingGrain_self [ label = "GetTrainingAsync", color="0.650 0.700 0.700" ];
    ITrainingGrain -> ITrainingGrain_self [ label = "CancelTrainingAsync", color="0.650 0.700 0.700" ];

IMembershipTable [color="0.628 0.227 1.000"];
IDeploymentLoadPublisher [color="0.628 0.227 1.000"];
IClusterTypeManager [color="0.628 0.227 1.000"];
ITestHooks [color="0.628 0.227 1.000"];
IUserGrain [color="0.628 0.227 1.000"];
ISessionGrain [color="0.628 0.227 1.000"];
IWordsListGrain [color="0.628 0.227 1.000"];
ITrainingGrain [color="0.628 0.227 1.000"];
}

KSemenenko avatar Feb 11 '22 20:02 KSemenenko

Screenshot 2022-02-11 at 21 59 30

KSemenenko avatar Feb 11 '22 20:02 KSemenenko

Hi, great results so far. But would it not be better to use a graph library where we have more control over the UI behavior and can add things like tooltips and so on?

Also, some numbers would be great: calls per second, error rate, and so on.

SebastianStehle avatar Feb 12 '22 15:02 SebastianStehle

https://github.com/cytoscape/cytoscape.js https://github.com/d3/d3

There are a lot of libraries. Let's find the best one.

KSemenenko avatar Feb 12 '22 16:02 KSemenenko

But I'm not familiar with front-end technologies, so I can't tell which library is best for this.

KSemenenko avatar Feb 12 '22 16:02 KSemenenko

I have worked with this one: https://github.com/visjs/vis-network

SebastianStehle avatar Feb 12 '22 19:02 SebastianStehle

I guess this applies to the dashboard in general, but certainly to this proposed feature: it intermixes very well with OpenTelemetry (or telemetry in general). The open PR builds something unique to Orleans, but the same information can be derived from OpenTelemetry at a much lower overhead cost.

Perhaps, instead of introducing something Orleans-specific, we can consume OpenTelemetry (OT) data and use that in the visualization. I'm not sure yet how that would work; I just wanted to throw it in here. I will look into this further.

koenbeuk avatar Feb 12 '22 21:02 koenbeuk

> The open PR builds something unique to orleans but the same information can be derived from open telemetry with a much lower overhead cost.

Yes, could be. I think OpenTelemetry already has some rate limiters built in to reduce the number of single spans. If you can derive the needed information from these spans, it could work.

SebastianStehle avatar Feb 14 '22 16:02 SebastianStehle

@SebastianStehle can you guide me, please? I moved the logic to GrainProfilerFilter.

But I have a question about name formatting, because now we have two Invoke methods with different contexts. I would also like to discuss the data we will be collecting and its formatting. At the moment you're collecting the Grain type, method and timing.

I need one more parameter that indicates the type of the calling Grain. What's the best way to add this? I've been thinking about it for a couple of days and decided to ask your advice so I don't break the functionality. Along the way, I also thought it would be good to do more than just keep track of all the "queues" of requests: for example, to keep track of what a particular Grain is doing in a particular request, so that we get the whole request path.

For example, we could add some attribute so that we can track a specific request and understand why and where performance degrades.

I see the base class as:

Grain
Method
Timing
Caller Grain (in the case when it is a chain of requests)
Probably some Id of the request chain.

KSemenenko avatar Feb 16 '22 11:02 KSemenenko

What do you think about this method?

void Track(string trackId, double elapsedMs, Type grainType, [CallerMemberName] string methodName = null, bool failed = false);

When a new request starts, we assign a new trackId; if we see that the request chain continues, we reuse the same trackId. This way we can get all the requests in the chain. For example, you start a registration in UserGrain, which creates a session, which also adds a userProfile, and so on. We can then see that all these actions are subordinate to one request, the first one.
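
To make the idea concrete, here is a minimal sketch of an in-memory tracker that keeps calls together by trackId. `CallChainTracker`, `TrackedCall` and `GetChain` are hypothetical names for illustration only, and the dashboard's real storage may look quite different. Only the `Track` parameter list is taken from the signature proposed above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical entry stored for each tracked call.
public sealed record TrackedCall(string TrackId, Type GrainType, string MethodName, double ElapsedMs, bool Failed);

// Hypothetical in-memory tracker; an assumption, not the dashboard's actual storage.
public sealed class CallChainTracker
{
    private readonly ConcurrentDictionary<string, ConcurrentQueue<TrackedCall>> _chains = new();

    // Same parameter list as the proposed Track method.
    public void Track(string trackId, double elapsedMs, Type grainType,
        [CallerMemberName] string methodName = null, bool failed = false)
    {
        // All calls that share a trackId end up in the same queue.
        var chain = _chains.GetOrAdd(trackId, _ => new ConcurrentQueue<TrackedCall>());
        chain.Enqueue(new TrackedCall(trackId, grainType, methodName, elapsedMs, failed));
    }

    // Returns the calls recorded for one chain, in the order they were tracked.
    public IReadOnlyList<TrackedCall> GetChain(string trackId) =>
        _chains.TryGetValue(trackId, out var chain) ? chain.ToArray() : Array.Empty<TrackedCall>();
}
```

One thing to keep in mind: `[CallerMemberName]` resolves to the C# member that calls `Track`, so code calling it from a filter's Invoke method would presumably pass methodName explicitly.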

So it doesn't seem like there will be too many changes.

KSemenenko avatar Feb 17 '22 06:02 KSemenenko

We just have to introduce a second delegate I think.

SebastianStehle avatar Feb 17 '22 07:02 SebastianStehle

To me, it looks like they would do the same thing.

KSemenenko avatar Feb 17 '22 07:02 KSemenenko

Yes, but if you introduce another parameter, it would be a breaking change, wouldn't it?

SebastianStehle avatar Feb 17 '22 08:02 SebastianStehle

Yes, but I just don't know who uses this method other than the filter. If there are more consumers, then of course we have to add a new method, and then separate storage for the records with a trackId.

KSemenenko avatar Feb 17 '22 08:02 KSemenenko

Sorry, I thought you were talking about the formatter delegate. I should have read it more carefully.

I think right now we aggregate the method calls per Grain+Method. If we also need the caller then we have to group by this as well.
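
A minimal sketch of what grouping by caller as well could look like, assuming each sample carries the caller's name. `CallSample`, `CallStats` and `CallAggregator` are hypothetical names for this example and do not reflect the dashboard's existing aggregation code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample shape; the field names are illustrative.
public sealed record CallSample(string Caller, string Grain, string Method, double ElapsedMs, bool Failed);

// Hypothetical aggregate for one (Caller, Grain, Method) combination.
public sealed record CallStats(string Caller, string Grain, string Method, int Count, int FailedCount, double AverageMs);

public static class CallAggregator
{
    // Groups by (Caller, Grain, Method) instead of only (Grain, Method).
    public static IReadOnlyList<CallStats> ByCaller(IEnumerable<CallSample> samples) =>
        samples
            .GroupBy(s => (s.Caller, s.Grain, s.Method))
            .Select(g => new CallStats(
                g.Key.Caller, g.Key.Grain, g.Key.Method,
                g.Count(),
                g.Count(s => s.Failed),
                g.Average(s => s.ElapsedMs)))
            .ToList();
}
```

Each resulting row maps naturally onto one edge of the graph, with the count, failure count and average latency available as overlays.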

SebastianStehle avatar Feb 17 '22 10:02 SebastianStehle