zipkin-sparkstreaming

Investigate vizceral output

Open codefromthecrypt opened this issue 8 years ago • 13 comments

From @adriancole on November 22, 2016 5:43

Vizceral is a streaming aggregation tool similar to the service dependency graph we have now (but prettier and more powerful).

https://github.com/Netflix/vizceral

There have been a few looks into this. I played around with it a bit, toying with a custom version of our dependency graph linker or using jq to translate stored traces. I also thought maybe this could be done online with a custom kafka or elasticsearch 5 pipeline. Or maybe something in-between, like a 1 minute interval hack of our spark job. There are also rumors of a spark alternative to the normal zipkin collector (ahem @mansu ahem :) )

Here is a summary of notes last time this came up, when I chatted with @tramchamploo

So just to play with things, you could use a dependency linker like the zipkin-dependencies job does, except windowed into minutes, not days. Use the zipkin api and GET /api/v1/traces with a timestamp and lookback of your choosing (ex 1 minute). With a custom linker, you can emit vizceral data directly or into a new index for the experiment, like zipkin-vizceral-yyyy-MM-dd-HHmm. In other words, it is like the existing spark job, but writing vizceral format and much more frequently.
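A rough sketch of the query window and index naming described above, using only the Python standard library. It assumes the `endTs` and `lookback` query parameters of `/api/v1/traces` are given in epoch milliseconds; the base URL, the `limit` value and both helper names are made up for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

ONE_MINUTE_MS = 60_000


def traces_url(base_url, end_ts_ms, lookback_ms=ONE_MINUTE_MS, limit=1000):
    """Build a GET /api/v1/traces URL for the window ending at end_ts_ms."""
    # limit=1000 is an arbitrary illustrative value, not a recommendation.
    query = urlencode({"endTs": end_ts_ms, "lookback": lookback_ms, "limit": limit})
    return f"{base_url}/api/v1/traces?{query}"


def vizceral_index(end_ts_ms):
    """Name the experimental index per minute: zipkin-vizceral-yyyy-MM-dd-HHmm (UTC)."""
    minute = datetime.fromtimestamp(end_ts_ms / 1000, tz=timezone.utc)
    return minute.strftime("zipkin-vizceral-%Y-%m-%d-%H%M")
```

Calling both helpers once per minute with the same `end_ts_ms` keeps the fetched window and the index it is written to aligned.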

To dig deeper, you'd want some "partition" or "grouping" command, like a groupBy, in order to group the traces into minutes, placed before this flatMap here: https://github.com/openzipkin/zipkin-dependencies/blob/master/elasticsearch/src/main/java/zipkin/dependencies/elasticsearch/ElasticsearchDependenciesJob.java#L116 This would be the thing that buckets traces into epoch minutes.
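To make the bucketing step concrete, here is a minimal Python sketch (not the Spark job itself). It assumes each trace is a list of span dicts as returned by the v1 JSON api, with `timestamp` in epoch microseconds and no `parentId` on the root span; the function names are hypothetical.

```python
from collections import defaultdict

MICROS_PER_MINUTE = 60 * 1_000_000


def root_span(trace):
    """Return the span without a parentId, or the earliest span if none is found."""
    for span in trace:
        if span.get("parentId") is None:
            return span
    return min(trace, key=lambda s: s.get("timestamp") or 0)


def bucket_by_epoch_minute(traces):
    """Group traces (lists of span dicts) by the epoch minute of their root span."""
    buckets = defaultdict(list)
    for trace in traces:
        if not trace:
            continue
        timestamp = root_span(trace).get("timestamp")
        if timestamp is None:
            continue  # no usable timestamp, so the trace cannot be placed
        buckets[timestamp // MICROS_PER_MINUTE].append(trace)
    return dict(buckets)
```

In the Spark job the same key function would feed a groupBy, with the per-minute groups then handed to the linker.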

In order to get the service relationships, you need to walk the trace tree. To generate the tree you need to merge multiple documents (which constitute a trace), to tell which pieces are a client or server call. This is what the DependencyLinker does.
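A much-simplified Python sketch of that walk follows. It is not the DependencyLinker: it assumes the documents are already merged into one dict per span and that each dict carries a flat, hypothetical `serviceName` field, whereas the real linker derives client and server sides from annotations.

```python
from collections import Counter


def link_services(trace):
    """Count caller -> callee service pairs in one trace (a list of span dicts)."""
    by_id = {span["id"]: span for span in trace}
    links = Counter()
    for span in trace:
        parent = by_id.get(span.get("parentId"))
        if parent is None:
            continue  # root span, or the parent was never reported
        caller, callee = parent["serviceName"], span["serviceName"]
        if caller != callee:  # skip local spans within one service
            links[(caller, callee)] += 1
    return links
```

Because the result is a `Counter`, the links of all traces in a one-minute bucket can be summed with `+` to get the per-minute call counts.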

So basically, by bucketing offline data into 1 minute intervals (based on the root span's timestamp), you can get pretty good feedback. It will be mostly correct, as most traces last less than a minute. By using the api and a variation of our linker, you'd get a good head start which can of course be refactored later if/when a real-time ingestion pipeline exists.

Copied from original issue: openzipkin/zipkin#1416

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

From @mansu on November 22, 2016 6:28

@adriancole We used the open source spark job to get a dependency graph and visualized it with vizceral. It was a very good visualization. But the data quality can be improved. We will try to open source this also.

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

From @devinsba on November 22, 2016 15:22

I'm very interested in this. I was hoping to be able to do some of this using SQS/SNS/Kinesis for reading the data in realtime instead of doing windows from the database

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

From @tramchamploo on December 26, 2016 7:36

Working on integration between vizceral and zipkin. I've added a new api like /vizceral which simply queries all traces in the last few seconds and uses DependencyLinker to link them. I tried to set the limit to Integer.MAX_VALUE in order to retrieve as many traces as possible and get a full dependency graph. But ES won't allow that, throwing

org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: too_many_clauses: maxClauseCount is set to 1024
    at org.apache.lucene.search.BooleanQuery$Builder.add(BooleanQuery.java:137) ~[lucene-core-5.5.2.jar!/:5.5.2 8e5d40b22a3968df065dfc078ef81cbb031f0e4a - sarowe - 2016-06-21 11:38:23]
    at org.elasticsearch.index.query.TermsQueryParser.parse(TermsQueryParser.java:200) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:250) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.index.query.IndexQueryParserService.innerParse(IndexQueryParserService.java:320) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:223) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:218) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.SearchService.parseSource(SearchService.java:856) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:667) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:633) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:377) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:368) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:365) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:33) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:77) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.doRun(MessageChannelHandler.java:293) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-2.4.1.jar!/:2.4.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.8.0_91]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]

Anyone have any ideas?

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

the code @tramchamploo mentions that limits the result count in elasticsearch to 1024 is here https://github.com/openzipkin/zipkin/blob/master/zipkin-storage/elasticsearch/src/main/java/zipkin/storage/elasticsearch/ElasticsearchSpanStore.java#L140
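One possible workaround, offered as a guess rather than anything confirmed in this thread: since the stack trace shows the failure coming from a terms query exceeding `maxClauseCount`, the trace IDs could be split into batches of at most 1024 and fetched with several queries. The sketch below only builds the query bodies; the `traceId` field name and the helper names are assumptions.

```python
MAX_CLAUSES = 1024  # the maxClauseCount value reported in the error above


def batched(trace_ids, size=MAX_CLAUSES):
    """Yield successive slices of trace_ids, each at most `size` long."""
    for start in range(0, len(trace_ids), size):
        yield trace_ids[start:start + size]


def terms_queries(trace_ids, field="traceId"):
    """Build one terms query body per batch so no query exceeds the clause limit."""
    return [{"query": {"terms": {field: batch}}} for batch in batched(trace_ids)]
```

The results of the individual queries would then need to be merged by trace ID before linking.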

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

From @mansu on December 26, 2016 10:42

/cc @naoman

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

From @hvandenb on February 6, 2017 15:39

Has anyone worked on this integration and made it available? There is an example project that shows how this was done with Hystrix. https://github.com/OskarKjellin/vizceral-hystrix

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

@tramchamploo PS: can you check with the latest zipkin (1.20+)? There's a query-limit related change there. (Our tests now look for 1000 dependency links.. maybe I can change that to 1025 :) ) cc @lijunyong

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

@hvandenb so far I've seen nothing open sourced on this topic, rather a few people experimenting on their own.

@tramchamploo @naoman, @mansu do any of you have code to share around vizceral? I know that has been a hot topic and something clearly wanted (even if the impl is imperfect due to limited data in zipkin).

By sharing with others, someone might be able to progress this as it has been stuck for over 3 months now. This could be as simple as a gist or code paste.

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

From @naoman on February 7, 2017 5:20

Sorry for the delay guys. We were busy with open sourcing our (Pinterest) Spark collector for Zipkin and couldn't share much here.

As part of a hackathon, we were able to modify this spark collector to build a service dependency graph on the streaming data and push the graph to Vizceral in real time. I'll clean up the code and share it here, hopefully in the next couple of weeks.

codefromthecrypt avatar Feb 07 '17 08:02 codefromthecrypt

@adriancole , was wondering if there are any updates on this. Seems pretty cool.

vgurikar avatar Sep 01 '17 21:09 vgurikar

This repo is unfortunately lacking engagement or progress. It might end up in the attic if this continues. Probably a more specific project just for vizceral may have more success. For example, you can emit to vizceral directly and it could be more popular to do that.

codefromthecrypt avatar Sep 02 '17 02:09 codefromthecrypt

Very interested in this topic. I guess no progress has been made in this regard. Has anyone shared some code? I would like to try to carry out a POC in the following days. Wondering if it would be better to try it with zipkin-dependencies or zipkin-sparkstreaming. Any ideas on this?

xoanteis avatar Mar 07 '18 15:03 xoanteis

Since I posted it in gitter, I will post it here too:

https://gist.github.com/devinsba/32bf8e1da56a5e368f1d697dfb3b6dd5

This is the prototype I built based on the kinesis collector. It outputs json on its one endpoint in the shape the UI likes
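For readers who have not seen that shape, here is a hedged Python sketch of turning dependency links into a Vizceral-style graph. It is not taken from the gist. The field names (`renderer`, `nodes`, `connections`, `source`, `target`, `metrics` with `normal` and `danger`) follow my understanding of the Vizceral data format and should be checked against its documentation; the input layout and the function name are assumptions.

```python
def to_vizceral(links, name="zipkin", updated_ms=None):
    """Turn {(caller, callee): (calls, errors)} into a Vizceral-style graph dict."""
    services = sorted({service for pair in links for service in pair})
    graph = {
        "renderer": "region",
        "name": name,
        "nodes": [{"name": service} for service in services],
        "connections": [
            {
                "source": caller,
                "target": callee,
                # successful calls are "normal", failed calls are "danger"
                "metrics": {"normal": calls - errors, "danger": errors},
            }
            for (caller, callee), (calls, errors) in sorted(links.items())
        ],
    }
    if updated_ms is not None:
        graph["updated"] = updated_ms
    return graph
```

The returned dict can be passed to `json.dumps` and served from an endpoint that the UI polls.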

devinsba avatar Mar 08 '18 16:03 devinsba