brave icon indicating copy to clipboard operation
brave copied to clipboard

consider datasource-proxy as an alternative or replacement to p6spy

Open codefromthecrypt opened this issue 5 years ago • 27 comments

We currently have JDBC instrumentation via https://github.com/p6spy/p6spy. This has been helpful, but we have also received feedback it is harder to configure, so some don't use it. Given limited resources, we should think through carefully what to continue supporting and not.

There's an alternate library available that we didn't know at the time https://github.com/ttddyy/datasource-proxy It's 1.x codeline is compatibly with JRE 1.6 which means it could be used with all versions of code Brave supports. It also has a 2.x codeline and external adapters in progress, for example, https://github.com/ttddyy/datasource-proxy-r2dbc @ttddyy is active as well in these communities.

I'd like to get feedback from those who've tried either so we can chart a path forward. cc'ing some folks but feedback open to all:

Here are some users: @shakuzen @llinder @hypnoce @ewhauser @NegatioN @Sef135 @regardfs @Logic-32 @jorgheymans @nklmish @dmitry-at-genestack @anuraaga @koji @theopengroup @tangrui @gcnote @whx4J8 @dhwajad @elventear @suls @embs @mmanciop @Jaware

And here are some folks who write alternatives to brave and often give feedback: @felixbarny @tylerbenson @nicmunroe @wu-sheng @ivantopo

codefromthecrypt avatar Nov 08 '18 00:11 codefromthecrypt

Also, while the title includes a decision about whether to also carry-forward p6spy, interest in datasource-proxy is independent.

If you just want to express you want datasource-proxy support, put a thumbs up on the PR description screen shot 2018-11-08 at 9 03 33 am

codefromthecrypt avatar Nov 08 '18 01:11 codefromthecrypt

Thanks @adriancole

I'd be very happy to hear about feedback.

For datasource-proxy 1.x (current release version), it's a bit aged programming style since it's been there for long time. I'm also writing one for r2dbc, and based on that, I'm putting the experience back to 2.x branch(java8 baseline).

Based on the feedback, I'll update datasource-proxy as well :)

ttddyy avatar Nov 08 '18 01:11 ttddyy

I've added a feature request to allow your 2.x codeline to be 1.6 bytecode. We can discuss there, and crazy as it sounds, we still have folks running old runtimes at least Java 1.7 but sometimes even 1.6. https://github.com/ttddyy/datasource-proxy/issues/55

codefromthecrypt avatar Nov 08 '18 01:11 codefromthecrypt

This would be super helpful. Currently we also faced some difficulties with p6spy.

zeagord avatar Nov 08 '18 07:11 zeagord

If we have already known datasource-proxy is better. You should encourage users to choose it. Also, I think JDK 1.6 compatible is important if it is still possible.

wu-sheng avatar Nov 08 '18 12:11 wu-sheng

Just wondering specifically which issues people were having with the p6spy integration ?

jorgheymans avatar Nov 08 '18 14:11 jorgheymans

FYI:

@gavlyukovskiy has a project : spring-boot-data-source-decorator On the bottom of README, there are some picture with tracing.

In that project, there is a datasource-proxy listener implementation that creates spans. TracingQueryExecutionListener.java

ttddyy avatar Nov 08 '18 18:11 ttddyy

More than that I even have version that works with brave - TracingQueryExecutionListener It's more or less the same as brave's implementation, but also creates spans for Connection lifetime and for time between fetching first row and closing ResultSet. I would be happy to move this listener to brave and just leave spring auto-configuration on my side.

gavlyukovskiy avatar Nov 08 '18 19:11 gavlyukovskiy

So there was a question about feedback and things leading to this issue. I won't mention routine glitches and things. I will answer other comments independently.

I'm not sure I can find all the feedback on gitter, but I sensed configuration has been tricky. I placed some of this below.

I think one of the major things that "made me look" was that @gavlyukovskiy implemented datasource-proxy alongside p6spy https://github.com/gavlyukovskiy/spring-boot-data-source-decorator. I noticed the api style differences, where datasource-proxy seemed to be configured with java types moreso than p6spy. I could follow it easier, and this seems more natural in java based DI tools.

  • https://github.com/gavlyukovskiy/spring-boot-data-source-decorator/blob/master/datasource-decorator-spring-boot-autoconfigure/src/main/java/com/github/gavlyukovskiy/boot/jdbc/decorator/p6spy/P6SpyConfiguration.java
  • https://github.com/gavlyukovskiy/spring-boot-data-source-decorator/blob/master/datasource-decorator-spring-boot-autoconfigure/src/main/java/com/github/gavlyukovskiy/boot/jdbc/decorator/dsproxy/ProxyDataSourceBuilderConfigurer.java

This is subjective, and I'm not a user either!, but I have to say that p6spy and also mysql style configuration feels foreign as they tend to nest details in their own configuration such as separate files or embedded strings. It feels nicer to allow frameworks like spring boot or dropwizard to participate directly cc also @jplock

I could dig up more on various gitter channels, but there have been some interesting feedback on p6spy that isn't about routine api change etc. We don't know how well it would apply to datasource-proxy either! I haven't collected all of it, just looked on one channel and grabbed a few.

@jorgheymans Oct 30 2017 20:58

so for my tracing needs i was looking to filter out some of the spans p6spy is generating by configuring it's executionThreshold property. Turns out this is not applied to custom JdbcEventListeners (which Brave uses), bummer. logged p6spy/p6spy#418 in case anyone is interested i'm fully aware that even if they would expose their filter logic, most notably the executionThreshold feature would be not trivial for brave to implement. But at least we would have some options

@hypnoce Sep 11 2017 07:26

Hi @adriancole , I was trying to have multiple remoteServiceName for the p6spy module factory. When no name is defined, it takes the connection catalog. I have many different DB connection for the same service, so either I have a pretty functionally irrelevant db catalog name or the same remote service name for all DBs Do you think it's possible to add a property (zipkinRemoteServiceName) to the url connection ? jdbc:mysql://localhost:5555/mydatabase?user=user& zipkinRemoteServiceName=authenticationDB let me know if you have any thoughts on that. Thanks !

codefromthecrypt avatar Nov 09 '18 00:11 codefromthecrypt

@gavlyukovskiy and others.. one thing I would like to have is a clear path for folks to decorate our default database policy. For example, sometimes people want more data than is cheap to add. We have a lot of sites who want very tight client spans. I feel like datasource-proxy has some layering that would work nicely in spring boot, for example a list of interceptors.

This isn't to say p6spy doesn't, but it is definitely the case I'd like something simple that works for modern things like boot and also older apps like spring 3 XML, and use exactly the same code in both places.

Right now, we usually only have mysql examples as it is simpler to explain than p6spy (no offence, but it would at least always be simpler because of the lack of abstraction).

codefromthecrypt avatar Nov 09 '18 00:11 codefromthecrypt

Regarding above ProxyDataSourceBuilderConfigurer.java, on datasource-proxy 2.x version I'm planning to remove logger specific APIs. (logQueryBySlf4j(), logQueryByJUL(), logQueryByCommons(), logQueryToSysOut())

Instead, allowing user to simply define custom behavior via callback.

ExecutionInfoFormatter formatter = ExecutionInfoFormatter.showAll();  // formatter ,converter
Logger logger = ...  // create own logger

DataSource proxyDs = new ProxyDataSourceBuilder(ds)
  .afterQuery(execInfo -> {
    String log = formatter.format(execInfo);  // convert ExecutionInfo to String
    logger.info(log);  // write out to own logger, sysout, etc
    // you can define more behavior
  })
  .build();

This was one of the my experience from datasource-proxy-r2dbc

Actually, this is already possible in 1.x branch(doc), just formatter(ExecutionInfo -> String) is not there.

ttddyy avatar Nov 09 '18 00:11 ttddyy

It's generally annoying that JDBC has failed to expose any hooks for observability unlike database runtimes in other languages. I tried a number of datasource proxies when I wrote the p6spy integration but I can't really remember whether I tried datasource-proxy or not. In general, both seem like reasonable solutions.

It sounds like the main issues are:

  1. The documentation should be improved (i.e. here's how you can configure this without a properties file)
  2. An example of configuring this in Spring Boot
  3. An example of configuring this with Spring XML

I can certainly address these three things if that's what is needed. Other suggestion would be to have people upon tickets when they have a feature request; I don't watch Gitter so didn't realize there was even any interest in improvements.

I would keep in mind that not everyone uses Spring (we certainly didn't when I wrote this) and some people will want tracing to be enabled on JDBC connections that they don't get to instantiate in their code. The documentation I wrote revolved around configuration via the JDBC string because that is what we needed to do.

All that being said, I'm in no way religious about the approach and if there is something better out there than great.

ewhauser avatar Nov 09 '18 01:11 ewhauser

@ewhauser thanks for piping in, especially highlighting the "no dependency injection" story as I presume that's what you meant by not spring (ex we do sometimes get feedback on dagger by folks or guice via DW). If you meant DI, but not spring, let us know so that we can create the right example. For example, there's a brave-webmvc-example which I plan to do JDBC examples in regardless.

On the other point, I think quite a large number of issues were raised in p6spy and many of them addressed. Well said, though.. some issues could be tracked better also here. I think I was probably too late coming to the realisation that people walking around things by using other tools was symptomatic of either us needing to change our approach, support another tool, or switch tools.

codefromthecrypt avatar Nov 09 '18 01:11 codefromthecrypt

One feedback I would give (ironically from a project that is wordplay on the word dapper :P) is that p6spy is a terrible name. No one knows it has anything to do with databases, and this causes support questions possibly leaving people to look elsewhere. We can't realistically do this experiment, but it would be interesting to know how many would have struggled if the library was named datasource-proxy :P

codefromthecrypt avatar Nov 09 '18 01:11 codefromthecrypt

to pre-empt the question, we use the same names of the libraries in our artifact names, so brave-instrumentation-datasource-proxy or brave-instrumentation-p6spy. We do this because the primary concern is instrumenting the library, that's also where all the classpath "bless" is around etc. I think we considered breaking this convention only for p6spy as it is so hard for people to know what it is.

codefromthecrypt avatar Nov 09 '18 01:11 codefromthecrypt

Well, it was originally called brave-jdbc which was pretty descriptive but a certain someone made me change it =)

ewhauser avatar Nov 09 '18 01:11 ewhauser

brave-jdbc sounds like an awesome name to me simply because it's what we named our custom variant of this problem in-house.

Basically, my two cents on the problem as a whole:

  1. We rolled our own datasource-proxy-like implementation in house because we didn't realize it or p6spy existed at the time
  2. We avoid Spring like the plague but use Guice/HK2/Dagger for DI frameworks (depending on which team you're on)
  3. Instrumenting the DataSource directly, and returning a DataSource is great because they we can use Hikari/Hibernate/plain-ol-JDBC and get instrumentation no matter what

Briefly looking at datasource-proxy, it looks pretty cool. I don't have the time to dig in but our in-house requirements for it to be a viable replacement would be:

  • The ability to create one span for the life of a Connection
  • Observe when multiple queries are executed on the same Connection
  • Get the actual SQL statement being executed
  • Distinguish between a batch execution and a regular one
  • Know when we get the first/last row (at a minimum)

We will probably evaluate datasource-proxy in more detail at a (much) later date but until then, if you feel like it can meet those goals then I'm happy with it :)

Logic-32 avatar Nov 12 '18 21:11 Logic-32

//cc @typekpb @quintonm

felixbarny avatar Nov 13 '18 07:11 felixbarny

FYI I've heard several people want to name something brave-instrumentation-jdbc (not just originally @ewhauser which I admit to resisting). If we do formalize something with such a name we are essentially "picking a winner" in so far as it would almost certainly be incompatible to use whatever that code is with another driver. That's because like mentioned before JDBC has no hooks.

I'm highly resistent because whichever library it is, it will eventually have an incompatible change. Usually we can get around incompatible change by refering to the library version that is now changed. If we say brave-instrumentation-jdbc IOTW, yes, it does make things more visible, but OTOH, it locks us into an awkward different explanation for how to treat this differently from everything else. This will hurt eventually.

That all said, if folks knowing this want this to be the case, I am ok defending questions about it later. If so, upvote the comment with thumbsup or reply with alternate suggestion.

Meanwhile, what is that library to implement brave JDBC is still under debate. We've had a history of folks using p6spy, but not enough advocates for it speaking up yet. We should hear whoever they are, as currently people seem in favor of datasource-proxy, if anything, and that might just be due to who is replying.

codefromthecrypt avatar Nov 13 '18 07:11 codefromthecrypt

I'm a -1 on the name brave-instrumentation-jdbc, if there is going to be a module that is named that I think it should be an abstraction module like brave-instrumentation-http

I do not have skin in the game on the library choice front, but this is my opinion from a code organization point of view

devinsba avatar Nov 13 '18 16:11 devinsba

I won't comment on the library choice as I'm one of the p6spy maintainers. Anyone willing to raise separate issues on p6spy for the problems identified? Regardless on your choice we could implement what our users miss.

On Tue, Nov 13, 2018, 17:07 Brian Devins-Suresh <[email protected] wrote:

I'm a -1 on the name brave-instrumentation-jdbc, if there is going to be a module that is named that I think it should be an abstraction module like brave-instrumentation-http

I do not have skin in the game on the library choice front, but this is my opinion from a code organization point of view

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openzipkin/brave/issues/824#issuecomment-438322260, or mute the thread https://github.com/notifications/unsubscribe-auth/AAegbF2G0oYZrbPSE9uJDUsOoxMb-WeRks5uuu4zgaJpZM4YTyvp .

typekpb avatar Nov 13 '18 16:11 typekpb

If the main problem to to solve on naming is discoverability, I'd probably opt for brave-instrumentation-jdbc-library. Although more verbose, it has the nice qualities of being discoverable and not picking a winner.

ewhauser avatar Nov 13 '18 16:11 ewhauser

If the main problem to to solve on naming is discoverability, I'd probably opt for brave-instrumentation-jdbc-library. Although more verbose, it has the nice qualities of being discoverable and not picking a winner.

I think the winner problem is that this would be implemented by something right? In such a way it would be p6spy or datasource-proxy unless we do what devinsba said which is to make an abstraction like we have for http (which has no dependency except jdbc and is implemented by one of the libraries)

Make sense?

codefromthecrypt avatar Nov 14 '18 00:11 codefromthecrypt

I have possibly a terrible idea to solve the discovery problem.

we make brave-instrumentation-jdbc we publish it to maven central. inside that jar is a copy of a README that says we have no abstraction yet. the current supported tracing things are p6spy and datasource-proxy

no winner choosing. no premature abstractions to haunt us. little maintenance. yes some maven central abuse

codefromthecrypt avatar Nov 14 '18 00:11 codefromthecrypt

PS we will eventually need brave-instrumentation-jdbc just like we currently need brave-instrumentation-servlet. For example, frameworks that layer on servlet apis, like spark and spring mvc need that jar. We will eventually need to create brave-instrumentation-jdbc as a jar even if only for support reasons.

That's why I don't think doing brave-instrumentation-jdbc with a README is long term debt, rather a placeholder until we need something which we will almost certainly need.

codefromthecrypt avatar Nov 14 '18 01:11 codefromthecrypt

As a user of both I would say that p6spy has better api to create listeners and I'd chose p6spy if I need to create custom ones, but configuring it is not an easy part. Though now with JdbcEventListenerFactory you can programmatically register listeners in the same way as you do in datasource-proxy. I would like to see brave supporting both, maybe using common listener interface as I did, and give users a choice. Having brave-instrumentation-jdbc as umbrella for two implementations with explanation how to configure p6spy or datasource-proxy sounds right for me.

gavlyukovskiy avatar Nov 14 '18 02:11 gavlyukovskiy

lovely! https://github.com/ttddyy/datasource-proxy-r2dbc#distributed-tracing

codefromthecrypt avatar Nov 24 '18 11:11 codefromthecrypt