skywalking [Feature] A lightweight and APM-oriented SQL parser module

trafficstars

Search before asking

[X] I had searched in the issues and found no similar feature requirement.

Description

For years, we have supported database monitoring from traces, including metrics analysis and slow SQLs sampling. But still, the metrics are very general.

We can't tell the hot tables and read/write loads like we recently added for visual cache analysis, https://github.com/apache/skywalking/pull/9622

We tried to use Apache ShardingSphere to parse SQL to get the key information such as table names, operation(select/update/insert/delete/DDL), and key conditions, it is possible from the tech perspective, but not fast enough as SkyWalking OAP should handle 1m+ spans/s at least.

Here, after months of thinking, I want to propose that SkyWalking begins to provide a new module to implement SQL parser by ourselves. Like all works we did, there is no intent to compete with ShardingSphere SQL parser but orient APM scenarios.

What does APM-oriented mean?

We should not try to build a complete SQL grammar tree like a SQL database, because generally, we just want to fetch important information from to SQLs, and build read/write metrics, tables in SQLs(but not columns), key conditions(such as table join conditions).

What tech stack should we choose?

Antlr is still our first class choice, as

It is stable, and widely used for years
SkyWalking OAL engine is built on the top of it, we could have a consistent tech stack.

We should only consider building our own grammar tree analysis stack when antlr can't provide enough performance(highly not likely)

About SQL analysis

There are some antlr grammar definitions for SQLs in ShardingSphere and Antlr repositories, and also out there from various blogs. But we should build our own rather than copy/paste from them(not about the license). We need a simplified grammar tree for SQL(SQL92/MySQL/PostgreSQL, etc.), to only fetch important information

Tables
Join conditions
Operation type, DML(select/insert/update/delete) or DDL(a general type, not a performance concern).
Hard code conditions rather than preparedStatement detecting

There is a similar feature from uptrace SQL parser. Their footprint idea is to fetch important information too.

Use case

No response

Related issues

Last time PoC to do SQL parser through ShardingSphere kernel

https://github.com/apache/skywalking/issues/5838
https://github.com/apache/shardingsphere/issues/8208

Are you willing to submit a PR?

[ ] Yes I am willing to submit a PR!

Code of Conduct

[X] I agree to follow this project's Code of Conduct

Sep 22 '22 00:09 wu-sheng

I shared this idea with @Superskyyy and @yswdqz They both think they have interests to do this later.

Let's keep open discussion here, especially about perf testing and SQL key information fetch and build, we need more discussions before providing this officially.

Sep 22 '22 00:09 wu-sheng

About integrating this module into the agent-analyzer and adding new metrics/dashboards, we could wait for this module is ready.

Sep 22 '22 00:09 wu-sheng

@yswdqz As you are starting this, let's try to build a basic anlt4 grammar file for SQLs to get fingerprint information, and test its performance(using JMH) first. Don't go too far about how OAP should work and which metrics could be generated.

Sep 29 '22 00:09 wu-sheng

@yswdqz As you are starting this, let's try to build a basic anlt4 grammar file for SQLs to get fingerprint information, and test its performance(using JMH) first. Don't go too far about how OAP should work and which metrics could be generated.

Got it.

Sep 29 '22 00:09 yswdqz

@yswdqz Any update? Are you back to this?

Jan 10 '23 10:01 wu-sheng

@yswdqz Any update? Are you back to this?

Yes, I have wrote a demo, but I'm not satisfied with the performance, I'm trying to adjust it now.

Jan 10 '23 11:01 yswdqz

Do you have ideas to go further? If you need any help, let me know.

Jan 10 '23 11:01 wu-sheng

Do you have ideas to go further?

I am reading some blogs about how to enhance performance, and I am trying to optimize the rule of parser.

If you need any help, let me know.

I don't know how to locate which is slow, is there any third-party software that can count the run time of each function？

Jan 10 '23 11:01 yswdqz

Yes, there are many Java profiling tool, like JProfiler. It could measure all methods with stack context.

Jan 10 '23 11:01 wu-sheng

Yes, there are many Java profiling tool, like JProfiler. It could measure all methods with stack context.

Got it , I will take a try.

Jan 10 '23 11:01 yswdqz

I have tested many ways, I'm sorry to say that for the current performance, it is difficult to meet APM's parsing needs for sql.

I've thought about optimizing performance by simplifying parser rules, skipping unneeded parts, and using antlr4 actions. But there is no order of magnitude improvement.

This is the performance report of my parser : performance report

This is the design report of sql-sign: sql-sign

This is the repository of my parser: sql-parser Notice: it has 2 modes: with action and with visitor. With action mode can parse most dml statement. But visitor mode I only implement select and delete statement (because of after finish these, I already found that the problem of performance, so I have not completed it)

To test them in APM , we can use test file. It should be put in server-starter module's test folder. And change VirtualDatabaseProcessor#prepareVSIfNecessary as VirtualDatabaseProcessor.java . Then you can run benchmark to test the performance changes before and after code changes.

Finally, as mentioned in the performance report, The main performance problems are in the lexer analysis and parser analysis sections. So it's likely that future performance improvements will be in the parser and lexer. It might be a good idea not to use antlr4 (or other util) to implement the parser, but that would be too much work. In the future I will focus on solutions from other open source frameworks and try to solve this performance problem.

Jan 16 '23 09:01 yswdqz

Thanks for your summary. Let's suspend from here, until we have a better way, which costs a reasonable resources for analyzing this.

Jan 16 '23 09:01 wu-sheng

skywalking skywalking copied to clipboard

[Feature] A lightweight and APM-oriented SQL parser module

Search before asking

Description

What does APM-oriented mean?

What tech stack should we choose?

About SQL analysis

Use case

Related issues

Are you willing to submit a PR?

Code of Conduct

skywalking
skywalking copied to clipboard