cube icon indicating copy to clipboard operation
cube copied to clipboard

SQL API executes slowly with long WHERE clauses

Open w1992wishes opened this issue 3 months ago • 5 comments

Question Excuse me, I'd like to ask a question. Currently, I've observed that if the SQL query statement has a very long WHERE condition, the execution time is very long. The corresponding REST API takes 20ms, but the SQL API takes 1.5s. When I checked the logs, I found that the execution time was mainly spent in the "SQL API Query Planning Success" log, which occurred twice with one of them taking 550ms. Are there any optimization measures for this issue?

SQL

select
	total_amount,
	status
from
	Orders
where
	status in (
	'processing',
	'processing1', 'processing2', 'processing3', 'processing4', 'processing5', 
	'processing6', 'processing7', 'processing8', 'processing9', 'processing10', 
		'processing11', 'processing12', 'processing13', 'processing14', 'processing15', 
	'processing16', 'processing17', 'processing18', 'processing19', 'processing20', 
	'processing21', 'processing22', 'processing23', 'processing24', 'processing25', 
	'processing26', 'processing27', 'processing28', 'processing29', 'processing30', 
		'processing31', 'processing32', 'processing33', 'processing34', 'processing35', 
	'processing36', 'processing37', 'processing38', 'processing39', 'processing40', 
		'processing41', 'processing42', 'processing43', 'processing44', 'processing45', 
	'processing46', 'processing47', 'processing48', 'processing49', 'processing10'
	)

image

w1992wishes avatar Apr 10 '24 10:04 w1992wishes

Hi @w1992wishes 👋

Thanks for reporting this! I would say that this is more or less expected given the implementation of the SQL API. Analyzing SQL queries is quite a non-trivial thing.

Does anything change for you if you set CUBESQL_SQL_PUSH_DOWN to true and switch to the latest version of Cube?

igorlukanin avatar Apr 10 '24 14:04 igorlukanin

Hi @w1992wishes 👋

Thanks for reporting this! I would say that this is more or less expected given the implementation of the SQL API. Analyzing SQL queries is quite a non-trivial thing.

Does anything change for you if you set CUBESQL_SQL_PUSH_DOWN to true and switch to the latest version of Cube?

Thank you. I tried it out locally with version 0.35.10, but didn't see any improvement. When setting CUBESQL_SQL_PUSH_DOWN = false, it's still about 30% faster. Are there any plans for further optimization in the future?

w1992wishes avatar Apr 11 '24 02:04 w1992wishes

There's some work-in-progress on the SQL API performance and it's priority but I also have to admit that it would probably always be slower that the REST API, even a fraction, because there's a complicated SQL query analysis step involved.

igorlukanin avatar Apr 11 '24 11:04 igorlukanin

@igorlukanin 👋 Sorry to bother you again

we're quite concerned about this issue. Could you please inform us when the updates for the SQL API are expected? How much performance improvement can we expect, and will there be updates involving datafusion?

w1992wishes avatar Apr 19 '24 03:04 w1992wishes

@w1992wishes There's no particular timeline that I can share. However, the team is actively working on performance improvements to the SQL API. You can expect some updates in the coming weeks/months.

You can tune in to updates via the following channels:

  • Releases on GitHub (https://github.com/cube-js/cube/releases)
  • #announcements channel on Slack (https://slack.cube.dev)
  • Cube's blog (https://cube.dev/blog), including the Changelog section (https://cube.dev/blog/category/changelog)

How much performance improvement can we expect I'm not sure if I can guarantee any degree of improvement. I believe it's gonna be something that you can observe.

will there be updates involving datafusion? Not sure what do you mean by that, could you please expand?

igorlukanin avatar Apr 20 '24 22:04 igorlukanin