Add Spark SQL support

Open gengliangwang opened this issue 2 years ago • 5 comments
  • Add Spark SQL support. The tool connects to Spark by building a local or remote SparkSession.
  • Include a notebook example
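
For readers unfamiliar with the local/remote distinction, here is a minimal sketch in plain PySpark. It is not the code from this PR: the helper names (`session_mode`, `build_session`, `run_query`) and the app name are hypothetical, and the remote branch assumes PySpark 3.4+ with a Spark Connect endpoint reachable at an `sc://` URL.

```python
from typing import Optional


def session_mode(connect_url: Optional[str]) -> str:
    """Return 'remote' for a Spark Connect URL (sc://...), otherwise 'local'."""
    if connect_url and connect_url.startswith("sc://"):
        return "remote"
    return "local"


def build_session(connect_url: Optional[str] = None):
    """Build a local SparkSession, or a remote one when a Spark Connect URL is given."""
    # Imported lazily so session_mode() stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    if session_mode(connect_url) == "remote":
        # Assumption: builder.remote() needs PySpark 3.4+ with Spark Connect.
        return SparkSession.builder.remote(connect_url).getOrCreate()
    # Hypothetical app name; local[*] uses all available local cores.
    return SparkSession.builder.master("local[*]").appName("spark-sql-demo").getOrCreate()


def run_query(spark, query: str):
    """Run a SQL string on the session and return the rows as plain tuples."""
    return [tuple(row) for row in spark.sql(query).collect()]
```

Calling `build_session()` with no argument would give a local session, and passing a URL such as `sc://localhost:15002` (an assumed endpoint) would give a remote one; either session can then be handed to `run_query` with any Spark SQL string.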

I tried some complicated queries (window functions, table joins), and the tool works well. Unlike the Spark DataFrame agent, this tool can generate queries that span multiple tables.

gengliangwang avatar May 12 '23 22:05 gengliangwang

Note: There was an earlier approach based on SQLDatabase, but @dev2049 suggested not inheriting from SQLDatabase. https://github.com/hwchase17/langchain/pull/4381

gengliangwang avatar May 12 '23 22:05 gengliangwang

@skcoirz Thanks for helping update this one!

gengliangwang avatar May 13 '23 04:05 gengliangwang

@skcoirz Thanks for helping update this one!

Yeah, sure thing! I tested this. The new query checker is really powerful! It resolved the earlier concern about AnalysisException. Thank you so much for adding this! During testing, I noticed a few more opportunities and added them to our spreadsheet. Happy to chat more when you have time! Have a good weekend! :D

skcoirz avatar May 13 '23 04:05 skcoirz

Moved the remaining new features to a new PR on top of this branch. (https://github.com/hwchase17/langchain/pull/4672)

skcoirz avatar May 14 '23 17:05 skcoirz

cc @vowelparrot @hwchase17 could you review this one? The new agent is helpful for the Apache Spark community.

gengliangwang avatar May 15 '23 19:05 gengliangwang

I just did a final check before merging. There is a bug in the memory support, so I reverted it to keep this first version simple and robust. I discussed this with @skcoirz offline, and he will create another PR for general support for Agents. I also verified the change by rerunning the notebook. It works great.

gengliangwang avatar May 18 '23 23:05 gengliangwang

@hwchase17 @dev2049 @skcoirz Thanks for reviewing this!

gengliangwang avatar May 19 '23 05:05 gengliangwang