spark sql connection has query issue

Open cometta opened this issue 2 years ago • 2 comments

Bug 1) On the SQL Lab page, the SEE TABLE SCHEMA select box throws a pop-up error when I press the refresh icon. The schema select box works fine, and the custom query text box also works fine. See screenshot.

image

Bug 2) Settings -> Database Connections -> select the existing Apache Spark SQL connection -> click Edit: the Basic tab is empty, and a pop-up shows: An error occurred while fetching databases: Object of type bytes is not JSON serializable

image

cometta avatar Nov 23 '22 10:11 cometta

Hi @cometta. Which version of the app are you on?

eschutho avatar Nov 23 '22 23:11 eschutho

@eschutho I pulled from the master branch and am running it on Docker.

cometta avatar Nov 24 '22 00:11 cometta

There was an existing issue on this, which I believe has been resolved. Can you pull again from master and test?

eschutho avatar Dec 01 '22 17:12 eschutho

I pulled from master just a few minutes ago (git pull, docker-compose-non-dev.yml pull, docker-compose-non-dev.yml up) and still get the same errors as in the screenshots.

cometta avatar Dec 02 '22 01:12 cometta

I don't have spark sql set up, but I'm unable to reproduce on other databases. Can you confirm by testing this on another database to make sure that it's not just happening on spark? If it's still happening with say postgres or even sqlite, then I would try to delete your docker images and pull from docker latest again instead of pulling from github.

If you find that this is only a spark issue, then I'll set up spark and test that specifically.

eschutho avatar Dec 02 '22 22:12 eschutho

It happens with the Hive connector as well. No issue with other databases. I did delete the Docker volumes.

cometta avatar Dec 02 '22 23:12 cometta

I've pulled the latest master branch; this bug only happens on Spark, which uses the pyhive package.

meyergin avatar Dec 23 '22 12:12 meyergin

It seems json.dumps raises the error because the driver name has the wrong type when using Apache Spark. I modified the code as below, and the response is correct when I click the edit database button.

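```python
# Method: models.core.Database.driver
def driver(self) -> str:
    driver_name = self.url_object.get_driver_name()
    # PyHive can report the driver name as bytes (e.g. b"thrift"),
    # which json.dumps cannot serialize, so decode it to str first.
    if isinstance(driver_name, bytes):
        return driver_name.decode("utf-8")
    return driver_name
```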
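
The same normalization can be tried outside Superset. This is only a minimal sketch: `FakeUrl` is a hypothetical stand-in for the SQLAlchemy URL object, and `normalize_driver_name` is an illustrative name, not a Superset function.

```python
from typing import Union


class FakeUrl:
    """Hypothetical stand-in for the SQLAlchemy URL object (not a real API)."""

    def __init__(self, driver_name: Union[str, bytes]) -> None:
        self._driver_name = driver_name

    def get_driver_name(self) -> Union[str, bytes]:
        return self._driver_name


def normalize_driver_name(url: FakeUrl) -> str:
    """Return the driver name as str, decoding it if it arrives as bytes."""
    name = url.get_driver_name()
    if isinstance(name, bytes):
        return name.decode("utf-8")
    return name


print(normalize_driver_name(FakeUrl(b"thrift")))   # bytes input is decoded
print(normalize_driver_name(FakeUrl("psycopg2")))  # str input passes through
```

Checking the type before decoding leaves drivers that already return a string untouched, so the change should only affect the pyhive case.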

xiaotiao avatar Apr 28 '23 09:04 xiaotiao

I just tested this with the pyhive driver and didn't have any errors:

from sqlalchemy.engine.url import make_url
make_url('hive://').get_driver_name() 

This returns b'thrift'. Is this where the error is happening for you, @xiaotiao?

eschutho avatar Apr 28 '23 22:04 eschutho

Yes, my Superset version is 2.1.0. When I click the edit database button, it reports the exception: Object of type bytes is not JSON serializable. I found that json.dumps cannot serialize bytes data, so I converted the b'thrift' object to a string using the decode() method.
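
The failure can be reproduced with the standard library alone. This is a minimal sketch; the `payload` dict is a made-up example, not Superset's actual response body.

```python
import json

# Hypothetical payload: the driver value is bytes, as reported in this thread.
payload = {"driver": b"thrift"}

try:
    json.dumps(payload)
    error_message = None
except TypeError as exc:
    # json.dumps has no encoding for bytes objects, so it raises TypeError.
    error_message = str(exc)

print(error_message)

# Decoding the bytes value to str makes the payload serializable.
fixed = {"driver": payload["driver"].decode("utf-8")}
serialized = json.dumps(fixed)
print(serialized)
```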

xiaotiao avatar May 14 '23 04:05 xiaotiao

The issue is with the pyhive package installed on your machine. You can resolve it by installing this version of pyhive: pip install git+https://github.com/dropbox/PyHive.git

Udutta52 avatar May 19 '23 20:05 Udutta52

Hopefully, the answer above is sufficient to close this thread. I'll do so anyway since it's been so long since there was a comment here. If this needs revisiting/reopening, just say the word.

rusackas avatar Mar 04 '24 18:03 rusackas