cube icon indicating copy to clipboard operation
cube copied to clipboard

Loading database schema from Snowflake fails in some cases

Open brettwatkins-Built opened this issue 2 years ago • 16 comments

Describe the bug After successfully authenticating to Snowflake and running the initial information schema query, Cube returns 'Undefined' for our tables in the Data Model section. I can see initial information schema query successful runs, but I never get the results back into Cube.

To Reproduce Steps to reproduce the behavior:

  1. Connect to a Snowflake instance
  2. Wait for initial response
  3. See error in Data Model

Expected behavior I would expect to see our Snowflake tables

Screenshots Screenshot 2023-11-20 at 3 13 33 PM

Version: v0.34.23

brettwatkins-Built avatar Nov 20 '23 20:11 brettwatkins-Built

having the same issue

snowflake version: 8.21.1 cube version: v0.35.45

zbakkernomi avatar Jun 06 '24 14:06 zbakkernomi

Do we have a solution or an update on this? its been almost 10 months now and no resposne?

christianallred-nomi avatar Aug 16 '24 18:08 christianallred-nomi

Hi @christianallred-nomi 👋 Are you experiencing the same issue? Which Snowflake and Cube versions do you use?

igorlukanin avatar Aug 19 '24 14:08 igorlukanin

@igorlukanin yes i ran into the same issue.

I was using the off the shelf playground, assuming every tag of cubjs/cube gets pushed to latest it would have been cube: cubejs/cube:latest (0.35.70) <- assuming this. snowflake: 8.31.1

from the cube container

{
  "name": "@cubejs-backend/docker",
  "version": "0.35.66",
  "description": "Cube.js In Docker (virtual package)",
  "author": "Cube Dev, Inc.",
  "license": "Apache-2.0",
  "private": true,
  "engines": {
    "node": "^14.0.0 || ^16.0.0 || >=17.0.0"
  },
  "dependencies": {
    "@cubejs-backend/athena-driver": "^0.35.66",
    "@cubejs-backend/bigquery-driver": "^0.35.66",
    "@cubejs-backend/clickhouse-driver": "^0.35.66",
    "@cubejs-backend/crate-driver": "^0.35.66",
    "@cubejs-backend/databricks-jdbc-driver": "^0.35.66",
    "@cubejs-backend/dbt-schema-extension": "^0.35.66",
    "@cubejs-backend/dremio-driver": "^0.35.66",
    "@cubejs-backend/druid-driver": "^0.35.66",
    "@cubejs-backend/duckdb-driver": "^0.35.66",
    "@cubejs-backend/elasticsearch-driver": "^0.35.64",
    "@cubejs-backend/firebolt-driver": "^0.35.66",
    "@cubejs-backend/hive-driver": "^0.35.63",
    "@cubejs-backend/ksql-driver": "^0.35.66",
    "@cubejs-backend/materialize-driver": "^0.35.66",
    "@cubejs-backend/mongobi-driver": "^0.35.64",
    "@cubejs-backend/mssql-driver": "^0.35.63",
    "@cubejs-backend/mysql-driver": "^0.35.66",
    "@cubejs-backend/oracle-driver": "^0.35.63",
    "@cubejs-backend/postgres-driver": "^0.35.66",
    "@cubejs-backend/prestodb-driver": "^0.35.64",
    "@cubejs-backend/questdb-driver": "^0.35.66",
    "@cubejs-backend/redshift-driver": "^0.35.66",
    "@cubejs-backend/server": "^0.35.66",
    "@cubejs-backend/snowflake-driver": "^0.35.63",
    "@cubejs-backend/sqlite-driver": "^0.35.63",
    "@cubejs-backend/trino-driver": "^0.35.66",
    "cubejs-cli": "^0.35.66",
    "typescript": "~5.2.2"
  },
  "resolutions": {
    "colors": "1.4.0"
  }
}

christianallred-nomi avatar Aug 21 '24 19:08 christianallred-nomi

it looks like you have some type of mapping problem. it looks to be getting data from something but the processing appears to fail? im going to dig in more in a bit

Screenshot 2024-08-21 at 1 51 45 PM

christianallred-nomi avatar Aug 21 '24 19:08 christianallred-nomi

i found it.

this code: https://github.com/cube-js/cube/blob/861f13e4265d95015a2d34ebeccd77a3b61fd1c8/packages/cubejs-snowflake-driver/src/SnowflakeDriver.ts#L820-L829

doesn't do what you're hoping, those as "column_name" and such dont actually lowercase the response objects. snowflake headers are coerced the capitols.

this is the response rows that come out of the this.query() in baseDriver

{
  COLUMN_NAME: 'OFFICE_CBSA',
  TABLE_NAME: 'RX_CLAIM_DENORMALIZED',
  TABLE_SCHEMA: 'ARTEMIS_CHILD',
  DATA_TYPE: 'TEXT'
}

so when you got to reduce them in informationColumnsSchemaReducer

and use these let schema = (result[i.table_schema] || {}); it fails

christianallred-nomi avatar Aug 22 '24 17:08 christianallred-nomi

This code change confirmed it. Not sure if this is an option of the snowflake api or not, but its this code that is failing and returning the undefined's

Screenshot 2024-08-22 at 11 25 27 AM

I've also confirmed this works in the playground but can't share that screenshot because i was testing on our dataset.

christianallred-nomi avatar Aug 22 '24 17:08 christianallred-nomi

i think this was a change that snowflake made recently(ish)

https://docs.snowflake.com/en/release-notes/bcr-bundles/2023_07/bcr-881

it looks like snowflake wants you to set a value to control this in your instance. As i dont think the fix for this should be imposing snowflake settings. probably the right route is to override the informationColumnsSchemaReducer inside of the snowflake driver?

Screenshot 2024-08-22 at 11 42 08 AM

looks like the sortBykeys is a private method inside of the base driver though. not sure you'd want to extract that to a util or just copy paste, given two different packages, probably just copy paste.

this does introduce a coupled dependency though.

could update the base driver func to have an option to use captial identifiers, but that feels gross as well.

I'll let yall decide on your patch, but theres your error.

christianallred-nomi avatar Aug 22 '24 17:08 christianallred-nomi

mgiht be able to set this CLIENT_RESULT_COLUMN_CASE_INSENSITIVE on the session possibly.

christianallred-nomi avatar Aug 22 '24 17:08 christianallred-nomi

If you are interested in working on this issue, please go ahead and provide PR for that. We'd be happy to review it and merge it. If this is the first time you are contributing a Pull Request to Cube, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube Slack.

github-actions[bot] avatar Aug 23 '24 10:08 github-actions[bot]

Wow! Thanks for digging so deep into this @christianallred-nomi 👋

Looks like you have nailed it. Do you have any sense who is exposed to this? I've tried to reproduce this with my Snowflake instance and data, it works smoothly with the latest Cube Core.

igorlukanin avatar Aug 23 '24 10:08 igorlukanin

Hi, thank you for the detailed analysis, it looks like we ran into the same issue.

My five cents: To me it appears like those users are affected who have set the Snowflake account parameter QUOTED_IDENTIFIERS_IGNORE_CASE = True for reasons (among others) documented here.

cyl-e avatar Aug 29 '24 07:08 cyl-e

Hi, thank you for the detailed analysis, it looks like we ran into the same issue.

My five cents: To me it appears like those users are affected who have set the Snowflake account parameter QUOTED_IDENTIFIERS_IGNORE_CASE = True for reasons (among others) documented here.

This is what I identified as well.

However I don’t think that this semantic api should have an opinion on sb infrastructure this should be abstracted out In the connector imp.

It is likely that your sf instance has this value set differently.

christianallred avatar Aug 30 '24 00:08 christianallred

I had a PR to fix this locally but didn’t love the solution. I can look at this again next week.

christianallred avatar Aug 30 '24 00:08 christianallred

I've opened a PR for this here: https://github.com/cube-js/cube/pull/8665

christianallred-nomi avatar Sep 03 '24 18:09 christianallred-nomi

I'm having the same issue, so hoping for a fix too @paveltiunov

mattssll avatar Oct 11 '24 16:10 mattssll

CUBEJS_DB_SNOWFLAKE_QUOTED_IDENTIFIERS_IGNORE_CASE is available since v.1.1.17. Please feel free to check it out.

igorlukanin avatar Jan 27 '25 15:01 igorlukanin