databricks-sql-nodejs
Aliasing columns in a query results in the query's result missing data
- [x] Check existing issues for a duplicate of this bug
Summary
This query returns the expected data:
SELECT carat as a, color as b FROM default.diamonds LIMIT 2;
-- Result
┌─────────┬────────┬─────┐
│ (index) │ a │ b │
├─────────┼────────┼─────┤
│ 0 │ '0.23' │ 'E' │
│ 1 │ '0.21' │ 'E' │
└─────────┴────────┴─────┘
Whereas this query, which aliases both columns to the same name, returns results that are missing data:
SELECT carat as a, color as a FROM default.diamonds LIMIT 2;
-- Result
┌─────────┬─────┐
│ (index) │ a │
├─────────┼─────┤
│ 0 │ 'E' │
│ 1 │ 'E' │
└─────────┴─────┘
Is it possible to handle this scenario properly so we get the right data for such queries?
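One plausible explanation, which is an assumption and not something confirmed in this thread, is that each row is built as a plain JavaScript object keyed by column name, so the second `a` overwrites the first. The sketch below illustrates that effect in isolation; `rowToObject` is a hypothetical helper and not part of the driver.

```javascript
// Hypothetical helper (not the driver's code): builds a row object keyed by column name.
function rowToObject(columnNames, values) {
  const row = {};
  columnNames.forEach((name, i) => {
    // A later column with the same name overwrites the earlier value.
    row[name] = values[i];
  });
  return row;
}

// Distinct aliases keep both values.
const distinct = rowToObject(['a', 'b'], ['0.23', 'E']);
// Duplicate aliases collapse into a single key holding the last value.
const duplicated = rowToObject(['a', 'a'], ['0.23', 'E']);

console.log(distinct);
console.log(duplicated);
```

If the driver does something similar internally, the `carat` value is lost at the point where rows are turned into objects, not in the query itself.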
Reproduction
You'll find a minimal and complete reproduction example that you can run yourself here: https://github.com/varun-dc/databricks-nodejs-duplicate-column-select-bug-reproduction
Hi @varun-dc! Yes, we know about this behavior. It also exists in other connectors, which likewise have no special handling for duplicated columns, so their behavior differs slightly. We're trying to find a good solution for this issue that will work across all connectors, but in the meantime we can only suggest that you avoid duplicated column names in your queries. Sorry for the inconvenience.
P.S. I'll definitely keep this issue open so we can continue discussion here and post updates. Also, if you have any ideas - feel free to share. Thank you!
I've seen this problem before in drivers for other databases and the common solution is the implementation of a rowMode option, for example like in node's pg - https://node-postgres.com/features/queries#row-mode.
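To make the suggestion concrete, here is a minimal sketch of what a row-mode switch could look like. It is modeled loosely on the idea behind pg's `rowMode: 'array'`; `formatRows` and its `rowMode` parameter are hypothetical names for illustration only and are not an existing databricks-sql-nodejs API.

```javascript
// Hypothetical sketch; `formatRows` and `rowMode` are illustrative names,
// not part of databricks-sql-nodejs.
function formatRows(columnNames, rawRows, rowMode = 'object') {
  if (rowMode === 'array') {
    // Keep values positional so duplicate column names cannot collide.
    // Column names are returned separately in `fields`.
    return { fields: columnNames, rows: rawRows.map((values) => [...values]) };
  }
  // Default mode: key each row by column name (duplicate names collapse).
  return {
    fields: columnNames,
    rows: rawRows.map((values) =>
      Object.fromEntries(columnNames.map((name, i) => [name, values[i]]))
    ),
  };
}

const raw = [['0.23', 'E'], ['0.21', 'E']];
const asArrays = formatRows(['a', 'a'], raw, 'array');
const asObjects = formatRows(['a', 'a'], raw);

console.log(asArrays.rows);
console.log(asObjects.rows);
```

With array rows, both `carat` and `color` survive even though they share the alias `a`, and callers can pair each position with its name from `fields`.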
Internal ticket: PECO-970. We will probably start working on this very soon.