feat(ibis): Show full Redshift table metadata by querying system catalogs
The current Redshift metadata query uses information_schema.tables and information_schema.columns, which only shows schemas that are in the user's search_path or owned by the user. This causes visibility issues like https://github.com/Canner/WrenAI/issues/1953
Summary by CodeRabbit
-
Bug Fixes
- More accurate Redshift metadata: corrected nullability and data type reporting; improved table/column comments.
- Excludes system/internal schemas and dropped/system columns; respects namespace privileges.
- Consistent ordering by schema, table, then column; better handling of views alongside tables.
-
Performance
- Faster and more reliable retrieval of Redshift table and column metadata.
Walkthrough
The Redshift metadata query in get_table_list was replaced: it now queries PostgreSQL catalog tables (pg_class/pg_namespace/pg_attribute) and uses format_type/current_database() for types and catalogs, with revised filters, nullability logic, and ordering. External interface and Table/Column construction remain unchanged.
Changes
| Cohort / File(s) | Summary of Changes |
|---|---|
Redshift metadata query rewriteibis-server/app/model/metadata/redshift.py |
Replaced information_schema-based SQL with a catalog-driven query against pg_class/pg_namespace/pg_attribute; selected fields now use current_database()/nspname/relname/attname/format_type; nullability moved to a CASE on attnotnull; filters restrict relkind to 'r' and 'v', exclude system schemas/columns, exclude dropped columns, require USAGE on namespace, and order by schema, table, column position; existing helper transforms and Table/Column construction preserved. |
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Metadata as RedshiftMetadata
participant Conn as SQLConnector
participant Redshift as Redshift
Client->>Metadata: get_table_list()
Metadata->>Conn: execute(catalog-based SQL)
Conn->>Redshift: SELECT from pg_namespace/pg_class/pg_attribute
Redshift-->>Conn: rows (catalog, schema, table, column, type, nullable, comment)
Conn-->>Metadata: result set
Note right of Metadata: Map rows → Table/Column objects<br/>Use helpers: _format_redshift_compact_table_name, _transform_redshift_column_type
Metadata-->>Client: list[Table]
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20–30 minutes
- Pay attention to the new SQL correctness and edge cases:
- type extraction via
format_type(...)and its parsing in_transform_redshift_column_type - nullability logic change from information_schema to
attnotnullCASE expression - privilege check (
has_schema_privilege(nsp.oid, 'USAGE')) and schema exclusions - handling of system/dropped columns and ordering semantics
- type extraction via
Poem
I thump the logs and sniff each name,
From pg_class I learn the game.
Columns, types, and schemas bright—
I hop through catalogs all night.
A carrot cheer for metadata’s light. 🥕
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
âś… Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | âś… Passed | The title accurately describes the main change: querying Redshift system catalogs instead of information_schema to show full table metadata, which directly addresses the PR's core objective. |
✨ Finishing touches
- [ ] 📝 Generate docstrings
đź§Ş Generate unit tests (beta)
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
@douenergy, is this PR tested well?
Oracle testing failures are not related to this PR. I have tested it manually on my local environment.