`dbListObjects()` does not list empty schemata
dbListObjects() does not list empty schemata, i.e. a schema without a table.
library(RPostgres)
con <- postgresDefault()
dbExecute(con, paste0('CREATE SCHEMA "Robert"'))
#> [1] 0
dbListObjects(con)
#> table is_prefix
#> 1 <Id> schema = information_schema TRUE
#> 2 <Id> schema = pg_catalog TRUE
The schema is listed in information_schema.schemata or pg_catalog.pg_namespace (the (empty) public schema is also not listed by dbListObjects() BTW)
dbGetQuery(con, 'SELECT schema_name FROM information_schema.schemata;')
#> schema_name
#> 1 pg_toast
#> 2 pg_catalog
#> 3 public
#> 4 information_schema
#> 5 Robert
dbGetQuery(con, 'SELECT * FROM pg_catalog.pg_namespace;')
#> oid nspname nspowner nspacl
#> 1 99 pg_toast 10 <NA>
#> 2 11 pg_catalog 10 {postgres=UC/postgres,=U/postgres}
#> 3 2200 public 10 {postgres=UC/postgres,=UC/postgres}
#> 4 13111 information_schema 10 {postgres=UC/postgres,=U/postgres}
#> 5 16417 Robert 10 <NA>
The schema is also listed by dbListObjects() after adding a table
table_qid <- dbQuoteIdentifier(con, Id(schema = 'Robert', table = "mtcars"))
dbWriteTable(con, table_qid, mtcars)
dbListObjects(con)
#> table is_prefix
#> 1 <Id> schema = information_schema TRUE
#> 2 <Id> schema = Robert TRUE
#> 3 <Id> schema = pg_catalog TRUE
# clean-up
dbExecute(con, paste0('DROP TABLE ', table_qid))
#> [1] 0
dbExecute(con, paste0('DROP SCHEMA "Robert"'))
#> [1] 0
dbDisconnect(con)
The reason for this is that dbListObjects() uses information_schema.tables to look up tables and schemata, but naturally the information_schema.tables view does not list schemata if they do not contain a table.
Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 4.1.2 (2021-11-01)
#> os Windows 10 x64 (build 19043)
#> system x86_64, mingw32
#> ui RTerm
#> language en
#> collate German_Germany.1252
#> ctype German_Germany.1252
#> tz Europe/Berlin
#> date 2021-12-30
#> pandoc 2.16.2 @ C:/PROGRA~1/Pandoc/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#> package * version date (UTC) lib source
#> backports 1.4.1 2021-12-13 [1] CRAN (R 4.1.2)
#> bit 4.0.4 2020-08-04 [1] CRAN (R 4.1.0)
#> bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.0)
#> blob 1.2.2 2021-07-23 [1] CRAN (R 4.1.0)
#> cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.1)
#> crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.1)
#> DBI 1.1.2 2021-12-20 [1] CRAN (R 4.1.2)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.2)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0)
#> fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.1.2)
#> generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.1)
#> glue 1.6.0 2021-12-17 [1] CRAN (R 4.1.2)
#> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0)
#> hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.1)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.1)
#> knitr 1.37 2021-12-16 [1] CRAN (R 4.1.2)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.1)
#> lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.1.1)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0)
#> pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.1)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.1.0)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.0)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.1.0)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.1.0)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.1)
#> Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0)
#> rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.1)
#> rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.1)
#> RPostgres * 1.4.3 2021-12-20 [1] CRAN (R 4.1.2)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
#> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.0)
#> styler 1.6.2 2021-09-23 [1] CRAN (R 4.1.1)
#> tibble 3.1.6 2021-11-07 [1] CRAN (R 4.1.2)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0)
#> vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0)
#> withr 2.4.3 2021-11-30 [1] CRAN (R 4.1.2)
#> xfun 0.29 2021-12-14 [1] CRAN (R 4.1.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0)
#>
#> [1] C:/Users/Daniel.AK-HAMBURG/Documents/R/win-library/4.1
#> [2] C:/Program Files/R/R-4.1.2/library
#>
#> ------------------------------------------------------------------------------
Thanks. I wonder if it's worth fixing here.
In dm I'm working on an abstraction of INFORMATION_SCHEMA that works across data sources: https://github.com/cynkra/dm/pull/517. I need to sit down and put the bits and pieces together. Feedback welcome.
I wonder if it's worth fixing here.
Agree, the practical relevance is rather limited (though not entirely absent e.g. like in my reprex: creating a schema, then checking whether it's there). Maybe I can tackle this together with #251 / #261.
In dm I'm working on an abstraction of
INFORMATION_SCHEMAthat works across data sources: cynkra/dm#517. I need to sit down and put the bits and pieces together. Feedback welcome.
This certainly looks interesting, but I'd have to dig in deeper to add something meaningful and cannot promise if/when that'll happen.
I'll just state the (now) obvious, that INFORMATION_SCHEMA may not contain engine-specific extensions (e.g. materialized views in postgres, cf. #251), though I am not sure in how relevant this is for what you have in mind there.
Note to self: We need to filter pg_toast* and pg_temp* schemata, (should we want to address this).
Details
From an "empty" DB after running the RPostgres tests (i.e. adding and removing some tables).> dbListObjects(pg_con)
table is_prefix
1 <Id> schema = information_schema TRUE
2 <Id> schema = pg_catalog TRUE
> dbGetQuery(pg_con, 'SELECT * FROM pg_catalog.pg_namespace;')
oid nspname nspowner nspacl
1 99 pg_toast 10 <NA>
2 11 pg_catalog 10 {postgres=UC/postgres,=U/postgres}
3 2200 public 6171 {pg_database_owner=UC/pg_database_owner,=U/pg_database_owner}
4 13186 information_schema 10 {postgres=UC/postgres,=U/postgres}
5 16388 pg_temp_4 10 <NA>
6 16389 pg_toast_temp_4 10 <NA>
7 16395 pg_temp_5 10 <NA>
8 16396 pg_toast_temp_5 10 <NA>
9 18767 pg_temp_10 10 <NA>
10 18768 pg_toast_temp_10 10 <NA>
11 18824 pg_temp_7 10 <NA>
12 18825 pg_toast_temp_7 10 <NA>
13 20812 pg_temp_9 10 <NA>
14 20813 pg_toast_temp_9 10 <NA>
15 27016 pg_temp_3 10 <NA>
16 27017 pg_toast_temp_3 10 <NA>
17 27024 Robert 10 <NA>
> dbGetQuery(pg_con, 'SELECT schema_name FROM information_schema.schemata;')
schema_name
1 public
2 Robert
3 pg_toast_temp_3
4 pg_temp_3
5 pg_toast_temp_9
6 pg_temp_9
7 pg_toast_temp_7
8 pg_temp_7
9 pg_toast_temp_10
10 pg_temp_10
11 pg_toast_temp_5
12 pg_temp_5
13 pg_toast_temp_4
14 pg_temp_4
15 information_schema
16 pg_catalog
17 pg_toast
Then again, we may need to list the pg_temp_* schema(ta?) that contain user-created temporary tables (dbListObjects() does currently does).
We cannot tell from information_schema.schemata which one that is (AFAIK). So we'd probably need to do something like this:
SELECT schema_name AS schema, NULL::varchar AS table
FROM information_schema.schemata
WHERE schema_name NOT LIKE 'pg_toast%' AND schema_name NOT LIKE 'pg_temp%'
UNION
SELECT DISTINCT table_schema AS schema, NULL::varchar AS table
FROM (SELECT table_schema, table_name FROM information_schema.tables) as schema_query
When using the system catalogs, we can get the temp schema with also with:
SELECT nspname
FROM pg_namespace
WHERE oid = pg_my_temp_schema();
The question remains, whether returning the temp schema is really necessary/useful, see https://dba.stackexchange.com/a/76516.