DBI::dbConnect fails on second call to dbConnect() with file is being used by another process
What happens?
When attempting to reconnect to a duckdb in the same R session, the connect results in a failure reporting 'The process cannot access the fiel because it is being used by another process.'
To Reproduce
The R script:
con <- DBI::dbConnect(duckdb::duckdb(), file.path("test.duckdb"))
DBI::dbDisconnect(con)
con <- DBI::dbConnect(duckdb::duckdb(), file.path("test.duckdb"))
DBI::dbDisconnect(con)
The output:
Warning message:
package ‘duckdb’ was built under R version 4.1.3
> con <- DBI::dbConnect(duckdb::duckdb(), file.path("test.duckdb"))
> DBI::dbDisconnect(con)
> con <- DBI::dbConnect(duckdb::duckdb(), file.path("test.duckdb"))
Error: duckdb_startup_R: Failed to open database: IO Error: Cannot open file "test.duckdb": The process cannot access the file because it is being used by another process.
> DBI::dbDisconnect(con)
Warning message:
Connection already closed.
Environment (please complete the following information):
- OS: [Windows]
- DuckDB Version: [0.3.2-2]
- DuckDB Client: [R]
What I'm trying to accomplish
We have a test suite where there's a method that will initialize a database instance (using a temp file) and then hand off the call to the test methods (which will start a new connection to the database after the initialize step closed their connection). The above script is the simplest case I could show where it does a connect, disconnect and then fails to connect again.
Add argument shutdown=TRUE, i.e. use DBI::dbDisconnect(con, shutdown=TRUE)
Thank you. 2 questions:
- How would you pass the additional parameter
shutdown=TRUEto dbDisconnect() when usingwithr:
conn <- withr::local_db_connection(duckdb::dbConnect(duckdb::duckdb(), dbFile)) ;
In case you're not familiar the withr::local_db_connection will take the connection returned from the first param and close it after the local context ends (such as the function it was created in).
- Under what circumstances would you not use shutdown (or shutdown=FALSE)? It you can't connect to the db again if you don't use
shutdown=T, then why isn't that the default behavior?
Answering my first question, you use defer:
withr::defer(DBI::dbDisconnect(conn, shutdown=TRUE))
This will close the connection with the additional param shutdown=TRUE at the end of the invoking function.
Yes, defer() is a good option if on.exit() within a function is not enough.
For the second question, I agree that it would be nice to have shutdown=TRUE as default. I haven't checked the internals for the reason of current behavior that carefully, but it seems that getting things to work with DBI-interface in R has been challenging (#323) and shutdown must be done for the driver as disconnect only closes the connection.
For the second question, I agree that it would be nice to have
shutdown=TRUEas default. I haven't checked the internals for the reason of current behavior that carefully, but it seems that getting things to work with DBI-interface in R has been challenging (#323) and shutdown must be done for the driver as disconnect only closes the connection.
+1 for making shutdown=TRUE the default in dbDisconnect as it seems like the most common case and the expected behavior of dbDisconnect(con). Thank you very much for implementing DBI in R. What has been challenging about getting DBI to work? Do you need to run some shutdown code after the last connection is closed? I think reg.finalizer() can be used for that.
Has there been any more discussion on making shutdown=TRUE default behavior? What this would do is make the code DBI::dbDisconnect(con) work the same for all DBI connections including duckdb.
I propose to auto-shutdown by default (i.e., even with shutdown = FALSE), but only if it's safe to shut down.
From https://github.com/duckdb/duckdb/pull/5525#issuecomment-1469450431:
This is an example that shows how we could detect if the duckdb instance behind the connection is safe to shut down. The "connection" in this example is just a flag that indicates if the "duckdb instance" has been bound to another symbol when this was called.
@hannes: Should we follow this route?
cpp11::cpp_source(
code = '#include "cpp11/list.hpp"
[[cpp11::register]]
SEXP connect(SEXP x) {
auto shared = MAYBE_SHARED(x);
SEXP out = Rf_allocVector(LGLSXP, 1);
LOGICAL(out)[0] = shared;
return out;
}
'
)
duckdb <- function() {
list()
}
disconnect <- function(con) {
if (con) {
message("May be shared, can't shut down")
} else {
message("Not shared, safe to disconnect")
}
}
con1 <- connect(duckdb())
disconnect(con1)
#> Not shared, safe to disconnect
drv1 <- duckdb()
con2 <- connect(drv1)
disconnect(con2)
#> May be shared, can't shut down
con2 <- connect(drv2 <- duckdb())
disconnect(con2)
#> May be shared, can't shut down
Created on 2023-03-15 with reprex v2.0.2
@krlmlr I think this would be very nice for users. I see this message a lot. :) Thank you!!
Turns out this is a tad more difficult than anticipated. We need a different way to detect sharedness here:
cpp11::cpp_source(
code = '#include "cpp11/list.hpp"
[[cpp11::register]]
SEXP is_shared(SEXP x) {
auto shared = MAYBE_SHARED(x);
return Rf_ScalarLogical(shared);
}
'
)
duckdb <- function() {
list()
}
disconnect <- function(con) {
if (con) {
message("May be shared, can't shut down")
} else {
message("Not shared, safe to disconnect")
}
}
connect <- function(drv) {
is_shared(drv)
}
con1 <- connect(duckdb())
disconnect(con1)
#> May be shared, can't shut down
Created on 2023-07-12 with reprex v2.0.2
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue was closed because it has been stale for 30 days with no activity.