duckdb-rs icon indicating copy to clipboard operation
duckdb-rs copied to clipboard

Issue with Mutable Borrow of Statement when Accessing Column Names (Option::unwrap())

Open ravi-nallappan opened this issue 1 year ago • 6 comments

I encountered a problem similar to a previously closed issue (#204). However, in my use case, I need to execute arbitrary select-SQL and generate a vector of maps, where each map represents a row with column name/value pairs.

Code Snippet:

use duckdb::types::Value as DuckValue;
use duckdb::AccessMode as DuckAccessMode;
use duckdb::Config as DuckConfig;
use duckdb::Connection as DuckConnection;
use duckdb::Error as DuckError;
use duckdb::Row as DuckRow;
use duckdb::Rows as DuckRows;
use fallible_iterator::FallibleIterator;

    fn query_duck(&self, sql: &str) -> Result<Vec<IndexMap<String, String>>> {
        let mut statement = self.duck_connection.prepare(sql)?;
        let names = statement
            .column_names()
            .into_iter()
            .map(String::from)
            .collect_vec();
        let value_to_text = |v: DuckValue| match v {
            DuckValue::Null => String::from("null"),
            DuckValue::BigInt(i) => i.to_string(),
            DuckValue::TinyInt(i) => i.to_string(),
            DuckValue::Double(d) => d.to_string(),
            DuckValue::Text(t) => t,
            DuckValue::Blob(v) => v.into_iter().map(|u| u.to_string()).join(","),
            _ => "".to_string(),
        };
        let row_to_vectuple = |r: &DuckRow| -> Result<Vec<(String, String)>, DuckError> {
            names
                .iter()
                .map(String::clone)
                .map(|n| r.get::<_, DuckValue>(n.as_str()).and_then(|v| Ok((n, v))))
                .map_ok(|(n, v)| (n, value_to_text(v)))
                .collect::<Result<Vec<_>, _>>()
        };
        let vectuple_to_map = |r: Vec<(String, String)>| -> IndexMap<String, String> {
            r.into_iter().collect::<IndexMap<_, _>>()
        };
        let rows: DuckRows<'_> = statement.query([])?;
        let result: Vec<Vec<(String, String)>> = rows.map(row_to_vectuple).collect()?;
        let result = result.into_iter().map(vectuple_to_map).collect_vec();
        Ok(result)
    }

Problem Description: The issue arises because the statement is mutably borrowed during the time when column names need to be accessed.

Environment:

  • DuckDB version: { version = "0.9.2", features = ["json"] }
  • Rust version: 1.71.1 (eb26296b5 2023-08-03)
  • OS: Ubuntu 22.04

Could you suggest a workaround for this issue? Additionally, is this behavior intentional for reasons I might not be aware of?

Thank you for your assistance and for maintaining this project.

ravi-nallappan avatar Jan 05 '24 19:01 ravi-nallappan

I ran into this exact issue. There's a catch-22 if you don't know the number of columns in advance. You can't get the column count without executing the query, and you can't execute the query without knowing the column count (because you don't know which columns to get).

Timmmm avatar Feb 09 '24 09:02 Timmmm

@ravi-nallappan @Timmmm I bimped into same issue when I was doing some tinkering...I managed to find a solition by asking on reddit https://www.reddit.com/r/rust/s/UqXiz5BNaR

The trick is to call row.as_ref().column_names() once you matched single value with something like match row.next()... prior to getting column names. Hope it helps!

nikola-maric avatar Feb 18 '24 21:02 nikola-maric

That worked for me too. This is an issue with the duckdb api because doing the same in rusqlite works fine:

    let mut stmt = conn.prepare("SELECT id, name, data FROM person")?;
    let cols: Vec<String> = stmt.column_names().iter().map(|&s| s.to_string()).collect();
    println!("cols: {:?}", cols);
    let mut rows = stmt.query([])?;

danthegoodman1 avatar May 19 '24 15:05 danthegoodman1

I did a deep dive into this while working on #333 as I ran into a similar issue. The main problem is that all functions that depend on RawStatement::schema panic when the underlying schema is None, and the schema is Some if it has been executed. There is actually a (hacky) way around this if you're willing to use the DuckDB C bindings, as you can call duckdb_prepare_arrow_schema and retrieve a schema for that prepared statement, but it's very incomplete (only gives you the amount of the columns, not the names or datatypes. I think it's an issue with the upstream C bindings). It should suffice for that use case though!

abhiaagarwal avatar Jun 07 '24 20:06 abhiaagarwal

FWIW you can access the rows later in the row iterator:

let mut rows = stmt.query([]).unwrap();
    while let Some(row) = rows.next().unwrap() {
        let cols = row.as_ref().column_names();
        // ...

danthegoodman1 avatar Jun 08 '24 00:06 danthegoodman1

Hi team, is there update on this issue? I checked the latest codes on main branch, it will still fail when I try to get any schema related information before I execute the statement. and I just can't execute the stmt first because of the immutable and mutable references on the stmt exist in the same block

pengye91 avatar Jul 23 '24 02:07 pengye91