ruby-duckdb

DuckDB::Column duckdb_logical_column_type

Open suketa opened this issue 1 year ago • 9 comments

Implement the duckdb_column_logical_type C API.

suketa avatar Jun 23 '24 10:06 suketa

@suketa May I try implementing this step by step, if you don't plan to implement it soon?

otegami avatar Dec 04 '24 09:12 otegami

@otegami

Thank you!!! Yes, please.

suketa avatar Dec 06 '24 20:12 suketa

@suketa

Thank you so much. I'd like to know what kind of API you are expecting for retrieving a column’s logical type. Could you share your thoughts with me?

I'm expecting DuckDB::Column#logical_type to return a human-readable string representation of the logical type. If my expectation is wrong, could you tell me? 🙏🏾

For example:

# For a decimal column with precision and scale
# decimal_col: DECIMAL(18, 3)
decimal_col.logical_type # => "DECIMAL(18,3)"

# For a list column
# int_list_col: INT[]
int_list_col.logical_type # => "INT[]"

# For a struct column with multiple fields
# struct_col: STRUCT(word VARCHAR, length INTEGER)
struct_col.logical_type # => "STRUCT(word VARCHAR, length INTEGER)"

# For a map column
# map_col: MAP(INTEGER, VARCHAR)
map_col.logical_type # => "MAP(INTEGER, VARCHAR)"

otegami avatar Dec 09 '24 09:12 otegami

@otegami

I have not yet decided what kind of API would be sufficient or best, but I think the Ruby interface should be a thin wrapper around duckdb_column_logical_type.

The definition of duckdb_column_logical_type is as follows:

DUCKDB_API duckdb_logical_type duckdb_column_logical_type(duckdb_result *result, idx_t col);

So the Ruby interface would be

duckdb_result.column_logical_type(column_index) #=> DuckDB::LogicalType Ruby object wrapping the duckdb_logical_type C struct.

or

decimal_col.logical_type #=> DuckDB::LogicalType Ruby object wrapping duckdb_logical_type C struct.

and we could write something like the following:

decimal_ltype = decimal_col.logical_type
decimal_ltype.alias #=> String, via the `duckdb_logical_type_get_alias` C API.
decimal_ltype.decimal_width #=> Integer, via the `duckdb_decimal_width` C API.
decimal_ltype.decimal_scale #=> Integer, via the `duckdb_decimal_scale` C API.

enum_ltype = enum_col.logical_type
enum_ltype.enum_dictionary_size #=> Integer, via the `duckdb_enum_dictionary_size` C API.

...etc.

And to_string (or something similar; I have not decided on the method name yet) would return a string representing the column type.

decimal_col.logical_type.to_string # => "DECIMAL(18,3)"
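
A minimal sketch of how such a string could be assembled from the structured accessors, written in plain Ruby with no DuckDB dependency. `RenderableType` and its fields are hypothetical stand-ins for illustration, not the gem's DuckDB::LogicalType class, and the exact output format is an assumption modelled on the examples earlier in this thread.

```ruby
# Hypothetical stand-in for a logical type; not part of ruby-duckdb.
RenderableType = Struct.new(:name, :width, :scale, :child, :key, :value, :fields,
                            keyword_init: true) do
  # Render the type recursively, so nested types print their children too.
  def to_s
    case name
    when :decimal
      "DECIMAL(#{width},#{scale})"
    when :list
      "#{child}[]"
    when :map
      "MAP(#{key}, #{value})"
    when :struct
      rendered = fields.map { |field_name, type| "#{field_name} #{type}" }
      "STRUCT(#{rendered.join(', ')})"
    else
      # Primitive types render as their upper-cased name.
      name.to_s.upcase
    end
  end
end

integer = RenderableType.new(name: :integer)
varchar = RenderableType.new(name: :varchar)

decimal = RenderableType.new(name: :decimal, width: 18, scale: 3)
int_list = RenderableType.new(name: :list, child: integer)
map = RenderableType.new(name: :map, key: integer, value: varchar)
struct = RenderableType.new(name: :struct, fields: { word: varchar, length: integer })

puts decimal.to_s  # DECIMAL(18,3)
puts int_list.to_s # INTEGER[]
puts map.to_s      # MAP(INTEGER, VARCHAR)
puts struct.to_s   # STRUCT(word VARCHAR, length INTEGER)
```

Keeping the string rendering on top of the structured accessors means callers who need the width or the child type never have to parse the string back.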

suketa avatar Dec 13 '24 23:12 suketa

memo:

Should DuckDB::LogicalType.new(1) return a DuckDB::LogicalType object representing the boolean type, by using duckdb_create_logical_type?
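
For context, the 1 in this memo is the value of DUCKDB_TYPE_BOOLEAN in the C API's duckdb_type enum. A rough sketch of what a constructor taking a type id might look like, in plain Ruby: `PrimitiveTypeSketch` and `TYPE_IDS` are hypothetical names, the table covers only the first few enum values, and a real binding would call duckdb_create_logical_type instead of a hash lookup.

```ruby
# Subset of the duckdb_type enum values from duckdb.h (illustrative, not complete).
TYPE_IDS = {
  0 => :invalid,
  1 => :boolean,
  2 => :tinyint,
  3 => :smallint,
  4 => :integer,
  5 => :bigint
}.freeze

# Hypothetical stand-in; the real class would wrap a duckdb_logical_type.
class PrimitiveTypeSketch
  attr_reader :type_id

  def initialize(type_id)
    # Reject ids outside the known table instead of building a broken object.
    raise ArgumentError, "unknown type id: #{type_id}" unless TYPE_IDS.key?(type_id)

    @type_id = type_id
  end

  # Symbolic name of the type, looked up from the id.
  def type
    TYPE_IDS.fetch(type_id)
  end
end

boolean = PrimitiveTypeSketch.new(1)
puts boolean.type # boolean
```

Accepting a symbol such as :boolean alongside the raw integer would probably read better at call sites, but that is a design choice this memo leaves open.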

suketa avatar Dec 13 '24 23:12 suketa

Thank you for sharing your idea. I now understand the concept behind implementing duckdb_column_logical_type. I’ll proceed step by step with the approach you suggested:

decimal_col.logical_type #=> DuckDB::LogicalType Ruby object wrapping duckdb_logical_type C struct.

otegami avatar Dec 15 '24 00:12 otegami

I would like to proceed with implementing the DuckDB::LogicalType wrapper step by step as outlined below:

  • [x] Implement the DuckDB::LogicalType Class
  • For Each Logical Type, Perform the Following:
    • e.g. Decimal Type:
      • [x] Implement the interface functions:
        • uint8_t duckdb_decimal_width(duckdb_logical_type type);
        • uint8_t duckdb_decimal_scale(duckdb_logical_type type);
      • [x] Ensure DuckDB::Column#logical_type returns an instance of DuckDB::LogicalType for Decimal type.
  • Extend the Implementation to Other Data Types:
    • [x] List
    • [x] Array
    • [x] Map
    • [x] Union
    • [x] Struct
    • [x] Enum

Please let me know if this approach aligns with your expectations or if there are any adjustments you would recommend :pray:

otegami avatar Dec 15 '24 07:12 otegami

@otegami

I would like to proceed with implementing the DuckDB::LogicalType wrapper step by step as outlined below:

Thank you, go ahead :+1:

suketa avatar Dec 15 '24 09:12 suketa

Here is the current status. ref: https://duckdb.org/docs/stable/clients/c/api.html#logical-type-interface

  • [ ] duckdb_logical_type duckdb_create_logical_type(duckdb_type type);
  • [x] char *duckdb_logical_type_get_alias(duckdb_logical_type type);
  • [x] void duckdb_logical_type_set_alias(duckdb_logical_type type, const char *alias);
  • [ ] duckdb_logical_type duckdb_create_array_type(duckdb_logical_type type, idx_t array_size);
  • [ ] duckdb_logical_type duckdb_create_list_type(duckdb_logical_type type);
  • [ ] duckdb_logical_type duckdb_create_map_type(duckdb_logical_type key_type, duckdb_logical_type value_type);
  • [ ] duckdb_logical_type duckdb_create_union_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
  • [ ] duckdb_logical_type duckdb_create_struct_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
  • [ ] duckdb_logical_type duckdb_create_enum_type(const char **member_names, idx_t member_count);
  • [ ] duckdb_logical_type duckdb_create_decimal_type(uint8_t width, uint8_t scale);
  • [x] duckdb_type duckdb_get_type_id(duckdb_logical_type type);
  • [x] uint8_t duckdb_decimal_width(duckdb_logical_type type);
  • [x] uint8_t duckdb_decimal_scale(duckdb_logical_type type);
  • [x] duckdb_type duckdb_decimal_internal_type(duckdb_logical_type type);
  • [x] duckdb_type duckdb_enum_internal_type(duckdb_logical_type type);
  • [x] uint32_t duckdb_enum_dictionary_size(duckdb_logical_type type);
  • [x] char *duckdb_enum_dictionary_value(duckdb_logical_type type, idx_t index);
  • [x] duckdb_logical_type duckdb_list_type_child_type(duckdb_logical_type type);
  • [x] duckdb_logical_type duckdb_array_type_child_type(duckdb_logical_type type);
  • [x] idx_t duckdb_array_type_array_size(duckdb_logical_type type);
  • [x] duckdb_logical_type duckdb_map_type_key_type(duckdb_logical_type type);
  • [x] duckdb_logical_type duckdb_map_type_value_type(duckdb_logical_type type);
  • [x] idx_t duckdb_struct_type_child_count(duckdb_logical_type type);
  • [x] char *duckdb_struct_type_child_name(duckdb_logical_type type, idx_t index);
  • [x] duckdb_logical_type duckdb_struct_type_child_type(duckdb_logical_type type, idx_t index);
  • [x] idx_t duckdb_union_type_member_count(duckdb_logical_type type);
  • [x] char *duckdb_union_type_member_name(duckdb_logical_type type, idx_t index);
  • [x] duckdb_logical_type duckdb_union_type_member_type(duckdb_logical_type type, idx_t index);
  • [ ] void duckdb_destroy_logical_type(duckdb_logical_type *type);
  • [ ] duckdb_state duckdb_register_logical_type(duckdb_connection con, duckdb_logical_type type, duckdb_create_type_info info);
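
Several of the functions above come as a count plus index-based accessors (for example duckdb_struct_type_child_count with duckdb_struct_type_child_name and duckdb_struct_type_child_type). One way a Ruby wrapper could expose that pattern is as an enumerator, sketched below in plain Ruby. `StructTypeSketch` and its method names are assumptions for illustration and may not match the gem's actual API; the in-memory array stands in for the C calls.

```ruby
# Hypothetical stand-in for a struct logical type; not part of ruby-duckdb.
class StructTypeSketch
  # children: array of [name, type] pairs, standing in for the C struct.
  def initialize(children)
    @children = children
  end

  # Would map to duckdb_struct_type_child_count.
  def child_count
    @children.size
  end

  # Would map to duckdb_struct_type_child_name.
  def child_name_at(index)
    @children.fetch(index).first
  end

  # Would map to duckdb_struct_type_child_type.
  def child_type_at(index)
    @children.fetch(index).last
  end

  # Combine count and indexed accessors into a Ruby-style iterator.
  def each_child
    return enum_for(:each_child) unless block_given?

    child_count.times { |i| yield child_name_at(i), child_type_at(i) }
  end
end

struct = StructTypeSketch.new([["word", :varchar], ["length", :integer]])
struct.each_child { |name, type| puts "#{name}: #{type}" }
```

The same shape would apply to union members and enum dictionary values, which follow the count-plus-index convention as well.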

otegami avatar Apr 08 '25 10:04 otegami

@suketa Could you share your thoughts on implementing bindings for the duckdb_create_* functions?
I don’t think we need to wrap them right now, since all logical types currently come straight from the database metadata or from query results.
We can always revisit this and add those wrappers later if we find client-side use cases for custom types.
What do you think?

otegami avatar May 07 '25 10:05 otegami

@otegami

Thank you for your suggestion. I agree with you.

Could you share your thoughts on implementing bindings for the duckdb_create_* functions?

I'm not sure, but I think that with those bindings we could write Ruby code like the following.

duckdb_logical_type = DuckDB::LogicalType.create(DuckDB::LogicalType::List)

# convert Ruby array to list type of duckdb data.
duckdb_value = DuckDB::Value.create([1,2,3], duckdb_logical_type)

prepared_statement.bind_value(duckdb_value)

I'll create a new issue and close this one.

suketa avatar May 09 '25 21:05 suketa

I've created #940 and am closing this issue.

suketa avatar May 09 '25 21:05 suketa

Thank you so much for sharing your idea! It looks nice to me, too. I was thinking of implementing a class for each logical type, as follows. But returning a DuckDB::LogicalType instance from them is a bit tricky, so I got stuck implementing it...

duckdb_logical_type = DuckDB::LogicalType::List.new([1,2,3])
#=> Returns a DuckDB::LogicalType instance
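
One possible way around the "tricky" part, sketched in plain Ruby under stated assumptions: make `List` a namespaced module with its own `new` method rather than a subclass, so the call returns an instance of the base class. `TypeSketch` is a hypothetical stand-in for DuckDB::LogicalType, and the factory takes a child type rather than values, which is a guess at the intent.

```ruby
# Hypothetical stand-in for DuckDB::LogicalType; not part of ruby-duckdb.
class TypeSketch
  attr_reader :type, :child_type

  def initialize(type, child_type = nil)
    @type = type
    @child_type = child_type
  end

  # A module, not a subclass: `List.new` is an ordinary module method,
  # so it is free to return an instance of the enclosing class.
  module List
    def self.new(child_type)
      TypeSketch.new(:list, child_type)
    end
  end
end

int_list = TypeSketch::List.new(TypeSketch.new(:integer))
puts int_list.instance_of?(TypeSketch) # true
puts int_list.child_type.type          # integer
```

The trade-off is that `TypeSketch::List` is not a class, so `is_a?(TypeSketch::List)` checks will not work; callers would test `type == :list` instead.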

otegami avatar May 19 '25 09:05 otegami