
Create a scalar from array of type Map

dadepo opened this issue 2 years ago • 4 comments

Is your feature request related to a problem or challenge?

I am not sure if this is a bug, a feature request, or if things are working as they should.

How can I fix an error of this kind?

Background: I have a UDF that mimics json_build_object from Postgres. This UDF returns a MapArray as its return value.
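
For illustration, the registration looks roughly like this (a simplified sketch, not my exact code; the Map type mirrors the one in the error message below, and the function body is stubbed out):

use std::sync::Arc;
use datafusion::arrow::datatypes::{DataType, Field, Fields};
use datafusion::logical_expr::{
    ReturnTypeFunction, ScalarFunctionImplementation, ScalarUDF, Signature, Volatility,
};

// The Map(Utf8 -> Utf8) type the UDF declares as its return type.
fn map_return_type() -> DataType {
    let entries = Field::new(
        "entries",
        DataType::Struct(Fields::from(vec![
            Field::new("keys", DataType::Utf8, false),
            Field::new("values", DataType::Utf8, true),
        ])),
        false,
    );
    DataType::Map(Arc::new(entries), false)
}

fn json_build_object_udf() -> ScalarUDF {
    let return_type: ReturnTypeFunction =
        Arc::new(move |_| Ok(Arc::new(map_return_type())));
    // Stub only: the real UDF builds a MapArray from the key/value arguments.
    let fun: ScalarFunctionImplementation =
        Arc::new(|_args| todo!("build a MapArray from the arguments"));
    ScalarUDF::new(
        "json_build_object",
        &Signature::variadic(vec![DataType::Utf8], Volatility::Immutable),
        &return_type,
        &fun,
    )
}

// Registered with ctx.register_udf(json_build_object_udf());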

With a table like

let df = ctx.sql(r#"
select * from test"#).await?;
df.show().await?;

+-----------+-----------+-----------+
| clientid  | name      | parentid  |
+-----------+-----------+-----------+
| c-string1 | n-string1 | p-string1 |
| c-string2 | n-string2 | p-string2 |
+-----------+-----------+-----------+

I can build a JSON object like this just fine:

        let df = ctx.sql(r#"
        select json_build_object('name', name, 'type', 'test')
        from test"#).await?;
        df.show().await?;

+---------------------------------------------------------------------+
| json_build_object(Utf8("name"),test.name,Utf8("type"),Utf8("test")) |
+---------------------------------------------------------------------+
| {name: n-string1, type: test}                                       |
| {name: n-string2, type: test}                                       |
+---------------------------------------------------------------------+

But if I remove name (or any other column that exists in the table), I get the error:

        let df = ctx.sql(r#"
        select json_build_object('type', 'test')
        from test"#).await?;
        df.show().await?;

Error: This feature is not implemented: Can't create a scalar from array of type "Map(Field { name: "entries", data_type: Struct([Field { name: "keys", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "values", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false)"
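
As far as I can tell, when every argument is a literal, DataFusion constant-folds the call and tries to convert the single-row result array back into a ScalarValue, which has no Map variant. The same error can be reproduced directly (a sketch, assuming this is indeed the conversion involved):

use std::sync::Arc;
use datafusion::arrow::array::{ArrayRef, MapBuilder, StringBuilder};
use datafusion::scalar::ScalarValue;

fn main() {
    // Build a single-row MapArray equivalent to {type: test}.
    let mut builder = MapBuilder::new(None, StringBuilder::new(), StringBuilder::new());
    builder.keys().append_value("type");
    builder.values().append_value("test");
    builder.append(true).unwrap();
    let map_array: ArrayRef = Arc::new(builder.finish());

    // With no ScalarValue::Map variant, this returns the
    // "Can't create a scalar from array of type Map" error above.
    let result = ScalarValue::try_from_array(&map_array, 0);
    assert!(result.is_err());
}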

A simple workaround would be to always include a column from the table in the query, but unfortunately I cannot guarantee this, especially for use cases where json_build_object works on computed values or on values created via other UDFs.

How can I make my UDF not depend on the presence of a column? For example, I can use the make_array function in exactly this way, without including a column from the table:

        let df = ctx.sql(r#"
        select make_array('type', 'test')
        from test"#).await?;
        df.show().await?;

+--------------------------------------+
| makearray(Utf8("type"),Utf8("test")) |
+--------------------------------------+
| [type, test]                         |
| [type, test]                         |
+--------------------------------------+

Describe the solution you'd like

The ability to have a UDF that does not depend on the presence of a column and that generates as many values in its output column as the resulting record batch needs.

Describe alternatives you've considered

If I change the return type to Utf8 instead of Map, that is:

let json_build_object_return_type: ReturnTypeFunction = Arc::new(move |_| Ok(Arc::new(DataType::Utf8)));

and I return a StringArray from the UDF, I get the behaviour I am looking for. But this means the JSON structure returned from the UDF gets formatted into strings, which makes processing more difficult down the line, as I end up with values like this:

let value = r#"\"{\\\"key_one\\\":\\\"val_one\\\"}\""#;
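
For reference, the body of that Utf8 variant looks roughly like this (a sketch; the use of serde_json and the num_rows parameter are illustrative, not my actual code):

use std::sync::Arc;
use datafusion::arrow::array::StringArray;
use datafusion::physical_plan::ColumnarValue;
use serde_json::json;

// Serialize one JSON object per output row; num_rows would come from the
// input arrays in the real UDF (hypothetical here).
fn utf8_rows(num_rows: usize) -> ColumnarValue {
    let array = StringArray::from_iter_values(
        (0..num_rows).map(|_| json!({ "key_one": "val_one" }).to_string()),
    );
    ColumnarValue::Array(Arc::new(array))
}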

Additional context

No response

dadepo avatar May 30 '23 12:05 dadepo

Maybe related to #11128? It seems similar to the make_map UDF, and make_map can't support the usage in this issue. Trying to support it like make_array.

Rachelint avatar Jul 12 '24 14:07 Rachelint

take

Rachelint avatar Jul 13 '24 21:07 Rachelint

@Rachelint are you still working on this? If not, I would like to take it.

devanbenz avatar Aug 25 '24 17:08 devanbenz

> @Rachelint are you still working on this? If not, I would like to take it.

No, I am not working on it, but it seems similar to #11128, and I have fixed #11128.

If I have misunderstood it, feel free to pick this up.

Rachelint avatar Aug 25 '24 17:08 Rachelint