Deepcast empty table to empty list

Open anjou-low opened this issue 7 months ago • 8 comments

Hi, I've come across the need to deepcast a nested Lua table that might contain empty tables, which should be interpreted as either empty lists or empty maps. There does not seem to be a way to make that distinction yet, since the deepcast heuristic does not work on empty tables.

For example I'd like to be able to write something along the lines of

{_, env} = Lua.new() |> Lua.eval!("my_object = {my_empty_list = empty_list(), my_empty_map = {}}")
table = Lua.get!(env, [:my_object])
Lua.decode!(env, table) |> Lua.Table.deep_cast()
# %{"my_empty_list" => [], "my_empty_map" => %{}}

One option I see would be to inject a special value via the call to empty_list that gets pattern matched to an empty list in deep_cast but maybe there is a better way?

Thanks !

anjou-low avatar May 19 '25 07:05 anjou-low

The issue here is that there isn't a concept of a list in Lua, it's just a table with integer keys starting at 1, by convention.
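To illustrate the ambiguity, here is a minimal sketch that assumes decoded tables arrive as lists of `{key, value}` tuples (the shape the `deep_cast` snippets in this thread match on). `CastHeuristicSketch` and `classify/1` are hypothetical names, not part of the library:

```elixir
defmodule CastHeuristicSketch do
  # Hypothetical helper: guesses the shape of a decoded table
  # (a list of {key, value} tuples) from its first key.

  # A first key of 1 suggests a Lua "array" by convention.
  def classify([{1, _value} | _rest]), do: :list_like

  # Any other non-empty table is treated as a map.
  def classify([_entry | _rest]), do: :map_like

  # An empty table has no keys to inspect, so the guess is impossible.
  def classify([]), do: :ambiguous
end

CastHeuristicSketch.classify([{1, "a"}, {2, "b"}])
CastHeuristicSketch.classify([{"name", "x"}])
CastHeuristicSketch.classify([])
```

The last call is the case this issue is about: with no keys, nothing distinguishes an empty list from an empty map.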

The only option here would be to allow you to choose the encoding based on the key.

WDYT?

davydog187 avatar May 19 '25 13:05 davydog187

Thanks for your answer and sorry for the delay.

Perhaps I should add some context about why I need this: the goal is to have user-provided scripts that can specify a JSON object via their returned table (maybe the better option is to serialize to a string directly in Lua and deserialize it back in Elixir, but I'd like to see if there is an alternative).

So if I understand your idea correctly, the user would have to provide their own list of which keys in the table need to be encoded as empty lists, say

return {foo = {}, bar = {}, buzz = 42, empty_lists = {"foo", "buzz"}}

This might be difficult to maintain, especially for nested tables.

After some discussions, one option that came up from "Allow differentiation of arrays and objects for proper empty-object serialization" (https://github.com/mpx/lua-cjson/issues/11) was to store on the metatable the fact that a table needs to be treated as an empty list in Elixir. I am not sure if this is possible, let alone a good idea, as this would probably move things deep inside Luerl?

What do you think about the possibility to specify a static value to deep_cast that gets interpreted as an empty list in the pattern matching, something like

  def deep_cast(value, opts \\ []) do
    empty_list_token = Keyword.get(opts, :empty_list_token, nil)

    case value do
      [{1, _val} | _rest] = list ->
        Enum.map(list, fn
          {_, v} when is_list(v) ->
            # Pass opts down so nested tables see the same token.
            deep_cast(v, opts)

          {_, v} ->
            if not is_nil(empty_list_token) and v == empty_list_token do
              []
            else
              v
            end
        end)

      map ->
        Map.new(map, fn
          {k, v} when is_list(v) ->
            {k, deep_cast(v, opts)}

          {k, v} ->
            if not is_nil(empty_list_token) and v == empty_list_token do
              {k, []}
            else
              {k, v}
            end
        end)
    end
  end

anjou-low avatar May 21 '25 16:05 anjou-low

@anjou-low given that empty_list_token could be any Elixir term, I don't see how this would work, since decoded Lua tables will only have encodable values in them (and thus no atoms, only nil allowed).

Another option is that we could have the caller of Lua.Table.deep_cast/1 specify whether to treat empty tables as lists or maps. The obvious issue here is that it would treat all empty tables the same way, regardless of which each one is meant to be.

I'm not clear on your use case, but you could also get clever with the new Lua.set_private/3 API, storing references to tables and having an internal data structure that decides how to cast each table based on its reference. I wouldn't want to officially support that, but depending on your use case, it could be possible.
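As a rough sketch of that caller-level switch, assuming decoded tables are lists of `{key, value}` tuples: the module `UniformEmptySketch` and the `:empty_as` option are hypothetical names chosen for illustration, not the library's API.

```elixir
defmodule UniformEmptySketch do
  # Hypothetical stand-in for a deep cast. The :empty_as option is an
  # assumed name; it picks one representation for every empty table.
  def deep_cast(value, opts \\ []) do
    empty = if Keyword.get(opts, :empty_as, :map) == :list, do: [], else: %{}
    cast(value, empty)
  end

  # Empty table: use whatever the caller chose, everywhere.
  defp cast([], empty), do: empty

  # First key is 1: treat as a list and drop the integer keys.
  defp cast([{1, _} | _] = table, empty) do
    Enum.map(table, fn {_index, v} -> cast(v, empty) end)
  end

  # Any other non-empty table: treat as a map.
  defp cast([_ | _] = table, empty) do
    Map.new(table, fn {k, v} -> {k, cast(v, empty)} end)
  end

  # Scalars (numbers, binaries, booleans, nil) pass through unchanged.
  defp cast(value, _empty), do: value
end

decoded = [{"foo", []}, {"bar", [{"baz", []}]}]
UniformEmptySketch.deep_cast(decoded, empty_as: :list)
# => %{"foo" => [], "bar" => %{"baz" => []}}
```

The output shows the limitation: both empty tables become lists, with no way to cast one as a list and the other as a map.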

> After some discussions, one option that came up from https://github.com/mpx/lua-cjson/issues/11 was to store the fact that a table needs to be treated as an empty list in Elixir on the metatable. I am not sure if this is possible, let alone a good idea as this would probably move things deep inside luerl?

I find it unlikely that this would be accepted into Luerl, but you could ask in the Slack / Discord.

davydog187 avatar May 21 '25 17:05 davydog187

@anjou-low ping :)

davydog187 avatar Jun 03 '25 17:06 davydog187

Hi, sorry... So we decided to serialize the table to JSON directly in Lua with a library that supports differentiating between empty objects and empty arrays, i.e. {nil} will be interpreted as [].

Maybe I should clarify what we are doing for the sake of completeness, but it is ok for me if you close the issue. The goal is to allow users to define webhooks with custom logic in Lua; their scripts should output a JSON response body, and sometimes the distinction between returning {} or [] is important depending on the service calling the webhook.

Thanks for your help !

anjou-low avatar Jun 04 '25 05:06 anjou-low

Thanks @anjou-low

I opened https://github.com/tv-labs/lua/pull/84, does this help?

davydog187 avatar Jun 04 '25 11:06 davydog187

I think the problem with this approach in our situation is that it requires the writer of the script to maintain the empty_encoder, i.e., they have to provide us with a way to construct the function to use when we deep_cast their table, since we don't know the structure of their data in advance.

This is not something we want to do as this would make scripts too hard to maintain for the end users, especially considering nested keys, i.e. differentiating between ["a"] and ["b", "a"] in the table

{a = {}, b = {a = {}}}

While this could be achieved by passing a list of keys to the empty_encoder, it would quickly make it hard to deal with.

We will stick with doing the JSON serialization on the Lua side for now. I suppose there could be a use for the empty_encoder option in deep_cast, but we don't have one yet, and if we are the only ones who might need it, I don't want to be responsible for any extra work this might cause you :stuck_out_tongue:

So feel free to close the issue as is and again thanks for your help!

anjou-low avatar Jun 04 '25 19:06 anjou-low

@anjou-low I think this is the wrong way to think about the problem. I agree that your users shouldn't have to worry about serialization; it is instead up to your application to decide how to address this problem, whether through schemas, a convention, or some heuristic.

I will move forward with the ability for users to provide a function for deciding how to encode empty values, and leave it there.
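One way such a function could look, as a sketch only: the module `EmptyEncoderSketch`, its `deep_cast/2`, and the choice to pass the key path to the callback are all assumptions made for illustration, and may not match what the library ships. Decoded tables are assumed to be lists of `{key, value}` tuples.

```elixir
defmodule EmptyEncoderSketch do
  # Hypothetical deep cast: the caller supplies a function that receives
  # the key path of each empty table and returns its replacement value.
  def deep_cast(value, empty_encoder) when is_function(empty_encoder, 1) do
    cast(value, [], empty_encoder)
  end

  # Empty table: ask the callback, giving it the path from the root.
  defp cast([], path, encoder), do: encoder.(Enum.reverse(path))

  # First key is 1: treat as a list, tracking the index in the path.
  defp cast([{1, _} | _] = table, path, encoder) do
    Enum.map(table, fn {i, v} -> cast(v, [i | path], encoder) end)
  end

  # Any other non-empty table: treat as a map.
  defp cast([_ | _] = table, path, encoder) do
    Map.new(table, fn {k, v} -> {k, cast(v, [k | path], encoder)} end)
  end

  # Scalars pass through unchanged.
  defp cast(value, _path, _encoder), do: value
end

# Decoded form of the Lua table {a = {}, b = {a = {}}} from earlier in the thread.
decoded = [{"a", []}, {"b", [{"a", []}]}]

# Application-level convention: only the top-level "a" is a list.
encoder = fn
  ["a"] -> []
  _path -> %{}
end

EmptyEncoderSketch.deep_cast(decoded, encoder)
# => %{"a" => [], "b" => %{"a" => %{}}}
```

Here the application, not the script author, owns the rule for each path, which could equally come from a schema or a naming convention.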

davydog187 avatar Jun 22 '25 15:06 davydog187