protox icon indicating copy to clipboard operation
protox copied to clipboard

[Feature request] support `extend` when nested inside a message definition

Open skwerlman opened this issue 2 years ago • 9 comments

I have the following (simplified) proto2 definition:

syntax = "proto2";

message SessionCommand {
    enum SessionCommandType {
        PING = 1000;
        // etc
    }
    extensions 100 to max;
}

message Command_Ping {
    extend SessionCommand {
        optional Command_Ping ext = 1000;
    }
}

// etc

When decoding a Command_Ping (when i don't know what message type i'm looking at) I get the following struct:

iex> Proto.SessionCommand.decode!(<<194, 62, 0>>)
%Proto.SessionCommand{__uf__: [{1000, 2, ""}]}

I then have to detect and correctly decode the extension with something like this:

defp decode(cmd, encoded) do
  [{id, _, _}] = SessionCommand.unknown_fields(cmd)

  case SessionCommandType.decode(id) do
    :PING ->
      Command_Ping.decode!(encoded)
  end
end

It would be much nicer if the SessionCommand decoder had a way to know what messages extend it, and then automatically decode to the correct struct (or decode them into some well-known field on the SessionCommand)

skwerlman avatar Jan 22 '23 19:01 skwerlman

Hello, I have to think more about this, but I'm not sure it's feasible 🤔. Indeed, there's nothing in a message that says it's been extended, there are just more fields from an extension. Even though these new fields will have identifiers which are not in the base message, they can overlap between extensions (even if it's not recommended), thus making it impossible to find the extension they are coming from.

ahamez avatar Jan 26 '23 21:01 ahamez

Hello again,

I can't reproduce your problem:

iex(1)> defmodule MyModule do
...(1)>   use Protox, schema: """
...(1)>   syntax = "proto2";
...(1)>
...(1)>   message SessionCommand {
...(1)>       enum SessionCommandType {
...(1)>           PING = 1000;
...(1)>           // etc
...(1)>       }
...(1)>       extensions 100 to max;
...(1)>   }
...(1)>
...(1)>   message Command_Ping {
...(1)>   }
...(1)>
...(1)>   extend SessionCommand {
...(1)>       optional Command_Ping ext = 1000;
...(1)>   }
...(1)>   """
...(1)> end
...
iex(2)> SessionCommand.decode!(<<194, 62, 0>>)
%SessionCommand{ext: %Command_Ping{__uf__: []}, __uf__: []}

Command_Ping is directly accessible in the ext field. So I'm not sure where the problem is?

ahamez avatar Jan 27 '23 12:01 ahamez

it looks like you're using a different definition for Command_Ping than i am:

// mine, not working
message Command_Ping {
    extend SessionCommand {
        optional Command_Ping ext = 1000;
    }
}

// yours, working
message Command_Ping {
}

extend SessionCommand {
    optional Command_Ping ext = 1000;
}

skwerlman avatar Jan 27 '23 23:01 skwerlman

i did a bit of testing, and it seems like the issue has to do with how code is generated when the definition is nested like i have it, since it affects encoding too

running this:

SessionCommand.encode!(%SessionCommand{ext: %Command_Ping{}}) |> IO.iodata_to_binary()

returns <<194, 62, 0>> when run with the non-nested definition, but the nested definition raises with

** (KeyError) key :ext not found
    expanding struct: SessionCommand.__struct__/1
    iex:6: (file)
    (elixir 1.14.2) expanding macro: Kernel.|>/2
    iex:6: (file)

In order to get the same output with the nested definition, i had to run

Command_Ping.encode!(%Command_Ping{ext: %Command_Ping{}}) |> IO.iodata_to_binary()

If you need it, the full set of proto defs i am working with is here: https://github.com/Cockatrice/Cockatrice/tree/master/common/pb The Command_Ping example is from here

skwerlman avatar Jan 27 '23 23:01 skwerlman

OK, thanks, with all this information it will be easier to pinpoint the problem! I'll try to come up with a solution as soon as I can.

ahamez avatar Jan 28 '23 21:01 ahamez

I think I've fixed this. Could you try with the branch nested_extensions to see if it works with your use case?

ahamez avatar Jan 29 '23 22:01 ahamez

Yes, it seems to be working correctly!

The only issue i have found is it currently (identically) defines encode_ext/2, default(:ext), field_def("ext") and field_def(:ext) on the base message once for each extension, and duplicates the call to encode_ext(msg) for each extension.

skwerlman avatar Jan 30 '23 22:01 skwerlman

OK, thanks for the feedback! If it works as is, I suggest you keep using this branch while I come up with a better solution. It might involve some API breaking, so I want to make sure I've got everything right.

ahamez avatar Feb 01 '23 19:02 ahamez

I've got a working solution for the decoding part (I still need to figure how to handle the encoding part):

iex(1)> defmodule MyModule do
...(1)>   use Protox,
...(1)>     schema: """
...(1)>     syntax = "proto2";
...(1)>
...(1)>     message SessionCommand {
...(1)>          enum SessionCommandType {
...(1)>              PING = 1000;
...(1)>          }
...(1)>          extensions 100 to max;
...(1)>      }
...(1)>
...(1)>     message Command_Ping {
...(1)>         extend SessionCommand {
...(1)>             optional Command_Ping ext = 1000;
...(1)>         }
...(1)>     }
...(1)>     """
...(1)> end
...
iex(2)>  SessionCommand.decode!(<<194, 62, 0>>)
%SessionCommand{__uf__: [], ext: {Command_Ping, %Command_Ping{__uf__: []}}}
iex(3)>

Note that now ext is set to the tuple {Command_Ping, %Command_Ping{...}}. Even though it may seem redundant, it's required as nested extensions can declare fields with the same type, thus we need a way to disambiguate from which extension the field comes from. For instance, we can have the following proto schema:

syntax = "proto2";
message Extendee {
  extensions 100 to max;
}
message Extension1 {
  extend Extendee {
    optional int32 ext = 101;
  }
}
message Extension2 {
  extend Extendee {
    optional int32 ext = 102;
  }
}

So, to sum up: use the first field of the tuple to get the type of the extension field :-).

I'll address the encoding part as soon as possible!

ahamez avatar Feb 05 '23 20:02 ahamez

hi! i've come back to working on the project where i ran into this; has there been any movement on fixing this in a release?

skwerlman avatar Jan 31 '25 18:01 skwerlman

Hi, I still haven't finished the encoding part (as it's not something I use myself, it's hard to find time for this). I'll try to bump the issue, but no promises 😬

ahamez avatar Feb 02 '25 21:02 ahamez

No worries! And thanks for the work you've done already!

skwerlman avatar Feb 04 '25 21:02 skwerlman

Hi, I finally managed to fix this issue 😅.

I had to change my initial approach, now fields coming from extensions are prefixed with the name of the extension.

So, using your example:

iex> defmodule Foo do
  use Protox,
    schema: """
    syntax = "proto2";
    message SessionCommand {
        enum SessionCommandType {
            PING = 1000;
            // etc
        }
        extensions 100 to max;
    }
    message Command_Ping {
        extend SessionCommand {
            optional Command_Ping ext = 1000;
        }
    }
"""
end

We can decode the command:

iex> SessionCommand.decode!(<<194, 62, 0>>)
%SessionCommand{command__ping_ext: %Command_Ping{__uf__: []}, __uf__: []}

And we can encode it back:

iex> SessionCommand.encode!(cmd)
{[<<194, 62>>, [<<0>>, []]], 3}

ahamez avatar Apr 15 '25 13:04 ahamez

It's now published in version 2.0.3.

ahamez avatar Apr 15 '25 13:04 ahamez