ProtoBuf.jl icon indicating copy to clipboard operation
ProtoBuf.jl copied to clipboard

Add gRPC Client Code Generation Support

Open csvance opened this issue 6 months ago • 15 comments

Hi, I have been working on adding code generation for gRPC service clients such as my new improved client that works with ProtoBuf 1.X branch https://github.com/csvance/gRPCClient2.jl. This pull request does two things:

  1. FIX: RPC parsing has a bug where the request type is parsed as the response type
  2. ADD: Implement callback registration for codegen(io, t::ServiceType, ::Context). This allows for any number of external gRPC client/service providers to hook into the codegen process for services.
using gRPCClient
import ProtoBuf.CodeGenerators


# (included in gRPCClient.jl) Register our service codegen with ProtoBuf.jl
service_cb(io, t::CodeGenerators.ServiceType, ctx::CodeGenerators.Context) = ...
import_cb(io, ctx) = println(io, "import gRPCClient")

grpc_register_service_codegen() = CodeGenerators.register_external_codegen_handler(
    "gRPCClient.jl";
    import_cb = import_cb,
    service_cb = service_cb,
)

grpc_register_service_codegen() 

# Creates Julia bindings for the messages and RPC defined in test.proto
protojl("test/proto/test.proto", ".", "test/gen")

csvance avatar Oct 15 '25 14:10 csvance

I reworked this to use a registration system. So now any gRPC client/server library can hook into the codegen and generate the required service stubs without needing to worry about who gets to define to codegen(io, service::ServiceType, ::Context).

@Drvi does this approach look ok to you?

csvance avatar Oct 20 '25 12:10 csvance

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 93.94%. Comparing base (129547a) to head (5e8fd92). :warning: Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
+ Coverage   93.91%   93.94%   +0.03%     
==========================================
  Files          25       25              
  Lines        3022     3040      +18     
==========================================
+ Hits         2838     2856      +18     
  Misses        184      184              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Oct 20 '25 12:10 codecov[bot]

It would be great if someone could look at this, trying to get the package registered but its likely going to be a roadblock needing a patch to ProtoBuf.jl for code generation.

csvance avatar Nov 14 '25 15:11 csvance

Hey @Drvi, we just merged gRPCClient 1.0.0 and moved it to JuliaIO https://github.com/JuliaIO/gRPCClient.jl

Codegen is the last outstanding issue we need to resolve before we can register the package.

csvance avatar Nov 24 '25 12:11 csvance

Hey @tanmaykm, I'm going to work on getting this in a better state with test coverage and documentation this weekend. As far as actually getting this merged and the release tagged, would that be something I'm allowed to do as part of JuliaIO now? I don't want to overstep my bounds, but I would really like to get gRPCClient.jl out to everyone now that its ready for a 1.0.0 release.

csvance avatar Dec 05 '25 16:12 csvance

Hi @csvance, this looks good to me... I think we should be able to get this merged and released once the reviews and CI are taken care of.

tanmaykm avatar Dec 06 '25 09:12 tanmaykm

I tried using this on a protobuf package. Since this creates code wrapped inside of a module, it would be necessary to run import gRPCClient as well. Adding this to the codegen in gRPCClient.jl works, but it is not very pleasing that the import will only be placed next to the RPC code, which might be after all types are generated (i.e. not at top of the file). If this is something that should be improved, maybe this API needs to be reconsidered, so that the gRPCClient codegen can be asked to generate import statements and RPCs at different places.

johroj avatar Dec 06 '25 10:12 johroj

Since this creates code wrapped inside of a module, it would be necessary to run import gRPCClient as well.

Doesn't it generate two files, one with and without a module? Technically you could just import the version that isn't wrapped in the model, but I guess that may result in namespace polution or things being exported that you didn't wish to export.

test.jl

module test

include("test_pb.jl")

end # module test

test_pb.jl

# Autogenerated using ProtoBuf.jl v1.2.0 on 2025-10-20T07:36:28.709
# original file: /home/[email protected]/PycharmProjects/gRPCClient2.jl/test/proto/test.proto (proto3 syntax)

import ProtoBuf as PB
using ProtoBuf: OneOf
using ProtoBuf.EnumX: @enumx

export TestResponse, TestRequest

...

I will say that my typical workflows with ProtoBuf are relatively simple. The most complicated thing I deal with is here: https://github.com/triton-inference-server/common/tree/main/protobuf. Also my knowledge of how modules work in Julia is mostly limited to "this goes at the top level of the package" because when things get to the point of needing to split into multiple modules, I usually start to split into packages instead. More or less this is a long winded way of saying "If I don't understand the problem I cannot understand the solution".

Say we updated the API to also add an import statement, would that be enough? Something like:

register_service_codegen(package_name::String, codegen_callback::Function) = ...

Where should the import statement go then? Should it be right after this:

import ProtoBuf as PB
using ProtoBuf: OneOf
using ProtoBuf.EnumX: @enumx

import gRPCClient

Then the updated codegen should include the namespace for the client constructors: gRPCClient.gRPCServiceClient{TestRequest, true, TestResponse, true}

csvance avatar Dec 06 '25 15:12 csvance

Doesn't it generate two files, one with and without a module? Technically you could just import the version that isn't wrapped in the model, but I guess that may result in namespace polution or things being exported that you didn't wish to export.

I probably just encountered this issue due to a somewhat complex protobuf package with 20 files all referencing each other. This generates 20 thingy_pb.jl files and one package.jl file looking something like

module package

include("thingy12_pb.jl")
include("thingy3_pb.jl")
include("thingy11_pb.jl")
...

end

ProtoBuf.jl sorts out the order in which these files need to be included. If I would just include all files manually in descending order, I would get an error because thingy3_pb.jl uses some type defined in thingy12_pb.jl. So it's quite conveinent to use this module file and ProtoBuf.jl seems to handle this really well. But it only works if using gRPCClient is called somewhere within. As i mentioned previously, this can already be solved by running println(io, "using gRPCClient") in the current codegen, but that import statement might end up in the middle of a file. So we get

import ProtoBuf as PB
using ProtoBuf: OneOf
using ProtoBuf.EnumX: @enumx

struct A

function PB.encode(...)

import gRPCClient

function TestService_TestRPC_Client(...)

whereas the better option would be

import ProtoBuf as PB
import gRPCClient # only in files containing RPCs
using ProtoBuf: OneOf
using ProtoBuf.EnumX: @enumx

struct A

function PB.encode(...)

function TestService_TestRPC_Client(...)

I guess this is fine for a first release, as long as the API does not prevent improving it later on. The thing that looks like a potential issue is that we just register one function, and your proposal was to add a second registered item. What about if yet another functionality becomes necessary? I think it would be nice to have a solution that does not require knowing all possible pieces of code we want gRPCClient to generate in the future.

Do you think it would make more sense to maybe handle this by dispatch on the second argument to service_client_codegen_handler? Something like

function service_client_codegen_handler(io, t::ServiceType, ctx)
    # Current implementation
end

# To be added and called somewere appropriate in `ProtoBuf.jl` in the future
function service_client_codegen_handler(io, t::ImportStatement, ctx)
    println(io, "using gRPCClient")
end

This would mean that the registered function service_client_codegen_handler allows future extension to generate different types of code. In that case, the issue I raised is not an issue at all does not prevent merging & making a release.

johroj avatar Dec 06 '25 22:12 johroj

It's relatively simple to put theusing gRPCClient in the right place afaik, but we do have to tell ProtoBuf.jl what to include. In the future we may also register a gRPCServer. We want to leave it open so people are able to integrate with the package without having to upstream code.

Maybe we could make the codegen handler registration be on the basis of a struct so we could add new fields in the future relying on kwargs? So starting with something that includes the module name and the callbacks.

csvance avatar Dec 06 '25 22:12 csvance

Maybe we could make the codegen handler registration be on the basis of a struct so we could add new fields in the future relying on kwargs? So starting with something that includes the module name and the callbacks.

This sounds like a good option that would also be simple to implement at the time. A struct could also be used with multiple dispatch to extend the interface. Something like

# Registration:
register_service_codegen(gRPCClientServiceCodeGenerator())

# Implementations:
service_codegen_handler(::gRPCClientServiceCodeGenerator, io, t::ServiceType, ctx)
    # Current implementation
end

service_codegen_import_statement(::gRPCClientServiceCodeGenerator, io) = println(io, "using gRPCClient")

can_generate_clientcode(::gRPCClientServiceCodeGenerator) = true
can_generate_servercode(::gRPCClientServiceCodeGenerator) = false

johroj avatar Dec 07 '25 09:12 johroj

@johroj I think it might be easy to get carried away with over-engineering this. I tried as best I could to define an interface that isn't cluttered/overly specific but can be easily extended in the future.

Here is how it looks interfacing with it on the gRPCClient.jl side:

function service_cb(io, t::CodeGenerators.ServiceType, ctx::CodeGenerators.Context)
    namespace = join(ctx.proto_file.preamble.namespace, ".")
    service_name = t.name


    for (i, rpc) in enumerate(t.rpcs)
        rpc_path = "/$namespace.$service_name/$(rpc.name)"

        request_type = rpc.request_type.name
        response_type = rpc.response_type.name

        if rpc.request_type.package_namespace !== nothing
            request_type = join([rpc.package_namespace, request_type], ".")
        end
        if rpc.response_type.package_namespace !== nothing
            response_type = join([rpc.package_namespace, response_type], ".")
        end

        println(io, "$(service_name)_$(rpc.name)_Client(")
        println(io, "\thost, port;")
        println(io, "\tsecure=false,")
        println(io, "\tgrpc=gRPCClient.grpc_global_handle(),")
        println(io, "\tdeadline=10,")
        println(io, "\tkeepalive=60,")
        println(io, "\tmax_send_message_length = 4*1024*1024,")
        println(io, "\tmax_recieve_message_length = 4*1024*1024,")
        println(
            io,
            ") = gRPCClient.gRPCServiceClient{$request_type, $(rpc.request_stream), $response_type, $(rpc.response_stream)}(",
        )
        println(io, "\thost, port, \"$rpc_path\";")
        println(io, "\tsecure=secure,")
        println(io, "\tgrpc=grpc,")
        println(io, "\tdeadline=deadline,")
        println(io, "\tkeepalive=keepalive,")
        println(io, "\tmax_send_message_length=max_send_message_length,")
        println(io, "\tmax_recieve_message_length=max_recieve_message_length,")
        println(io, i < length(t.rpcs) ? ")\n" : ")" )
    end
end

import_cb(io, ctx) = println(io, "import gRPCClient")

grpc_register_service_codegen() = 
    CodeGenerators.register_external_codegen_handler(
        "gRPCClient.jl"; 
        import_cb=import_cb, service_cb=service_cb
    )

csvance avatar Dec 07 '25 16:12 csvance

@tanmaykm currently on 1.12 and nightly some of the tests are failing due to:

  1. https://github.com/JuliaLang/julia/issues/58780 behavior of @allocated seems to have changed breaking two tests
  2. JET says optimization failed due to recursion in _encoded_size

Both of this are failing in master and are not caused by my changes, so I think its outside of the scope of this pull request to fix them. It's not clear to me how I could address either one of them without removing the tests altogether, as it seems like its a rather contentious issue to rely on @allocated for tests and I have no clue where to even start with the second.

Beyond that I added / updated the code generation tests for services to fix the issues with coverage. It still says I am missing a line somewhere but it doesn't give me any pointers of where to look.

Once I get feedback about my updated code generation approach we should be ready to merge.

csvance avatar Dec 07 '25 18:12 csvance

@johroj allowing for imports to be conditional on the types being generated (ie only trigger import gRPCClient if there is >= 1 ServiceType) is now implemented. It's done in a way that is left up to the external code generator to handle the specifics of, so ProtoBuf.jl will not need future changes. Exports being directly after clients can be handled in gRPCClient.jl, and is now part of a pull request there.

@tanmaykm think we are ready to merge.

csvance avatar Dec 08 '25 13:12 csvance

This should improve the CI situation: https://github.com/JuliaIO/ProtoBuf.jl/pull/287

tanmaykm avatar Dec 11 '25 19:12 tanmaykm

https://github.com/JuliaIO/ProtoBuf.jl/pull/287 is merged now. Would be good to see the CI pass here before merging. @csvance could you please rebase the branch with upstream master once?

tanmaykm avatar Dec 17 '25 00:12 tanmaykm

@tanmaykm I rebased + cleaned up commit history and ran the gRPCClient.jl test suite after re-running code generation. Looks good to me.

csvance avatar Dec 17 '25 01:12 csvance

Thanks!

tanmaykm avatar Dec 17 '25 01:12 tanmaykm