
Find a way to let the tool work without a global tool installation

Open Arshia001 opened this issue 2 years ago • 3 comments

I'd rather not have the users of this library need to install a global tool. For one thing, it's impossible to make a consistent version of the tool available on the machines of all developers of a project. Then there's the fact that global tool installation complicates project setup and Dockerfiles.

The global tool is needed because a locally installed tool cannot be run as a stand-alone executable, instead needing to be called via the dotnet command line, e.g. dotnet protoc-gen-fsharp instead of protoc-gen-fsharp.

One way to work around this limitation would be to include the plugin with the Tools package. We're already checking the user's OS and calling the correct version of protoc; if we had access to the plugin's executables for each platform, we could pass one to protoc with --plugin=protoc-gen-fsharp=....

To do this, we need to compile and publish the plugin for each of the three major OSes and include the published binaries in the final Tools package. However, this is further complicated by the fact that we don't know the user's runtime environment in advance, so we can't settle on a single framework to publish for; we'd have to bundle binaries for multiple runtimes on multiple OSes, which would grow the Tools package even further.

We could also forgo the platform-specific binaries, and go with shell scripts instead:

#!/usr/bin/env bash

# Locate the build matching the installed runtime; the exact
# version-extraction logic is still to be decided.
dotnet_version=$(dotnet --version | cut -d '.' -f 1)
dotnet "../$dotnet_version/FSharp.GrpcCodeGenerator.dll"

However, this creates an additional dependency on the existence of specific shells on the user's system, which is undesirable.

Arshia001 avatar Jul 18 '22 06:07 Arshia001

Can we convince the Protobuf people to support arguments in the plugin binary specification? I know they push back hard against anything that all plugins would be required to implement, but this change would be specific to the protoc binary itself.

marner2 avatar Oct 07 '22 16:10 marner2

Highly doubtful. If they were going to implement this (or, more likely, take a PR for it), it wouldn't break other plugins as long as no one did anything too crazy. Still, I don't think they'd be too keen on accepting this change.

Arshia001 avatar Oct 07 '22 17:10 Arshia001

In principle, it should be possible to write a fake proxy plugin that can forward the request to another program with arbitrary arguments.

The protoc compiler communicates with plugins exclusively through STDIN and STDOUT, and always invokes the plugin executable with no arguments. It expects the plugin to use a particular function, PluginMain, as its entry point (see reference guide).

Immediately after running, PluginMain opens STDIN in binary mode and decodes it into a CodeGeneratorRequest proto. The proto looks like this (after stripping comments):

message CodeGeneratorRequest {
  repeated string file_to_generate = 1;
  optional string parameter = 2;
  repeated FileDescriptorProto proto_file = 15;
  repeated FileDescriptorProto source_file_descriptors = 17;
  optional Version compiler_version = 3;
}

The parameters are passed in the parameter field. protoc simply puts there everything between the = and the : of your --*_out option. For example:

protoc --plugin=protoc-gen-foo=foo.exe --foo_out=abcxyz:. ...

will have parameter equal to "abcxyz". My experiments show that you can put in almost any text, including the = character and (with quotes) spaces, but there appears to be no way to escape the colon (:) itself. That's a minor issue, though.

So the idea is that you can write a proxy plugin that decodes the request from STDIN, looks up the parameter field, and based on that, starts another program, writing an altered request to its STDIN. The altered request is identical, except for the parameter field.

We could write that plugin in C++, using protobuf's own machinery for decoding and encoding. But it looks like we can get away with having no dependencies at all. Fortunately, although not generally guaranteed by the protobuf wire format, protoc guarantees that the parameter will be serialized before all the beefy stuff (proto_file, etc...) (see comment). So to get to the parameter, we only have to parse the initial chunk of the stream, which involves only simple fields. Looking at the wire format description, this should be straightforward.

Another advantage of the dependency-less approach is that it adds hardly any overhead: we only decode the initial bit of the request, and the bulk of it can be forwarded to the spawned process as-is.

I don't have time to do it now, but I'm planning to look into writing such a proxy plugin for my project, because I have exactly the same issue as you.

Appendix

For my experiments, I wrote a simple Go program to see what protoc is sending (I found that simpler than trying to deal with binary streams in a shell script). For your convenience, if you have Go and protoc installed, run these commands:

echo 'message Bar {}' > bar.proto
cat > main.go <<'EOF'
package main

import (
	"fmt"
	"io"
	"os"
)

func main() {
	s, _ := io.ReadAll(os.Stdin)
	fmt.Println(string(s))
}
EOF
go mod init example.com/foo
go build -o foo
protoc --plugin=protoc-gen-foo=foo --foo_out=abcxyz:. -I . bar.proto

to get the following output (protoc complains that the plugin output is unparseable because our fake plugin echoes the raw request back instead of a valid CodeGeneratorResponse, but that raw request is exactly what we wanted to see):

--foo_out: protoc-gen-foo: Plugin output is unparseable: \n\tbar.proto\022\006abcxyz\032\010\010\004\020\031\030\001\"\000z2\n\tbar.proto\"\005\n\003BarJ\036\n\005\022\003\000\000\016\n\t\n\002\004\000\022\003\000\000\016\n\n\n\003\004\000\001\022\003\000\010\013\212\0012\n\tbar.proto\"\005\n\003BarJ\036\n\005\022\003\000\000\016\n\t\n\002\004\000\022\003\000\000\016\n\n\n\003\004\000\001\022\003\000\010\013\n

mkatch avatar Jan 06 '24 13:01 mkatch