c-for-go icon indicating copy to clipboard operation
c-for-go copied to clipboard

[question] Is it possible to generate only types, but not the functions?

Open Asalle opened this issue 5 years ago • 9 comments

I'm writing bindings for https://github.com/roc-project/roc and as it is an audio streaming library, it's very sensitive when it comes to memory allocations, because we need to be as quick as possible. We really like how c-for-go generates go types from c structs, but the way the functions are generated are not very suitable for our project.

Is there a way to generate types but avoid everything else?

Asalle avatar Jan 12 '20 16:01 Asalle

but the way the functions are generated are not very suitable

Can you please clarify this? I assume that you're talking about Deref() and Ref() methods that make full shallow copy of the struct, i.e. without data slices. This should be very effective. Keep in mind that those functions are used in vulkan-go for realtime graphics rendering and other audio decoding applications like vorbis-go

Looks like premature optimisation to me, judging the code by the way it looks, not by its performance metrics.

xlab avatar Jan 13 '20 08:01 xlab

Anyway, types.go is a standalone output file, you can generate all stuff and take only that file.

xlab avatar Jan 13 '20 08:01 xlab

@xlab you might want to take a look at this thread: https://www.freelists.org/post/roc/Problems-with-the-generated-code-in-Go-bindings

(thread index is available here: https://www.freelists.org/archive/roc/12-2019?threads=1)

gavv avatar Jan 13 '20 08:01 gavv

This is not optimal. But what is really bad, the memory returned by C.calloc is not GC-collectable. So we will have a memory leak (on C heap) unless the user will not manually call ReceiverConfig.Free.

True, however that ReceiverConfig.Free can be bound to Go's GC using runtime.SetFinalizer to avoid leaks, it's just not done automatically to avoid double-free in high-level objects. Moreover, in many implicit cases, for example in array packing (field mapping [][]float64 to **C.double) it will assign finalizers internally in the generated code, since it allocates but not as part of the API.

You can avoid automatic allocation of certain objects by allocating them manually or even using custom allocator in C, then just assign the reference to the generated wrapper, so at least Deref() would be handy. I bet 99% of allocs don't require to be very optimal.

it would be enough to alloc C.roc_receiver_config on Go heap

This is not true for a very long time now. There was a period around Go 1.6 where I did exactly this — just allocate memory in Go and pass it around, but latest changes within Go runtime now require to allocate C memory in C land, that's why I had to implement those allocation tracking maps with explicit Free and optional runtime.SetFinalizer pattern.

Finally, the generated code is not fully thread-safe

Well, this is true for the C API then, because the generated code is stateless.

To sum up, all problems mentioned in the list (except private/public which is subject to a proper configuration of the generator) can be resolved by adding an additional layer that will move the API to a higher level. There is no problem in generator, those are just vanilla C problems that are not very suitable for Go.

xlab avatar Jan 13 '20 09:01 xlab

Thanks for taking time for this.

True, however that ReceiverConfig.Free can be bound to Go's GC using runtime.SetFinalizer to avoid leaks, it's just not done automatically to avoid double-free in high-level objects.

Agree, but this would complicate the Go API for the user because we'll need to add a constructor for each struct and to forbid creating those structs directly. This is perfectly OK for opaque objects like Receiver, but not very convenient for "parameters" structs like ReceiverConfig. So instead of writing:

receiver := NewReceiver(ReceiverConfig{Foo: 123})

the user will have to write something like:

config := NewReceiverConfig() // calls SetFinalizer internally
config.Foo = 123
receiver := NewReceiver(config)

This is not true for a very long time now. There was a period around Go 1.6 where I did exactly this — just allocate memory in Go and pass it around, but latest changes within Go runtime now require to allocate C memory in C land, that's why I had to implement those allocation tracking maps with explicit Free and optional runtime.SetFinalizer pattern.

Could you clarify? We have a guarantee that the structs passed by pointer to libroc functions like roc_receiver_open() never use the passed struct after the function returns (don't store the pointers). Our API was carefully designed with this contract in mind. In this case, why can't we just cast a pointer to Go heap to unsafe pointer and pass it to C? The same approach is used in https://github.com/hraban/opus, for example. Is it something wrong with it?

Well, this is true for the C API then, because the generated code is stateless.

Are you sure that it's fully stateless?

I think this code is not thread-safe: https://github.com/roc-project/roc-go/blob/ce27f4432b36fbbb926274617b9828e330518f1a/roc/cgo_helpers.go#L678-L686

Because it modifies a global var logHandler9E4BBC5EFunc without a lock. So you can't call logSetHandler() concurrently.

However the underlying function roc_log_handler() is thread-safe.

To sum up, all problems mentioned in the list (except private/public which is subject to a proper configuration of the generator) can be resolved by adding an additional layer that will move the API to a higher level.

I agree that we can solve these problems by adding a higher-level layer. As I said, that solution wont be optimal, but that's not very critical. Actually performance is not the main concern here.

However, since I'd prefer to avoid exporting Free(), PassRef(), etc, to create such a layer, we'll actually have to duplicate all generated "parameter" structs (like ReceiverConfig) and maintain field-by-field copying from public structs to private generated structs. But it makes little sense since in this case, we can just directly copy to C structs and don't use generated structs at all. The work will be the same and it would also be a bit more efficient.

And if we could disable generating memory management code (PassRef, Free, memory tracking maps), we could simplify our life and use generated structs directly, without need to maintaining field-by-filed copying by hand.

I see it as a special "lite" generator mode for the case when the memory management code is not really needed. In our case it's not needed because we, as the library developers, guarantee that C functions don't take or share the ownership of the passed objects.

Though, I don't know whether such a special mode is in scope of this project (c-for-go).

gavv avatar Jan 13 '20 09:01 gavv

  1. For the future reference if anyone finds this issue, please read https://golang.org/cmd/cgo/#hdr-Passing_pointers

But in your case since you never use the passed struct after the function returns (don't store the pointers) that's fine indeed, so you don't need to alloc on C side.

  1. These are for the callbacks, which are very tricky to use properly on their own and it sometimes might be just buggy, thread issues aside. Such types can be ignored

  2. Though, I don't know whether such a special mode is in scope of this project (c-for-go).

Well, the easiest way is to just take types.go from all generated files. Because all those memory management methods and helpers are in other files, some Make tool might discard them after generation.

xlab avatar Jan 13 '20 11:01 xlab

But in your case since you never use the passed struct after the function returns (don't store the pointers) that's fine indeed, so you don't need to alloc on C side.

Exactly.

gavv avatar Jan 13 '20 11:01 gavv

Well, the easiest way is to just take types.go from all generated files. Because all those memory management methods and helpers are in other files, some Make tool might discard them after generation.

Yep. But I was talking about partially using what is generated currently, and we can't solve it by a simple makefile rule.

type ReceiverConfig struct {
	FrameSampleRate         uint32
	FrameChannels           ChannelSet
	FrameEncoding           FrameEncoding
	AutomaticTiming         uint32
	ResamplerProfile        ResamplerProfile
	TargetLatency           uint64
	MaxLatencyOverrun       uint64
	MaxLatencyUnderrun      uint64
	NoPlaybackTimeout       int64
	BrokenPlaybackTimeout   int64
	BreakageDetectionWindow uint64
	ref9cfc9e50             *C.roc_receiver_config
	allocs9cfc9e50          interface{}
}

In our case, it would be nice to keep this struct auto-generated, but:

  • don't add ref9cfc9e50 and allocs9cfc9e50
  • don't generate Free/Ref/Deref
  • generate a simpler implementation of PassRef/PassValue that allocates roc_receiver_config on Go heap, do a field-by-filed copying from Go struct to C struct, and returns the C struct
  • allow to make those helpers unexported

Such a mode could be enabled on per-struct basic. It could be called "plain" or "unmanaged" or something... If it would be possible, we could keep using c-for-go for generating all these *config stuff. Otherwise we will have to maintain them by hand.

I'm just sharing my idea. I'm not sure whether this is within the scope of c-for-go. I understand that probably this is a rare case, though I saw similar approach it some hand-written bindings (e.g. opus and speexdsp bindings).

gavv avatar Jan 13 '20 12:01 gavv

I think this is a good idea. But actually not quick one to implement, since it will require an alternative generator module, i.e. not a simple template tweak. I'll put that as some future goal ;)

xlab avatar Jan 13 '20 13:01 xlab