capnproto Poor handling of "circular" dependencies in C++ codegen

The capnproto compiler itself has no problem handling circular dependencies. The C++ codegen has no problem as long as the schema is in a single file, and the types are written into the schema in a suitable order. It runs into problems with generating incorrect code when the schema is split across multiple files.

Consider the two schemas below. The generated C++ headers and sources are valid when the two schemas are concatenated into a single file, in the order given (of course without the second file id). Otherwise, due to circular dependencies between the resulting headers, the C++ headers are not valid C++ and fail to compile.

# base.capnp

interface Node {
  getInfo @0 () -> Info;
}

struct Info {
	node @0 :Node;
	classes @1 :Classes;
}

struct Classes {
	i1 @0 :import "i1.capnp".I1.Classes;
        i2 @1: Void;
        # many more classes similar to i1.Class are here
}

# i1.capnp

using Base = import "base.capnp";

interface I1 {
	interface Node extends (Base.Node) {}
	struct Classes {
		sub @0: Sub;
	}
	struct Sub {
		const prototype :Base.Classes = ( i1 = (sub = ()) );
    data @0: Bool;
	}
}

One solution would be to teach capnpc how to concatenate schemas, so that it would pass the whole thing to the codegen, and generate a single header file and a single source file. That seems to require the lowest effort, doesn't affect existing output, and doesn't require touching the codegens.

Another solution would be to have the C++ codegen cut up the headers into dependency-interlocked sections, included from the headers currently generated.

The basic dependency is between the C++ header sections (currently delineated by / ==== comments). Let's name them (those are really arbitrary names):

top-declarations
implementation-declarations (must come all after top-declarations)
inline-definitions (must come all after implementation-declarations)

The types within each section are mostly independent - with at least following caveats:

Within top-declarations, the constant declarations need to be sequenced after the types they refer to.
Within implementation-declarations, the capabilities need to be sequenced after the capabilities they extend (if any).

Another observation is that the codegen emits types in the order they appeared in the schema. Thus, even though the schema itself doesn't care about relative ordering, the codegen's output will :/

The front-end could generate a high-level dependency graph for types, adding it to the binary schema, and the C++ codegen would pick it up and classify the dependencies by the sections in which they matter. Then the dependency would be used to order the chunks within each section. This could result in generation of more than one sub-header file per section. Cycles in dependencies within any section should be detected and abort the codegen with an error.

Below is an example of one way of capturing the constraints properly. All the headers of course have #pragma once, omitted here for clarity. Note that by default, the section's includes would come before the section's contents, just like the current codegen does - except it doesn't splice the files by sections. Only when a dependency forces it, the include is moved after the section.

// base.capnp.h
// This is still the file that the user would include
#include "base_1.capnp.h"
#include "base_2.capnp.h"
#include "base_3.capnp.h"

// base_1.capnp.h
/* here go our top-declarations */
#include "i1_1.capnp.h"
/* base includes i1, but it must be sequenced after base since i1_1 constants depends on base1_1 */

// base_2.capnp.h
/* here go our implementation-declarations */
#include "i1_2.capnp.h"
/* base includes i1, but it must be sequenced after base since i1_2 capabilities depend on base1_2 */

// base_3.capnp.h
#include "i1_3.capnp.h" /* default sequencing */
/* here go our inline-definitions */

// i1.capnp.h
// This is the file that the user would include
#include "i1_1.capnp.h"
#include "i1_2.capnp.h"
#include "i1_3.capnp.h"

// i1_1.capnp.h
#include "base_1.capnp.h" /* default sequencing */
/* here go our top-declarations */

// i1_2.capnp.h
#include "base_2.capnp.h" /* default sequencing */
/* here go our implementation-declarations */

// i1_3.capnp.h
#include "base_3.capnp.h" /* default sequencing */
/* here go our inline-definitions */

The base.capnp.h and i1.capnp.h can be included in any relative order, and will compile correctly.

Jun 26 '19 19:06 KubaO

Hi @KubaO,

You've been filing a lot of issues lately. Are these issues that have actually caused trouble for you, or are they just things you noticed or thought of that you think could be made better?

If they are issues that you are actually having problems with in your own projects, then I'd love to hear specifics of your use case to see how best to solve it.

If these are just speculative, then, sorry, but I'd prefer that you not file them. The reality is that, like every large software project, there are many imperfections in Cap'n Proto; however, we will always have more-important things to do than to address issues that aren't affecting anyone. Engineering time is precious and there's an infinite amount of value to be built. If we spend time being perfectionists, we'll end up producing less actual value and will lose the race against competitors who focus on value.

The specific issue described here is one which would be extremely complicated -- maybe impossible -- to fully solve. However, in practice, cyclic dependencies are bad design, and so developers already avoid them. Because of that, I've never actually seen this problem come up in practice.

Jun 27 '19 11:06 kentonv

These are actual issues I'm running into into my job, I'm always posting minimal examples that focus on the reproducer rather than very big picture, and I try to show some concrete examples, since it's easy to suggest unworkable ideas otherwise. This is a tradeoff: I'm new at this, and sometimes a "big picture" solution may turn out better. It's all too easy to have XY problems, for sure.

Just to be clear: I file every issue with the intent of doing the work myself. I'm not filing any of these for someone else to fix. I do want to know whether fixing them would be worthwhile, or whether I miss something important, and then work on a PR should it be sensible, and I'm more than willing to hack at it until it passes muster or is declared a dead-end.

I have since prototyped the dependency generator and output sequencer that solve this issue or report an error if no solution exists, so from my end I consider it "fixable" but wanted to hear your thoughts. This is a reasonably easy problem to solve: generating a dependency graph is not black magic, and the output from the graph analysis is a solid yes-or-no as to whether a traversal that de-cycles the output is possible.

This analysis is only used to resequence the codegen output, only to the extent necessary, starting with the default sequencing: for any code that already compiles, the dependency analysis' effect should be nil: the output doesn't change (anything else is a bug that I would need to fix). I usually limit myself to solutions that only change behavior when it's necessary to fix some perceived shortcoming, and otherwise preserve existing behavior: this is a production code base and gratuitous changes are the last thing anyone needs.

The key ingredient to this de-cycling is the clever split of generated C++ sources into three sections, so you pretty much did 95% of the job, I think. The cycles that cross the section boundaries are OK; ones that are within a section usually error out - so there's still some work I have to do to ensure that the front-end could notice that without tying it to the structure of generated C++. I picked up the section-split for the other codegens I wrote, it is surprisingly portable to non-C++ languages thus far.

The example is an excerpt from a large, albeit young, project, with lots of schemas (thousands of lines sans comments). The cyclic dependencies are not something that can be easily avoided. In this particular example, the Info object can "derive" from multiple classes, that are arranged in a tree hierarchy. Each of those classes has a prototype that includes other necessary classes it should be used with. The classes have prototype objects that have the needed subclasses instantiated (they go together). This pattern seems to recur in multiple areas of my schemas; getting rid of it would remove the static type safety and I don't want that at all. The only way I see of breaking the cyclic dependency is to have the prototypes in a separate schema - and that only solves this instance of the pattern. Other ones I haven't got any workaround ideas about yet. I generally dislike limitations that don't let me write the schema in a way that makes it easier to comprehend from the application standpoint.

If this pattern would be supported (i.e. the codegens could cope with it), then it could be a means of implementing virtual inheritance for structs, but that's a separate matter entirely.

Having to reorganize the schema to work around codegen limitations is potentially crippling: it doesn't scale when you use more than one codegen. All of the schemas I write end up generating C code, C++ code, IEC61131-3 code, and C# code. The C and IEC codegens are my own work so I can make sure that they doen't produce invalid code no matter how the schemas are set up, but the C++ and C# codegens aren't mine, and I do need them not to force me to work around their limitations.

I wish that if capnpc -ocapnp succeeds on an annotation-free schema, then the backends are supposed to take it. This excepts language-specific fallout from annotations: errors in those can only be caught in the backend. Maybe it's an unreasonable goal, but I don't see why not at least attempt to make it work - especially that I'm the one doing the attempting. I only ask that you idiot-check the issue (I'm liable to be an idiot as much as anyone), and indicate the willingness to consider a PR done to this project's standards - and even then I have no expectation that it will actually get merged, there may be problems that make it unfeasible that I haven't run into yet.

Finally, even if an issue ends up being closed, it does preserve the idea, and I assume that at least some of my thinking has some value higher than 0. Even if it's all totally silly, then it could serve some entertainment purpose at the very least :)

Jun 27 '19 14:06 KubaO

OK, sorry for doubting. I've had a lot of people in the past nitpicking cases where generated code doesn't compile (both for Cap'n Proto and Protobuf) and in most cases they were doing it only hypothetically, because it "felt wrong" to them that they could cause broken generated code, even though they hadn't actually hit the issue in practice. Meanwhile I don't think it's realistic to try to create a code generator that doesn't have problems like that -- I think you could spent a near-infinite amount of time chasing down obscure conflicts. So it is a bit of a sore spot.

Please keep in mind that even if you are offering to write code to fix these problems yourself, that does still create work for me to review your design and code (and, in some cases, implement a different fix). However, I'm happy to do that to fix real problems.

With regards to this problem, I don't think you actually need to break the generated code apart into per-type files nor perform a topographic sort. If that were necessary, then it would be necessary even within a single file today -- but the current code generator supports cyclic dependencies between types in one file just fine.

The trick here is that we need to interleave the sections of the header files. There are essentially three sections (I think you were getting at this in your comment, but I didn't totally follow what you were saying):

Forward declarations of types and schemas.
Definitions of Readers and Builders.
Inline method definitions.

Declarations within each section don't depend on each other, they only depend on declarations from a previous section. This is why cycles cause no problem within a file.

But between files, we have a problem: All the #include lines for dependencies are at the very top of the header, and so all three sections of the included headers are parsed before the includer. Cycles then cause issues.

What we want is for the file structure to look like:

Forward declarations of types and schemas. a. #include forward declarations from dependencies.
Definitions of Readers and Builders. a. #include definitions of readers and builders from dependencies.
Inline method definitions.

This, however, seems to require that we split every header file into at least two headers. We need foo.capnp.h and foo.capnp.fwd.h. (I think we can get away with combining sections 2 and 3 into a single file, it just has to include its dependencies in the middle of the file, which is unusual but totally works.)

These .capnp.fwd.h files could actually be a feature in themselves -- it finally means that you can get forward declarations of capnp types without including the whole header.

There's a huge problem, though: If we just start generating .fwd.h files by default, we'll break lots of builds. Many build systems need to be aware of the exact outputs of a build step, especially if they perform the step in a sandbox or remotely, or if they eventually need to include the headers in a package or installation.

So it seems like this needs to be an option. The option probably needs to be controlled by a file-level annotation -- it can't be a compiler flag, because dependents provided by third-party libraries might use different options, and the code generator needs to know which options the imported files used.

Jun 28 '19 12:06 kentonv

You got what I was saying perfectly, even if I said it imperfectly. I do understand that the review work is not free either, as well. I'm sure I'll get better at figuring out what's noise and what's not as far as capnp project is concerned - it may take a bit of time, though.

I didn't think that additional build outputs may break builds, but that makes perfect sense.

Here's what one could do, to keep it in one file (I'm doing a variant of it in my prototype):

// foo.capnp.h
// no #pragma once - this file will has to be included at least twice, recursively

#ifndef CAPNP_FOO__1
#define CAPNP_FOO__1
/* section 1 includes go here? */
/* section 1 goes here */
#endif // CAPNP_FOO__1

#ifndef CAPNP_FOO__1_2
#define CAPNP_FOO__1_2
#define CAPNP_FORWARD_ONLY
/* includes go here - they may include this file again, so further recursion is terminated */
#undef CAPNP_FORWARD_ONLY
#endif // CAPNP_FOO__1_2

#ifndef CAPNP_FORWARD_ONLY
#ifndef CAPNP_FOO__23
#define CAPNP_FOO__23
/* section 2 goes here */
/* section 3 goes here */
#endif // CAPNP_FOO__23
#endif // CAPNP_FORWARD_ONLY

The names of the defines probably have to be cleverer to avoid collision. Maybe use file id instead of FOO?

Alas, the other issue is that schema files are designed to allow forward references. Yet at least the C++ codegen is emitting the types in the order they are present in the schema, without reordering based on dependencies. Thus the following two cases create invalid C++ output. Does this fall under the umbrella of the same issue, or is it a separate bug, or is it by design? If it's by design, then I don't understand why the schema language would allow such forward references...

# Case 1

const prototype :Case1Root = ();
struct Case1Root{}

# Case 2

interface Case2 extends (Case2Root) {}
interface Case2Root {}

In either case, swapping the places of the two types fixes the problem.

Jun 28 '19 15:06 KubaO

I suppose the #ifdef tricks you suggest could work, but it's a bit sad that it means we can't use #pragma once (nor depend on the preprocessor's equivalent optimization) anymore. So these headers might then end up being read and preprocessed several times in a single translation unit.

I suppose you could specifically detect when cycles are present and only fall back to the extra #ifdefs in that case. But I do kind of like .fwd.h as a publicly-visible feature, too, not just a solution to the cycles problem.

The two cases you mention of invalid C++ references within a single file are different from the inter-file cycles issue. In both these cases, the code that appears in section 2 of the header depends on other section 2 declarations. Unfortunately it seems easy to construct a case that is logically valid, but impossible to represent in C++:

struct A {
  struct AA {}
  const b :B.BB = ();
}
struct B {
  struct BB {}
  const a :A.AA = ();
}

I think this case could be solved if C++ allowed forward declarations of nested types. However, because it does not, there is no way to generate valid C++ code for this case -- even though, semantically, there is no cycle in the dependencies.

When it comes to interface inheritance, we have the same issue as with constants, and additionally the issue that a class cannot inherit from an incomplete type. The latter problem does seem to require a topological sort of interface types.

Jun 29 '19 09:06 kentonv

Kenton suggested at some point to declare the nested types at namespaces level with _ . It would make above example possible.

struct A;
struct A_AA;
struct B;
struct B_BB;

struct A_AA {
};
struct B_BB {
};
struct A {
  using AB = ::A_AA;
  const b ::B_BB  = {}
};
struct B {
  using BB = ::B_BB;
  const a ::A_AA  = {}
};

Jul 01 '19 07:07 katreniak

@katreniak True, that would then allow us to topologically sort everything. Protobuf also takes this approach, so there's precedent.

There are some minor issues this creates. All of the symbol names will change -- but we don't claim ABI compatibility between versions anyway. The _ name will show up in error messages, stack traces, etc. In debuggers you might be forced to specify symbols by the _ name rather than their natural name. But these are minor.

Some people will definitely depend on the _ names, as it lets them forward-declare types. So it won't be possible to go back later.

I think the main reason I've been resisting doing this is just aesthetics... it feels wrong. But that's probably silly.

Jul 01 '19 09:07 kentonv

I think I ran across this today trying to split up a 3-way handshake that exchanges the interfaces peers should use (or maybe I'm just structuring the capnproto incorrectly)?

client.capnp:

using Server = import "server.capnp";

struct Error {
}

interface Available {
   ok @0 (conn: Server.Connecting) -> ();
   err @1 (error: Server.Error) -> ();
}

interface Connected {
}

server.capnp:

using Client = import "server.capnp";

struct Error {
}

interface Available {
  init @0 (from: Client.Available) -> ()
}

interface Connecting {
  ack @0 (from: Client.Connected) -> (connection: Connected);
  err @1 (error: Client.Error) -> ()
}

Ideally the client implementation imports the strict minimum of server interface it needs & vice-versa.

The client initiates the connection but at the end of the handshake each peer has the other's interface because each needs to initiate RPC calls on the other (e.g. server pro-actively sends information to client, client needs to make requests of the server). On the other hand, perfectly willing to accept that this isn't really the best way to structure it in cap'n'proto land.

Jun 28 '20 17:06 vlovich