
Custom annotations on functions

Open kripken opened this issue 7 months ago • 17 comments

I understand how custom annotations work on instructions, but I am confused about functions, and I see multiple options in the ecosystem. The tool-conventions repo has this example, annotation before the name:

;; option A
(func (@metadata.code.hotness "\01") $test (type 0).

The annotations repo has this example, annotation after the name:

;; option B
(func $lambda (@name "λ") (param $x (@name "α βγ δ") i32) (result i32) (local.get $x))

In option B, the first annotation is surely for the function, given that the name matches the function and the param has its own annotation later...? But that seems ambiguous in a function with no params or results: (func $foo (@something) (nop)) - is that annotation on the nop, or on the function?

The binaryen parser expects annotations before the entire function (same as for instructions):

;; option C
(@metadata.code.inline "\12") (func $annotated-func
  (local $x i32)

  (@metadata.code.branch_hint "\01")
  (if .. ;; annotated instruction in function

I can't figure out which of A, B, C is correct from the spec text. I may be misreading it, though, or not looking in the right place. (In particular, I don't see details about how annotations work on instructions, the way branch hinting or compilation hints use them - is that documented elsewhere?)

kripken, May 13 '25 18:05

This: https://github.com/WebAssembly/annotations/blob/main/document/core/appendix/custom.rst#:~:text=If%20both%20an%20identifier%20and%20a%20name%20annotation%20are%20given%2C%20the%20annotation%20is%20expected%20after%20the%20identifier

suggests that the annotation should go after the name being annotated. That is also consistent with how Java annotations work.

fgmccabe, May 13 '25 18:05

@fgmccabe Hmm, is that part not specifically for Name annotations? It is in the "Name Annotations" subsection.

(Though maybe it suggests a general rule?)

kripken, May 13 '25 18:05

Yes, I was inferring a general rule from one example :)

fgmccabe, May 13 '25 18:05

As far as the core spec (minus the Appendix) is concerned, annotations are just syntax without any prescribed meaning and can appear anywhere — just like custom sections. It's up to the custom spec to say anything more specific. That said, it would be good to share the same conventions.

rossberg, May 13 '25 18:05

@rossberg

I see, thanks. Still, while the meaning of annotations is left to feature proposals, should the core spec not say something about what annotations attach to?

Concretely, consider the ambiguity problem mentioned above, that is,

(func $func
  (@metadata.code.something) ;; is this on the function or the nop?
  (nop)

Imagine an IDE that shows annotations when you hover over things. Even if the IDE does not understand an annotation's meaning, being able to show it is useful, but it needs to at least know what entity to attach the annotation to. If some IDEs show it on the function and others on the nop, that seems unfortunate.

kripken, May 13 '25 20:05

@kripken, the notion of attaching to anything at all is part of the semantics of the annotation, so cannot be described in general for all annotations. The best we can do is document non-normative best practices for the design of future annotations.

tlively, May 13 '25 21:05

What @tlively said. There is no presumed connection to the AST at all. All the core spec does is allow the (lexical) syntax, in the same spirit that custom sections are allowed with no further assumptions. A tool or IDE is no more expected to be able to do anything useful with a random annotation that it doesn't understand than with a custom section that it doesn't understand.

rossberg, May 13 '25 21:05

Interesting, I guess my perspective is different than you two... Intuitively, to me it seems reasonable to say something about the syntax of annotations (which includes relative position), but nothing about their semantics.

> A tool or IDE is no more expected to be able to do anything useful with a random annotation that it doesn't understand than with a custom section that it doesn't understand.

Yes, but it can preserve that custom section: it can keep it there, unmodified, and in the same relative position to others. E.g. appending another section at the end would be ok (maybe even inserting in the middle, if later sections don't use absolute offsets...)

Right now, tools have a decent chance of preserving unknown custom sections in the binary. It would be nice if the text format had something like that too. (But maybe the place for it isn't in the core spec?)

kripken, May 13 '25 22:05

@kripken, tools that modify a module cannot preserve unknown custom sections either, since those generally would need updating as well, consider e.g. code positions in code metadata sections. That corresponds to knowing what a textual annotation is attached to.
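The point about code positions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metadata is modeled as a plain dict from byte offset (within a function body fragment) to a hint value, and `insert_bytes` and `shift_offsets` are hypothetical helpers, not part of any real tool.

```python
def insert_bytes(body: bytes, at: int, new_bytes: bytes) -> bytes:
    """Return body with new_bytes spliced in at byte offset `at`."""
    return body[:at] + new_bytes + body[at:]


def shift_offsets(hints: dict, at: int, delta: int) -> dict:
    """Move every hint located at or after `at` forward by `delta` bytes."""
    return {(off + delta if off >= at else off): val for off, val in hints.items()}


# Instruction bytes (a fragment): i32.const 0 (0x41 0x00), if with an empty
# block type (0x04 0x40), nop (0x01), end (0x0b).
body = bytes([0x41, 0x00, 0x04, 0x40, 0x01, 0x0B])
hints = {2: 1}  # assumed hint attached to the `if` opcode at offset 2

# A tool that edits code: prepend a nop (0x01) to the body.
new_body = insert_bytes(body, 0, b"\x01")

stale = hints                       # what a tool unaware of the metadata keeps
fixed = shift_offsets(hints, 0, 1)  # what a tool that understands it produces
```

After the edit, offset 2 in `new_body` no longer holds the `if` opcode, so the untouched `stale` dict points at the wrong byte. Only a tool that knows the offsets are code positions can produce `fixed`.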

A generic mechanism for AST annotations (in both text and binary) could of course be defined on top of what we have now. But I doubt it would be useful. Even with that, tools wouldn't really be able to preserve unknown sections/annotations, because they still cannot know whether or how they'd need to update or amend the contents of such annotations. For example, imagine a custom mechanism that encodes auxiliary type information for instructions in some way. Preserving unknown custom stuff in a non-broken manner seems fundamentally impossible to me.

rossberg, May 14 '25 06:05

@rossberg

> tools that modify a module cannot preserve unknown custom sections either, since those generally would need updating as well, consider e.g. code positions in code metadata sections. That corresponds to knowing what a textual annotation is attached to.

Yes, if the tool modifies code. But if the tool adds analysis without altering code, it can work.

For example, a tool that does a complex whole-program analysis to find unlikely code paths can insert a branch hinting section into the wasm. All other custom sections can be preserved without changes in the binary. This is a good property I think.
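A rough Python sketch of that property, assuming the standard binary layout (8-byte header, then sections encoded as an id byte, a LEB128 size, and a payload). The helper names and the section names `unknown.tool` and `example.analysis` are made up for illustration. A real branch hinting section may have placement requirements of its own, which this sketch does not model; it simply appends at the end.

```python
HEADER = b"\x00asm\x01\x00\x00\x00"  # magic number plus version 1


def read_u32_leb(data: bytes, pos: int):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def write_u32_leb(value: int) -> bytes:
    """Encode an unsigned integer as LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def custom_section(name: str, payload: bytes) -> bytes:
    """Build a custom section (id 0): size, then name vector, then payload."""
    name_bytes = name.encode("utf-8")
    content = write_u32_leb(len(name_bytes)) + name_bytes + payload
    return b"\x00" + write_u32_leb(len(content)) + content


def split_sections(module: bytes) -> list:
    """Split a module into raw section byte strings without interpreting them."""
    assert module[:8] == HEADER
    pos, sections = 8, []
    while pos < len(module):
        start = pos
        size, body_start = read_u32_leb(module, pos + 1)  # skip the id byte
        pos = body_start + size
        sections.append(module[start:pos])
    return sections


def append_custom_section(module: bytes, name: str, payload: bytes) -> bytes:
    """Add a new custom section at the end; earlier bytes are left untouched."""
    return module + custom_section(name, payload)
```

For example, `append_custom_section(HEADER + custom_section("unknown.tool", b"\x01\x02"), "example.analysis", b"\x2a")` yields a module whose first section is byte-for-byte identical to the original unknown one.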

Now, imagine a similar tool working on the text format rather than the binary. If the syntax of text annotations is defined, it can insert new annotations without breaking existing ones.

> For example, imagine a custom mechanism that encodes auxiliary type information for instructions in some way. Preserving unknown custom stuff in a non-broken manner seems fundamentally impossible to me.

Yes, definitely, this cannot work in general, if the wasm changes.

kripken, May 14 '25 17:05

@kripken, why would the same not work with the text format? As long as you are just inserting additional annotations, and you know where those have to go, isn't everything fine?

rossberg, May 14 '25 19:05

@rossberg

Maybe I should say first that I don't feel strongly here, and you certainly know far better what makes sense in the spec. But something feels missing in the current situation, to me, for two reasons:

> why would the same not work with the text format? As long as you are just inserting additional annotations, and you know where those have to go, isn't everything fine?

A tool might know where its annotations go, but might end up interfering with another annotation type. Consider the ambiguity problem from before:

(func $func
  (@metadata.code.old "\42") ;; is this on the func or the call?
  (call $something)

If there is no specced guarantee about where that annotation attaches, it could in theory attach to the function or to the call instruction, so tools would not know, and they might get this wrong. For example, say a tool adds a new annotation like this:

(func $func
  (@metadata.code.new "call_is_special") ;; new annotation
  (@metadata.code.old "\42")
  (call $something)

The tool wants that to be on the call as well, just another annotation on it. Now, if the old annotation was for the function, then we have mixed things up in a way that other tools might no longer parse (annotations are out of order, a code annotation before a function one, and maybe a tool stops parsing function annotations past the first code annotation).

Of course a tool can be "defensive" and add any new call annotation right on the call, which seems far less risky. But it seems nicer to have the syntactical attachments of annotations spelled out so that there is an obviously right way to do this.
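The "defensive" placement can be sketched in Python on a toy representation, where a function body is a list of forms and each form is a list of strings. The function names here are hypothetical, and this is not how any particular tool stores annotations; it only shows the new annotation landing directly before the call, after whatever was already there.

```python
def is_annotation(form) -> bool:
    """A form counts as an annotation if its head starts with '@'."""
    return isinstance(form, list) and bool(form) and str(form[0]).startswith("@")


def add_annotation_defensively(body: list, target_head: str, annotation: list) -> list:
    """Insert `annotation` immediately before the first instruction whose head
    is `target_head`. Existing annotations keep their position and order."""
    for i, form in enumerate(body):
        if isinstance(form, list) and form and form[0] == target_head:
            return body[:i] + [annotation] + body[i:]
    raise ValueError(f"no instruction with head {target_head!r}")


def to_text(form) -> str:
    """Render a toy form back to parenthesized text."""
    if isinstance(form, list):
        return "(" + " ".join(to_text(f) for f in form) + ")"
    return form


body = [["@metadata.code.old", '"\\42"'], ["call", "$something"]]
new_body = add_annotation_defensively(
    body, "call", ["@metadata.code.new", '"call_is_special"']
)
lines = [to_text(f) for f in new_body]
```

With this placement the old annotation stays first, so the result is ordered sensibly whether the old annotation was meant for the function or for the call.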

2. We can add a few lines about annotations syntax in tool-conventions, as @tlively has proposed. But is it not awkward for a core spec proposal like compilation hints to say "we follow the syntactic rules in tool-conventions, a non-spec place"?

kripken, May 14 '25 22:05

> 2. We can add a few lines about annotations syntax in tool-conventions, as @tlively has proposed. But is it not awkward for a core spec proposal like compilation hints to say "we follow the syntactic rules in tool-conventions, a non-spec place"?

It seems fine for a proposal overview to link to tool-conventions and say it is following the documented best-practices there. The final spec document should not mention tool-conventions, though (and would have no need to).

tlively, May 15 '25 00:05

I would like to clarify that while there is no general rule for arbitrary annotations, we do have a spec document specifically for code metadata sections/annotations: https://webassembly.github.io/branch-hinting/metadata/code/text.html

It is very bare-bones at the moment, but it would be the right place to specify that function-level code metadata annotations must appear right before the function definition. Currently it only mentions instruction-level annotations because I only added what was necessary for branch hinting.

yuri91, May 15 '25 08:05

@kripken, the same problem occurs in the binary format: for an unknown custom section, you cannot know whether it is meant to be placed after the preceding section or before the succeeding section, especially if there are some omitted sections in-between. It could even be logically placed between omitted sections.

Also keep in mind that custom annotations are meant as the textual representation of binary custom sections, and as such they should round-trip. So it doesn't make much sense to introduce generic mechanisms for one without having the same for the other. Consequently, generic AST attachment only makes sense if expressible (and understood) in both formats.

Re 2: With the historic exception of the name section (which we perhaps should move as well), custom sections and annotations are not part of the core spec, but specified in separate documents, so no problem there.

rossberg, May 15 '25 10:05

@yuri91 Oh, thanks! That looks perfect!

@rossberg @tlively That spec text is really all I was looking for, specifically this part:

> are considered attached to the first instruction that follows them.

So that resolves all my concerns with instruction annotations. (And for function annotations, the first proposal that adds them can update that spec text.)
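To make the quoted rule concrete, here is a toy Python reader that pairs each instruction in a function body with the annotations that come before it. It is a sketch, not binaryen's parser or the spec's grammar: the lexer is simplified, only folded instructions at the top level of the body are handled, and all names (`tokenize`, `parse`, `attach`, `DECLS`) are invented for the example.

```python
import re

# Toy lexer: whitespace, ;; line comments, strings, parens, and atoms.
TOKEN = re.compile(r'\s+|;;[^\n]*|"(?:\\.|[^"\\])*"|\(|\)|[^\s()";]+')

# Heads that are declarations rather than instructions (assumed subset).
DECLS = {"param", "result", "local", "type"}


def tokenize(text: str) -> list:
    """Split text into tokens, dropping whitespace and line comments."""
    tokens = []
    for match in TOKEN.finditer(text):
        tok = match.group()
        if not tok.isspace() and not tok.startswith(";;"):
            tokens.append(tok)
    return tokens


def parse(tokens: list) -> list:
    """Turn a token list into nested Python lists, one per parenthesized form."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    return stack[0]


def head(form):
    """Return the keyword of a parenthesized form, or None for atoms."""
    if isinstance(form, list) and form and isinstance(form[0], str):
        return form[0]
    return None


def attach(body: list) -> list:
    """Pair each instruction with the annotations seen since the previous one."""
    pending, pairs = [], []
    for form in body:
        keyword = head(form)
        if keyword is None:
            continue  # atoms such as the function's $name
        if keyword.startswith("@"):
            pending.append(form)
        elif keyword not in DECLS:
            pairs.append((form, pending))
            pending = []
    return pairs


SRC = r'''
(func $annotated-func
  (local $x i32)
  (@metadata.code.branch_hint "\01")
  (if (local.get $x) (then (nop)))  ;; annotated instruction
  (nop))
'''
func = parse(tokenize(SRC))[0]
pairs = attach(func[1:])
```

Here `pairs` holds two entries: the `if` form together with the branch hint annotation, and the trailing `nop` with an empty annotation list.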

@rossberg

> the same problem occurs in the binary format: for an unknown custom section, you cannot know whether it is meant to be placed after the preceding section or before the succeeding section,

If sections interact and the order matters, then yes, without knowing all interactions things can't work. A lot can go wrong here! But I feel that @yuri91's spec text avoids a lot of possible problems in the text format.

kripken, May 15 '25 15:05

Oh sure, I'm all for adding @yuri91's text. My reply was only concerned with the suggestions of specifying or handling this outside concrete annotation specs.

rossberg, May 15 '25 15:05