[spec] Deprecate anonymous data blocks

Open iterumllc opened this issue 4 years ago • 6 comments

Description

This change deprecates anonymous data blocks (section 10 of the specification) and adds a warning to makeotfexe when one is encountered. The changes parallel the handling of the except keyword.

makeotfexe has lacked any interface for retrieving anonymous blocks as far back as the present repository goes, and it is likely that it never had one.

Checklist:

  • [x] I have followed the Contribution Guidelines
  • [x] I have verified that new and existing tests pass locally with my changes
  • [x] I have performed a self-review of my own code
  • [x] I have made corresponding changes to the documentation

iterumllc avatar Apr 26 '21 20:04 iterumllc

Do other implementations make use of these anonymous blocks? feaLib parses them and I think @mhosken uses (used to use?) them in some of SIL font tools.

khaledhosny avatar Apr 26 '21 21:04 khaledhosny

Do other implementations make use of these anonymous blocks? feaLib parses them and I think @mhosken uses (used to use?) them in some of SIL font tools.

@khaledhosny Thanks, this is a good point. It would be good to hear from others as to how much of a problem this deprecation would cause. We are definitely going to be removing the (non-) functionality from makeotfexe regardless; the question is whether we should leave it as part of the overall FEA spec (with a note that it's not supported in the AFDKO implementation).

cc: @anthrotype @justvanrossum @simoncozens

josh-hadley avatar Apr 26 '21 22:04 josh-hadley

Please also bump the following version numbers/dates:

Done

iterumllc avatar Apr 26 '21 23:04 iterumllc

We are definitely going to be removing the (non-) functionality from makeotfexe regardless;

That should be fine, since makeotfexe didn't do anything with it anyway, but deprecating it in the spec is a bit too much IMHO, especially since makeotfexe is not the only implementation (or even the only widely used one).

khaledhosny avatar Apr 27 '21 10:04 khaledhosny

Just to add a bit of context where this is coming from:

It looks like there could be significant new additions to the feature file grammar in the near future related to variable fonts. That could put other projects in the same position as this one: wanting to move into that future but facing the challenge of an old parsing implementation. With that in mind, retiring or altering rarely used features that complicate the parsing process could be a virtue.

Almost all of the current grammar can be adapted to a straightforward "lexing then parsing" model. That's tricky with anon blocks because of this combination: they're unstructured in the middle and have a context-specific terminator (you need to "know" the start tag to find the end tag). Most parser-generators will give you some way of dealing with this so it's not a deal-breaker, just tricky.
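To illustrate the "context-specific terminator" point, here is a minimal Python sketch. It assumes the block ends at a line holding only the closing brace, the same tag, and a semicolon; `extract_anon_block` and its regular expressions are hypothetical names for illustration and are not taken from makeotfexe or feaLib.

```python
import re

# Header of an anonymous block: keyword, a 1-4 character tag, opening brace.
# (Assumed shape; a real lexer would also skip comments and strings.)
_HEADER = re.compile(r"\banon(?:ymous)?\s+([^\s{]{1,4})\s*\{[^\n]*\n")


def extract_anon_block(text, pos=0):
    """Return (tag, body, end_offset) for the first anon block at or after pos.

    Returns None if no block header is found.
    """
    header = _HEADER.search(text, pos)
    if header is None:
        return None
    tag = header.group(1)
    # The terminator can only be built after the start tag has been read,
    # which is what makes a plain "lex first, parse later" split awkward.
    terminator = re.compile(
        r"^[ \t]*\}[ \t]*" + re.escape(tag) + r"[ \t]*;", re.MULTILINE
    )
    end = terminator.search(text, header.end())
    if end is None:
        raise ValueError(f"unterminated anon block {tag!r}")
    # Everything between header and terminator is opaque to the parser.
    return tag, text[header.end():end.start()], end.end()
```

Note that the body is never tokenized: a line such as `sizes { 10 } other;` inside the block is passed through untouched, and only a line matching the tag read at the start ends it.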

In my opinion the current semantic is not what you want for the advertised functionality anyway. A 3-7 character terminator (newline, closing-brace, maybe a space, 1-4 characters of tag, semicolon) is too meager to terminate a truly unstructured (e.g. binary) data block. If support for truly unstructured blocks isn't needed then an indentation semantic (e.g. where 4 characters after each newline are required and removed) would better match contemporary practice (but there are many other options).
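As a rough sketch of what such an indentation semantic could look like (this is a hypothetical design, not part of any version of the specification): every body line must begin with a fixed prefix, which is stripped, and the first line without it ends the block. The 4-space prefix and the name `read_indented_block` are assumptions for illustration only.

```python
INDENT = "    "  # assumed 4-space prefix; the proposal only says "4 characters"


def read_indented_block(lines, start=0):
    """Collect consecutive lines that begin with INDENT, prefix removed.

    Returns (body_lines, index_of_first_line_after_the_block).
    """
    body = []
    i = start
    while i < len(lines) and lines[i].startswith(INDENT):
        body.append(lines[i][len(INDENT):])
        i += 1
    return body, i
```

With this rule a data line that happens to look like the current terminator (for example `} sbit;`) is harmless, because it is still indented and therefore still part of the body.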

Anyway, the point is not just that there's no need for these in this repo's implementation. It's partly that they're not a great solution to the supposed problem, so if it's an actual problem and they're at most rarely used we might consider better solutions.

iterumllc avatar Apr 27 '21 11:04 iterumllc

I've been able to handle anonymous blocks with Antlr 4 fairly cleanly by using its import mechanism, so that the "main" grammar file can be free of target-language code. So maybe there is less motivation for this change.

Still, I think the relatively low adoption of anon blocks reflects their design, which is something worth thinking about and potentially fixing. The terminator string is short and strange, and it is unlikely that anyone would put binary data in a block, as the rest of the content is typically edited with a text editor. (Perhaps this is less true with generated feature files, but in that case it's not clear that anon blocks are very generator-friendly either.)

iterumllc avatar May 17 '21 11:05 iterumllc

This doesn't need to happen.

skef avatar Jul 31 '23 22:07 skef