SATySFi
New API to fix footnote duplication problem
We sometimes need to evaluate an inline-text or block-text more than once (for example, in the +xgenlisting command of satysfi-enumitem).
Evaluating an inline-text twice causes many problems; for example, the counter of \footnote is incremented twice:
http://satysfi-playground.tech/permalink/71a660eee721e76a94ba063272874e37eafdff3970fa8ba95d6d923bc4efef32
To prevent this, I suggest a new API for commands that use let-mutable.
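The double increment can be reproduced outside SATySFi. The following is a minimal Python model, not SATySFi code: `footnote_ref` stands in for a let-mutable counter, and `read_inline` and the token list are hypothetical stand-ins for evaluating an inline-text.

```python
footnote_ref = 0  # plays the role of a let-mutable counter


def footnote(note):
    """Model of \\footnote: bump the global counter and return a marker."""
    global footnote_ref
    footnote_ref += 1
    return f"[{footnote_ref}]"


def read_inline(tokens):
    """Model of read-inline: evaluating the text runs every command in it."""
    return " ".join(tok() if callable(tok) else tok for tok in tokens)


# One inline-text containing a single footnote command.
inline_text = ["hello", lambda: footnote("a note")]

first = read_inline(inline_text)   # first evaluation
second = read_inline(inline_text)  # second evaluation of the SAME text

print(first, "|", second, "| counter =", footnote_ref)
```

The text contains one footnote, yet the counter ends at 2 because the side effect runs once per evaluation.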
Design
get-command-identity ctx: context -> string
It returns a hash value that identifies the command invocation for which ctx was given.
Using this API, the \footnote command (or FootnoteScheme) would be defined like:
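let-mutable footnote-ref <- 0 in
let-mutable footnote-dict <- [] in
let-inline ctx \footnote it =
  % ...
  let hash = get-command-identity ctx in
  let () =
    % pseudocode: membership test on the list of hashes seen so far
    if hash is in footnote-dict then
      ()  % already evaluated once: leave the counter untouched
    else
      let () = footnote-dict <- hash :: !footnote-dict in
      footnote-ref <- !footnote-ref + 1
  in
  % ...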
Example
In the simplest case, the hash is generated from the byte offset at which the command is used in the inline-text (determined when the AST is generated).
In more complicated cases, however, this does not work:
let-inline ctx \footnote-wrap it =
  read-inline ctx {
    \footnote{#it;} % (a) bytes from file head
  }
in
document '<
  +p {
    \footnote-wrap { hello } % (b) bytes from file head
    \footnote-wrap { world } % (c) bytes from file head
  }
>
The correct behavior of this code is to display both hello and world as footnotes, so the two hashes must differ.
To solve this, make ctx carry the history of byte locations and calculate the hash from it:
- In the \footnote call inside \footnote-wrap { hello }, get-command-identity ctx returns hash([(a); (b)]).
- In the \footnote call inside \footnote-wrap { world }, get-command-identity ctx returns hash([(a); (c)]).
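The call-site history idea can be sketched in a small Python model. This is not SATySFi and not an existing API: `Context`, `enter`, `Footnotes`, and the byte offsets `A`, `B`, `C` are all hypothetical names and values chosen for illustration, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib


class Context:
    """Toy stand-in for a context that records call-site byte offsets."""

    def __init__(self, history=()):
        self.history = tuple(history)

    def enter(self, byte_offset):
        # Return a new context with this call site prepended to the history.
        return Context((byte_offset,) + self.history)


def get_command_identity(ctx):
    # Hash the whole call-site history, not only the innermost offset.
    data = ",".join(str(p) for p in ctx.history).encode()
    return hashlib.sha256(data).hexdigest()


class Footnotes:
    """Model of \\footnote: the counter moves only for unseen identities."""

    def __init__(self):
        self.counter = 0
        self.seen = set()

    def footnote(self, ctx):
        ident = get_command_identity(ctx)
        if ident not in self.seen:
            self.seen.add(ident)
            self.counter += 1


A, B, C = 120, 300, 340  # made-up byte offsets for sites (a), (b), (c)


def footnote_wrap(fn, ctx, site):
    # \footnote-wrap is used at `site`; its body calls \footnote at site (a).
    fn.footnote(ctx.enter(site).enter(A))


fn = Footnotes()
root = Context()
for _ in range(2):  # evaluate the same inline-text twice
    footnote_wrap(fn, root, B)
    footnote_wrap(fn, root, C)

print(fn.counter)  # prints 2
```

Hashing only the innermost offset (a) would give both footnotes the same identity and drop one of them. Including (b) or (c) keeps them distinct, while a second evaluation of the same text adds nothing.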
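In short:
In current SATySFi, when an inline-text contains a command such as \footer, evaluating that inline-text twice increments the footer counter twice.
To avoid this, I propose the API
get-command-identity ctx: context -> string
With this API, given code such as
let-inline ctx \footer-wrap it = %...
in
% ...
{
\footer-wrap{ hello }
}
each time a command (here \footer-wrap) is used in an inline-text, its byte position from the head of the file is appended to a list held in ctx. When get-command-identity ctx is called, a hash is generated from the contents of this list.
Commands that update let-mutable variables, such as \footnote, create a dictionary in which the hashes provided by get-command-identity are registered, and update the mutable variable only when the hash is not yet registered in that dictionary.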
Is that a problem? Does it mean that all state-mutating commands that may be used inside +xgenlisting or something similar are forced to use get-command-identity if one wants to avoid unintended behavior? I consider applying read-inline to the same inline-text twice to be the problem in itself. As for +xgenlisting, another extension to SATySFi might be needed to deal with state-mutating commands, but I argue that obligating providers of commands like \footnote to manage effects in such a sophisticated way is not a good solution.
Thank you for having a discussion (& sorry for the late response).
As to the language design for inline texts and inline box rows, my thinking is close to @elpinal-san's. That is, I suppose that applying read-inline twice to the same inline text is itself a somewhat problematic usage. Inline texts in general have the effect of mutating state, and thus they can basically be regarded as “affine” resources (though this is not reflected in any type-level restriction).
Certainly, I also feel a slight need to consider that there may be cases where inline texts essentially have to be used more than once. For instance, consider the case where there is more than one choice of how to render it : inline-text depending on the total size of the inline box rows resulting from it:
let-inline ctx \decorate it =
  let ib1 = read-inline (some-settings-1 ctx) it in
  let ib2 = read-inline (some-settings-2 ctx) it in
  if first-one-is-better (get-natural-metrics ib1) (get-natural-metrics ib2) then
    ib1
  else
    ib2
IMHO, however, adding primitives like get-command-identity seems to introduce too much complication into the semantics of the language. I feel that solving such a problem belongs to the language design rather than to the mere addition of primitives. For example, if SATySFi had a kind of state-passing semantics (like that of Elm or React) and were free from mutable references, one could safely implement the command above by:
let-inline state ctx \decorate it =
  let (state1, ib1) = read-inline state (some-settings-1 ctx) it in
  let (state2, ib2) = read-inline state (some-settings-2 ctx) it in
  if first-one-is-better (get-natural-metrics ib1) (get-natural-metrics ib2) then
    (state1, ib1)
  else
    (state2, ib2)
(though this tends to make code somewhat redundant.)
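To make the state-passing idea concrete, here is a minimal Python model of it. It is a sketch under assumptions, not SATySFi semantics: `State`, the token list, and the two style strings are hypothetical, and "shorter rendering wins" is an arbitrary stand-in for first-one-is-better.

```python
from typing import NamedTuple


class State(NamedTuple):
    footnote_count: int = 0  # immutable: "updating" builds a new State


def read_inline(state, style, text):
    """Render tokens to a string and return (new_state, rendered)."""
    out = []
    for tok in text:
        if tok == "\\footnote":
            # No mutation: rebind the local name to a fresh State value.
            state = state._replace(footnote_count=state.footnote_count + 1)
            out.append(style.format(state.footnote_count))
        else:
            out.append(tok)
    return state, " ".join(out)


def decorate(state, text):
    # Both trial renderings start from the SAME incoming state.
    state1, ib1 = read_inline(state, "[{}]", text)
    state2, ib2 = read_inline(state, "(note {})", text)
    # Only the state that belongs to the chosen rendering is returned.
    return (state1, ib1) if len(ib1) <= len(ib2) else (state2, ib2)


state, rendered = decorate(State(), ["hello", "\\footnote", "world"])
print(state.footnote_count, rendered)
```

The text is evaluated twice, but the caller only ever sees the state returned alongside the chosen rendering, so the footnote is counted once.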
Thanks for the discussion. I also agree with the state-passing syntax, but it is a heavily breaking change to the current syntax. As a first step toward making mutable variables obsolete, I suggest the following syntax:
(Type.t is inspired by the article "SATySFiでad hoc多相" [Ad hoc polymorphism in SATySFi].)
set-context-variable : string -> Type.t -> 'a -> context -> context
get-context-variable : string -> Type.t -> context -> 'a option
duplicate-context : context -> context
apply-context : context -> context -> ()
The goal of this syntax is to put all mutable variables inside context.
Compared to the current syntax:
- The timing of updates to mutable variables is clearer.
- It solves the problems with let-mutable references.
Compared to the state-passing syntax:
- It does not break backward compatibility.
- It is less redundant.
- It is a less elegant design.
If merging state into context is unsound, how about replacing SATySFi's current context with ('a, context)?
I think using state and context separately is the more compatible way, and it removes the need to add primitives such as *-context-variable, duplicate-context, and apply-context. However, this approach cannot eliminate let-mutable, because a command provider such as \footnote must still manage its own state corresponding to 'a.
For example, regarding SATySFi's context as (int list, old-context), we can implement mutable behavior using let-mutable like:
let-mutable identical-number <- 0 in
% duplicate-context : context -> context
let duplicate-context ctx =
  let (l, etc) = ctx in
  let l = !identical-number :: l in
  let () = identical-number <- !identical-number + 1 in
  (l, etc)
in
let-mutable mutable-state <- Dict.make in
% get-context-variable : string -> context -> int option
let get-context-variable str ctx =
  let-rec inner l =
    match Dict.get (l, str) !mutable-state with
    | Some(r) -> Some(r)
    | None -> match l with
      | _ :: l -> inner l
      | _ -> None
  in
  let (l, _) = ctx in
  inner l
in
% set-context-variable : string -> int -> context -> ()
let set-context-variable str num ctx =
  let (l, _) = ctx in
  let () = mutable-state <- Dict.set (l, str) num !mutable-state in
  ()
in
let-inline ctx \footnote it =
  % ...
  % the lookup returns an option; an unset counter is treated as 0
  let n =
    match get-context-variable `footnote-number` ctx with
    | Some(n) -> n
    | None -> 0
  in
  % argument order follows the definition above: name, value, context
  let () = set-context-variable `footnote-number` (n + 1) ctx in
  % ...
let-inline ctx \eval-twice it =
  let tmp-ctx = duplicate-context ctx in
  let tmp-ib = read-inline tmp-ctx it in
  let measuring = get-natural-metrics tmp-ib in
  read-inline (some-settings measuring ctx) it
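The lookup rule in this example (try the full scope path first, then progressively shorter suffixes) can be checked with a small Python model. This is only a sketch: a plain dict stands in for the hypothetical Dict module, tuples stand in for the int list, and the string "old-context" is a placeholder for the rest of the context.

```python
_next_id = 0  # plays the role of identical-number
_store = {}   # (scope path, name) -> value; plays the role of mutable-state


def duplicate_context(ctx):
    """Return a context whose scope path has a fresh id prepended."""
    global _next_id
    path, rest = ctx
    new_path = (_next_id,) + path
    _next_id += 1
    return (new_path, rest)


def get_context_variable(name, ctx):
    """Look up `name` at the full path, then at each shorter suffix."""
    path, _ = ctx
    while True:
        if (path, name) in _store:
            return _store[(path, name)]
        if not path:
            return None
        path = path[1:]


def set_context_variable(name, value, ctx):
    """Write `name` at exactly this context's scope path."""
    path, _ = ctx
    _store[(path, name)] = value


def footnote(ctx):
    n = get_context_variable("footnote-number", ctx) or 0
    set_context_variable("footnote-number", n + 1, ctx)
    return n + 1


ctx = ((), "old-context")
tmp = duplicate_context(ctx)

footnote(tmp)
footnote(tmp)           # trial evaluation inside the duplicated scope
first = footnote(ctx)   # real evaluation in the original scope

print(first, get_context_variable("footnote-number", tmp))
```

Writes made through the duplicated context are keyed by its longer path, so they stay invisible to the original context, while reads from the duplicate still fall back to values stored by enclosing scopes.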
Thanks for additional suggestions. I have a few remarks, however:
- The first suggestion looks unrealizable, since contexts are passed “from the outside to the inside” but the opposite never happens (i.e., no command returns an updated context).
- This is why I consider introducing immutable states, which can be returned by commands “from the inside to the outside”.
- I don’t really understand what duplicate-context : context -> context and apply-context : context -> context -> () in the first suggestion are intended to be, but as long as these primitives are meaningful, their existence at least indicates that contexts have mutable state. Such a semantics does not seem more elegant than one with mutable references.
- The second suggestion is not backward-compatible, since it will at least break the completeness of the type inference and will require every command (and thereby inline texts) to have a type parametrized by 'a of 'a * context.
- Now that a few years have passed since SATySFi was first released and many imperfections of the language design have come to light in that time, I’m not so reluctant to break backward compatibility, as long as the versioning is carefully handled.
- For instance, I’m currently replacing the module system with one based on F-ing modules.
(The following is a rough translation of the response above.)
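Thank you for the further suggestions. There are, however, a few points I would like to make:
- The first proposal seems difficult to realize, because the current text-processing context can be “passed from the outside to the inside” but never the other way around.
- This is exactly why I introduced the idea of a semantics that passes immutable state around (with that approach it is possible to “propagate state from the inside to the outside”).
- I do not really understand what operations duplicate-context : context -> context and apply-context : context -> context -> () in the first proposal refer to, but if such operations are meaningful at all, the text-processing context must hold rewritable state internally, and that does not seem any simpler than a semantics with mutable references.
- The second proposal is unfortunately not backward-compatible, I think. If the zeroth argument may take more than one form, the completeness of type inference is lost at the very least, and every command definition (and consequently every inline text) would need to carry the polymorphism of the text-processing context as a type parameter.
- Now that several years have passed since the first release of SATySFi and various weaknesses of the current language design have become apparent in the meantime, I am not averse to making incompatible changes as long as versioning is handled with proper care.
- For example, I am working on an implementation that replaces the current module system with one based on F-ing modules.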
Thanks for the reply.
The second suggestion is not backward-compatible
It can be compatible if the language restricts 'a to int list. In fact, in the example above, 'a is bound to int list.
Thanks for explaining the philosophy; I now understand that context should be immutable in the language design. However, the state-passing example is redundant. Is there any other solution for this so far?