SATySFi
Add char
Close https://github.com/gfngfn/SATySFi/issues/407
I added a char type and some functions.
The char type uses OCaml's Uchar.t type for its implementation.
ref: https://github.com/gfngfn/SATySFi/projects/1#card-54377937
List of added functions:
- char-to-string : char -> string
- char-to-unicode-point : char -> int
- char-of-unicode-point : int -> char
- char-same : char -> char -> bool
See the tests/char.saty file for how to use them.
🤔🤔🤔
### output ###
# Error: Unable to set up the cache root directory:
# Unix.Unix_error(Unix.EEXIST, "mkdir",
# "/Users/runner/.cache/dune/db/files/v4")
https://github.com/gfngfn/SATySFi/runs/3682988795?check_suite_focus=true#step:5:5926
- *-unicode-point should be renamed to *-unicode-scalar-value, because “Unicode point” is not a term (see “Unicode scalar value”). Did you mean “Unicode code point”?
- Is *-same conventional? I would expect char-same to be called char-equal or something similar, because this is the equality defined on char. (I'd love to hear @gfngfn’s opinion.)
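To make the terminology point concrete, here is a minimal Python sketch of the difference between a Unicode code point and a Unicode scalar value. It is only an illustration, not SATySFi code; the function names are hypothetical. OCaml's Uchar.t, which this PR uses, represents scalar values, so the surrogate range is excluded there too.

```python
# Illustrative sketch only; these helper names are hypothetical and not part of SATySFi.

MAX_CODE_POINT = 0x10FFFF
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF


def is_unicode_code_point(n: int) -> bool:
    """A code point is any integer in the range U+0000..U+10FFFF."""
    return 0 <= n <= MAX_CODE_POINT


def is_unicode_scalar_value(n: int) -> bool:
    """A scalar value is a code point that is not a surrogate (U+D800..U+DFFF)."""
    return is_unicode_code_point(n) and not (SURROGATE_FIRST <= n <= SURROGATE_LAST)
```

Under this reading, a function of type int -> char can only be total on scalar values, which is why the *-unicode-scalar-value name describes the domain more accurately.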
This is just my two cents.
Can this be implemented as a macro like ~(char @`2`) with char : input-position * string -> char?
Otherwise, is it possible to introduce a more generic literal syntax (like Scala’s string interpolation, C++’s user-defined literals, Lisp-family read macros, or SRFI-10) instead of one specific to char?
For example, define a new syntax @⟨ident_tag⟩⟨string-literal⟩ (e.g., @char`a`) which will be parsed as ~(⟨ident_tag⟩ @⟨string-literal⟩), where ⟨ident_tag⟩ should be a function of type input-position * string -> char. If the string is not valid, the function will call abort-with-message. It may be better to introduce a separate namespace for tags, which would be defined with a new syntax let-literal @⟨ident_tag⟩ ⟨args⟩ = ⟨expr⟩.
Here is another two cents of mine. Another option would be “not to introduce a literal syntax for char at all”. I believe casual users shouldn’t have to care what a Unicode scalar value is. I'm not a big fan of user-facing APIs that require char arguments; their authors will likely not consider Unicode equivalence.
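As an illustration of the Unicode equivalence concern, the following Python sketch compares two canonically equivalent spellings of “é”. It uses only the standard unicodedata module; the variable names are made up for this example, and nothing here is SATySFi code.

```python
import unicodedata

# Two canonically equivalent ways to write "é".
precomposed = "\u00e9"   # one scalar value: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two scalar values: "e" followed by COMBINING ACUTE ACCENT

# Comparing scalar value by scalar value treats the two spellings as different.
same_by_scalar_values = precomposed == decomposed

# After normalizing both sides to NFC, they compare equal.
same_after_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)
```

An API that takes a single char argument can only ever see one of the scalar values in the decomposed form, so it cannot treat the two spellings alike without extra normalization.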
Thank you for pointing out the naming of the primitive functions.
Let me explain why I introduced the literal syntax for the char type.
I would like to use the char type when parsing strings, so I expect the char type to have two properties:
- Guaranteed to be exactly one character
- Pattern matching is possible
Currently, I have to pattern-match on Unicode scalar values: https://github.com/puripuri2100/SATySFi-json/blob/master/src/json.satyg#L78
Hmm, then can we introduce a macro function char : string -> int and make it available in match clauses?
val f x =
  match x with
  | ~(char `/`) -> lex-string (str-stack^`"`) line (column + 2) ys
or with a new macro syntax @⟨ident_tag⟩⟨string-literal⟩,
val f x =
  match x with
  | @char`/` -> lex-string (str-stack^`"`) line (column + 2) ys
Otherwise, we can extend matching with view patterns
val char c =
  match string-length c with
  | 0 -> ``
  | 1 -> string-sub c 0 1
  | _ -> ``
  end
val f x =
  match x with
  | (char -> `\`) -> lex-string (str-stack^`"`) line (column + 2) ys
or extractors
val match Char c =
  match string-length c with
  | 0 -> None
  | 1 -> Some (string-sub c 0 1)
  | _ -> None
  end
val f x =
  match x with
  | Char(`\`) -> lex-string (str-stack^`"`) line (column + 2) ys