Add char

Open puripuri2100 opened this issue 2 years ago • 6 comments

Close https://github.com/gfngfn/SATySFi/issues/407

I added a char type and some functions for it.

The char type is implemented with the Uchar.t type.

ref: https://github.com/gfngfn/SATySFi/projects/1#card-54377937

List of added functions:

  • char-to-string : char -> string
  • char-to-unicode-point : char -> int
  • char-of-unicode-point : int -> char
  • char-same : char -> char -> bool

See the tests/char.saty file for how to use them.
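For readers who do not have tests/char.saty at hand, the intended behaviour of the four primitives can be sketched roughly as follows. This is a Python model, not SATySFi code: a char is represented as a one-code-point `str`, the Python function names simply mirror the SATySFi ones, and rejecting invalid integers is an assumption based on how OCaml's `Uchar.of_int` behaves, not something the description above states.

```python
# Hypothetical Python model of the four primitives; not SATySFi code.
# A "char" is modelled here as a str holding exactly one code point.

def _is_scalar_value(n: int) -> bool:
    # Unicode scalar values are U+0000..U+D7FF and U+E000..U+10FFFF.
    return 0 <= n <= 0xD7FF or 0xE000 <= n <= 0x10FFFF

def char_of_unicode_point(n: int) -> str:
    # Assumption: invalid input is rejected, as OCaml's Uchar.of_int does.
    if not _is_scalar_value(n):
        raise ValueError(f"not a Unicode scalar value: {n:#x}")
    return chr(n)

def char_to_unicode_point(c: str) -> int:
    return ord(c)

def char_to_string(c: str) -> str:
    # A char converts to the one-character string containing it.
    return c

def char_same(a: str, b: str) -> bool:
    # Equality on the underlying scalar values.
    return ord(a) == ord(b)
```

Under this model, `char_to_unicode_point (char_of_unicode_point n)` returns `n` for every valid `n`, and `char_same` is plain equality on the underlying integers.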

puripuri2100, Sep 23 '21 04:09

🤔🤔🤔

### output ###
# Error: Unable to set up the cache root directory:
# Unix.Unix_error(Unix.EEXIST, "mkdir",
# "/Users/runner/.cache/dune/db/files/v4")

https://github.com/gfngfn/SATySFi/runs/3682988795?check_suite_focus=true#step:5:5926

puripuri2100, Sep 23 '21 04:09

  1. *-unicode-point should be renamed to *-unicode-scalar-value because “Unicode point” is not an established term (see “Unicode scalar value”). Did you mean “Unicode code point”?
  2. Is *-same conventional? I would expect char-same to be named char-equal or similar, because this is the equality defined on char. (I'd love to hear @gfngfn’s opinion.)
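The distinction behind the first point can be illustrated with a small Python sketch (a model for illustration only, with hypothetical helper names). Every integer from U+0000 to U+10FFFF is a code point, but the surrogate range U+D800..U+DFFF is excluded from the scalar values, and only scalar values can be encoded as UTF-8.

```python
def is_scalar_value(n: int) -> bool:
    # A code point that is not a surrogate (U+D800..U+DFFF).
    return 0 <= n <= 0x10FFFF and not (0xD800 <= n <= 0xDFFF)

def encodes_as_utf8(n: int) -> bool:
    # Python lets chr() build a lone surrogate, but refuses to encode it.
    try:
        chr(n).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False

for n in (0x41, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF):
    print(f"U+{n:04X}", is_scalar_value(n), encodes_as_utf8(n))
```

For each value in the loop the two columns agree, which is why a type backed by Uchar.t is more precisely described in terms of scalar values than code points.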

na4zagin3, Sep 23 '21 07:09

This is just my two cents.

Can this be implemented as a macro like ~(char @`2`) with char : input-position * string -> char?

Otherwise, is it possible to introduce a more generic literal syntax (like Scala’s string interpolation, C++’s user-defined literals, the Lisp family’s reader macros, or SRFI-10) instead of one specific to char?

For example, define a new syntax @⟨ident_tag⟩⟨string-literal⟩ (e.g., @char`a`), which will be parsed as ~(⟨ident_tag⟩ @⟨string-literal⟩), where ⟨ident_tag⟩ should be a function with type input-position * string -> char. If the string is not valid, the function will call abort-with-message. It may be better to introduce a separate namespace for tags, which would be defined with a new syntax let-literal @⟨ident_tag⟩ ⟨args⟩ = ⟨expr⟩.

na4zagin3, Sep 23 '21 08:09

Here is another two cents from me. Another option would be “not to introduce a literal syntax for char at all”. I believe casual users shouldn’t have to care what a Unicode scalar value is. I'm not a big fan of user-facing APIs that require char arguments; such APIs will likely not take Unicode equivalence into account.
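The Unicode equivalence concern can be made concrete with a short Python sketch (an illustration only; `canonically_equal` is a hypothetical helper name). The same user-perceived character “é” can be written as one scalar value or as two, so an API that compares single scalar values treats the two spellings as different unless the text is normalized first.

```python
import unicodedata

precomposed = "\u00E9"   # é as a single scalar value (U+00E9)
decomposed = "e\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

def canonically_equal(a: str, b: str) -> bool:
    # Compare after canonical composition (NFC).
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(len(precomposed), len(decomposed))           # number of scalar values in each
print(precomposed == decomposed)                   # scalar-value-wise comparison
print(canonically_equal(precomposed, decomposed))  # comparison after normalization
```

Note that the decomposed form does not fit in a single char at all, so a function taking a char argument cannot even receive it.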

na4zagin3, Sep 23 '21 15:09

Thank you for pointing out the naming of the primitive functions.

Let me explain why I introduced the literal syntax for the char type. I would like to use the char type when parsing strings, so I expect it to have two properties:

  • Guaranteed to be exactly one character
  • Pattern matching is possible

Currently, I have to pattern match on Unicode scalar values: https://github.com/puripuri2100/SATySFi-json/blob/master/src/json.satyg#L78
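The readability difference can be sketched in Python (a model for illustration, not taken from the linked json.satyg; the token names and both function names are hypothetical). Matching on integers forces the reader to decode each number, whereas matching on character literals shows the intent directly.

```python
def classify_by_scalar_value(n: int) -> str:
    # Matching on raw scalar values: each branch is a magic number.
    if n == 0x7B:
        return "lbrace"
    elif n == 0x7D:
        return "rbrace"
    elif n == 0x5B:
        return "lbracket"
    elif n == 0x5D:
        return "rbracket"
    elif n == 0x3A:
        return "colon"
    elif n == 0x2C:
        return "comma"
    return "other"

def classify_by_char(c: str) -> str:
    # The same decisions written against character literals.
    table = {"{": "lbrace", "}": "rbrace", "[": "lbracket",
             "]": "rbracket", ":": "colon", ",": "comma"}
    return table.get(c, "other")
```

Both functions make the same decisions; only the second can be checked by eye against the grammar being lexed.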

puripuri2100, Jan 12 '22 12:01

Hmm, then can we introduce a macro function char : string -> int and make it available in match clauses?

val f x =
  match x with
  | ~(char `/`) -> lex-string (str-stack^`"`) line (column + 2) ys

or with a new macro syntax @⟨ident_tag⟩⟨string-literal⟩,

val f x =
  match x with
  | @char`/` -> lex-string (str-stack^`"`) line (column + 2) ys

Otherwise, we can extend matching with view patterns

val char c =
  match string-length c with
  | 0 -> ``
  | 1 -> string-sub c 0 1
  | _ -> ``
  end

val f x =
  match x with
  | (char -> `\`) -> lex-string (str-stack^`"`) line (column + 2) ys

or extractors

val match Char c =
  match string-length c with
  | 0 -> None
  | 1 -> Some (string-sub c 0 1)
  | _ -> None
  end

val f x =
  match x with
  | Char(`\`) -> lex-string (str-stack^`"`) line (column + 2) ys

na4zagin3, Apr 16 '22 09:04