kanjicanvas

Adding a Dataset for Characters Represented by Surrogate Pairs

Open ubugeeei opened this issue 10 months ago • 0 comments

Abstract

Thank you for the wonderful tool. I really like this project. 😄

When trying to add a dataset, I encountered some issues using jTegaki.

Background: Only One Character with a Single Code Point Can Be Registered

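When setting the background character, the input must be given as a Unicode hex value, but characters represented by surrogate pairs are not handled.

Looking into the jTegaki source code decompiled with Java Decompiler, I found that it simply parses a single hex block into an int and casts it to a char. A Java char holds only one 16-bit UTF-16 code unit, so any value above U+FFFF is truncated and a sequence of several blocks cannot be entered at all.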

int val_of_si = Integer.parseInt(s, 16);
char c_si = (char)val_of_si;
String si = Character.toString(c_si);
this.sp.setBackground(si);
this.sp.repaint();
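
To make the failure concrete, here is a minimal sketch that mirrors the decompiled logic outside of jTegaki. The class and method names (`CharCastDemo`, `viaCharCast`) are hypothetical and exist only for this illustration; the assumption is that the real input handling behaves like the snippet above.

```java
final class CharCastDemo {
    // Same steps as the decompiled code: parse one hex block, narrow to char.
    static String viaCharCast(String hex) {
        int value = Integer.parseInt(hex, 16);
        char c = (char) value; // keeps only the low 16 bits
        return Character.toString(c);
    }

    public static void main(String[] args) {
        // A BMP code point survives the cast.
        System.out.println(viaCharCast("845B").equals("\u845B")); // true

        // U+2000B is outside the BMP: the cast drops the upper bits, leaving U+000B.
        System.out.println((int) viaCharCast("2000B").charAt(0)); // 11

        // Several blocks in one string are rejected by Integer.parseInt.
        try {
            viaCharCast("845B DB40 DD00");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException");
        }
    }
}
```

So a supplementary character silently turns into an unrelated one, and a base character followed by a surrogate pair cannot be entered at all.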

Request

Could jTegaki be improved to allow the addition of a dataset for characters represented by surrogate pairs? For example, the input could be split into hex blocks and each block converted separately:

String input = "845B DB40 DD00"; // input from

String[] codePoints = input.split(" ");
StringBuilder result = new StringBuilder();

for (String codePoint : codePoints) {
    int cp = Integer.parseInt(codePoint, 16);
    result.append(Character.toChars(cp));
}
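
A self-contained version of the same idea, as a sketch rather than a patch for jTegaki: `BackgroundTextParser` and `parseHexBlocks` are hypothetical names, and how the resulting string would be passed to `setBackground` is an assumption. Because `Character.toChars` accepts both surrogate values and supplementary code points, the input can be written either as UTF-16 code units or as code points.

```java
final class BackgroundTextParser {
    // Converts whitespace-separated hex blocks into a UTF-16 string.
    static String parseHexBlocks(String input) {
        StringBuilder result = new StringBuilder();
        for (String block : input.trim().split("\\s+")) {
            int cp = Integer.parseInt(block, 16);
            // One char for BMP values (including lone surrogates),
            // two chars for code points above U+FFFF.
            result.append(Character.toChars(cp));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String fromUnits = parseHexBlocks("845B DB40 DD00");
        String fromCodePoints = parseHexBlocks("845B E0100");

        System.out.println(fromUnits.equals(fromCodePoints));               // true
        System.out.println(fromUnits.length());                             // 3
        System.out.println(fromUnits.codePointCount(0, fromUnits.length())); // 2
        System.out.println(Integer.toHexString(fromUnits.codePointAt(1)));   // e0100
    }
}
```

The result is three UTF-16 code units but two code points, so any code that later treats the background as a single `char` would also need to switch to `String`.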

I considered making the improvement myself, but since the source code is not publicly available and the license is unclear, I decided to raise an issue.

ubugeeei, Apr 24 '24 02:04