
chinese quotes treated as a single word

Open extoplasm opened this issue 9 months ago • 29 comments

Did you clear cache before opening an issue?

  • [X] I have cleared my cache

Is there an existing issue for this?

  • [X] I have searched the existing issues

Does the issue happen when logged in?

Yes

Does the issue happen when logged out?

Yes

Does the issue happen in incognito mode when logged in?

Yes

Does the issue happen in incognito mode when logged out?

Yes

Account name

extoplasm

Account config

{"theme":"alduin","themeLight":"serika","themeDark":"serika_dark","autoSwitchTheme":false,"customTheme":false,"customThemeColors":["#323437","#e2b714","#e2b714","#646669","#000000","#d1d0c5","#ca4754","#7e2a33","#ca4754","#7e2a33"],"favThemes":[],"showKeyTips":true,"smoothCaret":"medium","quickRestart":"off","punctuation":false,"numbers":false,"words":10,"time":60,"mode":"quote","quoteLength":[0],"language":"chinese_simplified","fontSize":1.5,"freedomMode":true,"difficulty":"normal","blindMode":false,"quickEnd":false,"caretStyle":"default","paceCaretStyle":"default","flipTestColors":false,"layout":"default","funbox":"none","confidenceMode":"off","indicateTypos":"off","timerStyle":"mini","liveSpeedStyle":"off","liveAccStyle":"off","liveBurstStyle":"off","colorfulMode":false,"randomTheme":"off","timerColor":"main","timerOpacity":"1","stopOnError":"off","showAllLines":false,"keymapMode":"off","keymapStyle":"staggered","keymapLegendStyle":"lowercase","keymapLayout":"qwerty","keymapShowTopRow":"layout","fontFamily":"JetBrains_Mono","smoothLineScroll":false,"alwaysShowDecimalPlaces":false,"alwaysShowWordsHistory":false,"singleListCommandLine":"manual","capsLockWarning":true,"playSoundOnError":"off","playSoundOnClick":"9","soundVolume":"1.0","startGraphsAtZero":true,"showOutOfFocusWarning":true,"paceCaret":"pb","paceCaretCustomSpeed":1,"repeatedPace":true,"accountChart":["on","on","on","on"],"minWpm":"off","minWpmCustomSpeed":100,"highlightMode":"letter","typingSpeedUnit":"wpm","ads":"result","hideExtraLetters":false,"strictSpace":false,"minAcc":"off","minAccCustom":90,"monkey":false,"repeatQuotes":"off","oppositeShiftMode":"off","customBackground":"","customBackgroundSize":"cover","customBackgroundFilter":[0,1,1,1,1],"customLayoutfluid":"qwerty#dvorak#colemak","monkeyPowerLevel":"off","minBurst":"off","minBurstCustomSpeed":100,"burstHeatmap":true,"britishEnglish":false,"lazyMode":false,"showAverage":"off","tapeMode":"off","maxLineWidth":0}

Current Behavior

When typing in Chinese, the entire quote is treated as one word, so the test finishes whenever space is pressed. Every quote also ends up in the short category.

Expected Behavior

Every character, excluding punctuation, could be counted as a word.

Steps To Reproduce

  1. change language to chinese simplified
  2. go to quotes
  3. press space
  4. test finished

Environment

  • OS: Windows 10
  • Browser: Google Chrome
  • Browser Version: Version 124.0.6367.119 (Official Build) (64-bit)

Anything else?

No response

extoplasm avatar May 03 '24 08:05 extoplasm

If that can be fixed, I believe the spaces in the "words" section should be typed automatically too, as a sentence in Simplified Chinese does not include spaces. e.g.: In "只有 出现 革命 存在 发生 方法…", users should not need to hit the spacebar before entering the next word.

faq0 avatar May 03 '24 12:05 faq0

i reckon you keep the words the same, it’s good to separate the words

but just count every character in a sentence as a word in the quotes section

extoplasm avatar May 04 '24 00:05 extoplasm

The characters used are full-width commas, and as @faq0 said, Simplified Chinese does not include spaces, so I'm not sure what should be done here.

Miodec avatar May 06 '24 12:05 Miodec

I believe there are some commonly used full-width punctuation marks in Simplified Chinese, which can be set as exceptions in quote mode. Some of these include ",。!?“”:;《》—", which have the Unicode code points \uff0c\u3002\uff01\uff1f\u201c\u201d\uff1a\uff1b\u300a\u300b\u2014.
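
As a rough illustration of the exception list described above, here is a minimal TypeScript sketch. The names `FULL_WIDTH_PUNCTUATION` and `countableCharacters` are hypothetical and are not taken from the Monkeytype codebase; the set contains only the code points listed in the comment, so real quotes may need more.

```typescript
// Code points listed in the comment above. This set is an assumption
// for illustration; it is not an exhaustive list of Chinese punctuation.
const FULL_WIDTH_PUNCTUATION = new Set<string>([
  "\uff0c", "\u3002", "\uff01", "\uff1f", "\u201c", "\u201d",
  "\uff1a", "\uff1b", "\u300a", "\u300b", "\u2014",
]);

// Hypothetical helper: split a quote into characters and drop punctuation.
function countableCharacters(quote: string): string[] {
  // Array.from iterates by code point, so surrogate pairs stay intact.
  return Array.from(quote).filter((ch) => !FULL_WIDTH_PUNCTUATION.has(ch));
}

// Two clauses of four characters each, separated by a full-width comma
// and ended by a full-width full stop.
const sample = countableCharacters("猴子打字\uff0c猴子打字\u3002");
console.log(sample.length); // 8
```

Keeping the punctuation in a set makes it easy to extend for zen or custom modes, where other marks may appear.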

But for the zen or custom modes, they might need other rules as the punctuation marks are not limited to these characters.

However, I have noticed that many Chinese typing practice websites actually count each symbol as a character, and it counts towards the WPM. That might be an easy way to handle this.

faq0 avatar May 06 '24 13:05 faq0

the punctuation isn't an issue, there isn't much punctuation in the quotes anyway. i reckon you can count every character as a word and either parse out the full-width punctuation or change it into its english equivalent when counting the words, although this would be rough to implement.

it's really up to you, but as a quasi-mandarin speaker this is just my suggestion.

extoplasm avatar May 07 '24 12:05 extoplasm

So, what's the solution? Because if you want to add spaces you would need to edit the quotes themselves.

Miodec avatar May 08 '24 12:05 Miodec

wdym, i’m saying we count each character as a word, as mandarin doesn’t follow the rule that each word is separated by spaces. eg. “猴子打字” (monkey type lol) counted as 4 separate words

extoplasm avatar May 09 '24 05:05 extoplasm

also if we add spaces it wouldn’t be accurate, not sure how the word counting works but a special case can be added to split the characters differently (removing the punctuation before of course)

extoplasm avatar May 09 '24 05:05 extoplasm

So, this should be the case for all Chinese text, not just quotes, right?

Is this because you need multiple keypresses per character? Maybe we can count each keypress as a character, instead of each character as a word.

Miodec avatar May 09 '24 14:05 Miodec

So, this should be the case for all Chinese text, not just quotes, right?

Yes.

Maybe we can count each keypress as a character, instead of each character as a word.

This would be good in most cases, but I believe that could be the way to calculate the speed, not the accuracy. In fact, there are multiple typing methods for Simplified Chinese, and they can result in different numbers of keystrokes.

For example, take the quote "我能吞下玻璃而不伤身体" (11 characters). In Full Pinyin, it would be "wonengtunxiabolierbushangshenti" (31 keys). In Double Pinyin, it would be "wongtpxwboliorbuuhufti" (2 keys per character, 22 keys in total). In Wubi, it would be 4 keys per character, 44 keys in total, but in that case less time is needed to select the desired Chinese character in the candidate window.
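
The arithmetic in this example can be checked with a short TypeScript sketch. The variable names are illustrative only, and the Wubi figure simply assumes 4 keys for every character, as stated above. Keys used to pick a candidate (spacebar or digits) are not counted.

```typescript
const quote = "我能吞下玻璃而不伤身体";
const fullPinyin = "wonengtunxiabolierbushangshenti";
const doublePinyin = "wongtpxwboliorbuuhufti";

// Count characters by code point rather than by UTF-16 unit.
const characterCount = Array.from(quote).length; // 11

const keystrokes = {
  fullPinyin: fullPinyin.length,     // 31
  doublePinyin: doublePinyin.length, // 22
  wubi: characterCount * 4,          // 44, assuming 4 keys per character
};

console.log(characterCount, keystrokes);
```

The same 11 characters cost between 22 and 44 keystrokes depending on the input method, which is why keystrokes alone are a poor basis for accuracy.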

faq0 avatar May 09 '24 21:05 faq0

yes i agree with faq0 on the speed calculation part but the main issue is that in the quotes the entire sentence is counted as one word, i’m suggesting that we split the quote by character instead of by space as when someone presses space the test ends and the progress is inaccurate

extoplasm avatar May 10 '24 05:05 extoplasm

yes i agree with faq0 on the speed calculation part but the main issue is that in the quotes the entire sentence is counted as one word, i’m suggesting that we split the quote by character instead of by space as when someone presses space the test ends and the progress is inaccurate

If you split by character then the website will require you to press space between every character. When you type quotes normally, when do you press space? (not on monkeytype).

Miodec avatar May 13 '24 09:05 Miodec

in chinese there is no such thing as a space lol. if it's like that then there might not be an easy solution, perhaps make a special case??? i'm like 50% sure it's the same for any asian language, so this could be useful when adding quotes for other languages

extoplasm avatar May 13 '24 10:05 extoplasm

If you split by character then the website will require you to press space between every character. When you type quotes normally, when do you press space? (not on monkeytype).

We might not press space for every character. In fact, there is a candidate window (IME window) to choose from a list of characters.

We might not press the space key. If I want to type the character "我" in Full Pinyin, I would type: w o <spacebar>. The candidate window (Microsoft Pinyin IME, as an example) then shows a numbered list, and I have to select the desired character from it, where "1" = "我", "2" = "喔", etc. I can also press the spacebar as an alternative way to select the first option (the spacebar is more commonly used than "1" for this).

We might not press the space key for every character. A longer sentence, such as "我能吞下玻璃而不伤身体", can be typed all at once. In Full Pinyin, I would type: w o n e n g t u n x i a b o l i e r b u s h a n g s h e n t i <spacebar>. Luckily, in this case my desired sentence is the first candidate, so I can just press the spacebar. If it isn't, I may have to select each character (or word) one by one, with the syllables divided by the apostrophes shown in the IME.

This means that there are many ways to type a sentence, with some of them not containing a spacebar keystroke. I believe that monkeytype should just detect the number of keystrokes when a character itself is typed.

faq0 avatar May 13 '24 10:05 faq0

in chinese there is no such thing as a space lol. if it's like that then there might not be an easy solution, perhaps make a special case??? i'm like 50% sure it's the same for any asian language, so this could be useful when adding quotes for other languages

What if I just disable space then? Monkeytype won't try to "move to the next word" because there would be no "next word", and that "moving to the next word" won't even be triggered by the space. The only thing the space would be doing is interacting with the input manager, like it already does.
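
A minimal TypeScript sketch of what "disabling space" could look like. Everything here is hypothetical: `NO_SPACE_LANGUAGES`, `SpaceAction` and `handleSpace` are illustrative names, not Monkeytype's actual input handling. The only value taken from this issue is the language key `chinese_simplified` from the account config.

```typescript
// Assumption: a list of languages whose text is not space-delimited.
const NO_SPACE_LANGUAGES = new Set<string>(["chinese_simplified"]);

type SpaceAction = "advance_word" | "ignore";

// Decide what the test itself should do when a space reaches it.
// The IME still receives the key for candidate selection either way;
// this only controls whether the test moves to the next word.
function handleSpace(language: string): SpaceAction {
  return NO_SPACE_LANGUAGES.has(language) ? "ignore" : "advance_word";
}

console.log(handleSpace("chinese_simplified")); // "ignore"
console.log(handleSpace("english"));            // "advance_word"
```

A per-language flag like this keeps the special case out of the general word-advance logic.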

Miodec avatar May 13 '24 10:05 Miodec

I believe that monkeytype should just detect the number of keystrokes when a character itself is typed.

what does this mean?

extoplasm avatar May 13 '24 10:05 extoplasm

What if I just disable space then? Monkeytype won't try to "move to the next word" because there would be no "next word", and that "moving to the next word" won't even be triggered by the space. The only thing the space would be doing is interacting with the input manager, like it already does.

this should be good enough haha

extoplasm avatar May 13 '24 10:05 extoplasm

what does this mean?

Keystrokes per second are calculated from the number of keystrokes and shown on the final speed chart, while the accuracy and WPM are calculated from the Chinese characters actually typed.
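
A small TypeScript sketch of the split described above, with speed taken from raw keystrokes and accuracy taken from committed characters. The `TypingStats` shape and the `summarize` function are assumptions for illustration. How characters per minute would map to WPM is not specified in this thread, so the sketch reports characters per minute instead.

```typescript
interface TypingStats {
  keystrokes: number;   // every key pressed, including IME input
  typedChars: number;   // Chinese characters committed to the test
  correctChars: number; // committed characters that match the quote
  seconds: number;      // elapsed test time
}

function summarize(s: TypingStats) {
  return {
    keystrokesPerSecond: s.keystrokes / s.seconds,
    charsPerMinute: (s.typedChars * 60) / s.seconds,
    // Accuracy is based on characters, not on keystrokes.
    accuracy: s.typedChars === 0 ? 0 : (s.correctChars * 100) / s.typedChars,
  };
}

const result = summarize({ keystrokes: 30, typedChars: 10, correctChars: 9, seconds: 10 });
console.log(result); // { keystrokesPerSecond: 3, charsPerMinute: 60, accuracy: 90 }
```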

faq0 avatar May 13 '24 10:05 faq0

What if I just disable space then? Monkeytype won't try to "move to the next word" because there would be no "next word", and that "moving to the next word" won't even be triggered by the space. The only thing the space would be doing is interacting with the input manager, like it already does.

This should be a good idea, as long as it can deal with the speed and accuracy correctly.

faq0 avatar May 13 '24 10:05 faq0

another problem might be that one misspelt character leaves the test unable to finish: when you disable space, the test can no longer be force-finished, and monkeytype does not let you finish on a misspelt word.

extoplasm avatar May 13 '24 10:05 extoplasm

i'm pretty sure you have to both split the quote by character and disable spaces
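
One way to picture the combination of both ideas is the TypeScript sketch below. It is only an assumption about how a character-based test could finish despite a mistake: `isQuoteComplete` and `countMistakes` are hypothetical helpers, not existing Monkeytype functions, and punctuation handling is left out for brevity.

```typescript
// Hypothetical completion rule: the test ends once as many characters
// have been typed as the quote contains, even if some are wrong.
function isQuoteComplete(target: string, typed: string): boolean {
  return Array.from(typed).length >= Array.from(target).length;
}

// Compare character by character; each character acts as its own "word".
function countMistakes(target: string, typed: string): number {
  const expected = Array.from(target);
  const actual = Array.from(typed);
  let mistakes = 0;
  for (let i = 0; i < expected.length; i++) {
    if (actual[i] !== expected[i]) mistakes++;
  }
  return mistakes;
}

console.log(isQuoteComplete("猴子打字", "猴子打"));   // false
console.log(isQuoteComplete("猴子打字", "猴子大字")); // true
console.log(countMistakes("猴子打字", "猴子大字"));   // 1
```

With a rule like this, a single wrong character lowers accuracy but does not block the end of the test.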

extoplasm avatar May 13 '24 10:05 extoplasm