pxt-microbit

The text in some lessons is broken

Open THEb0nny opened this issue 1 year ago • 10 comments

In some lessons the text is broken: certain characters cause MakeCode to cut off the rest of the line.

image image

THEb0nny avatar Sep 21 '24 18:09 THEb0nny

image

THEb0nny avatar Sep 21 '24 18:09 THEb0nny

It's worth checking out the other lessons too.

THEb0nny avatar Sep 21 '24 18:09 THEb0nny

image

THEb0nny avatar Sep 21 '24 18:09 THEb0nny

image image

THEb0nny avatar Sep 21 '24 18:09 THEb0nny

@ganicke is this a documentation issue?

abchatra avatar Sep 23 '24 22:09 abchatra

@abchatra - in a way, yes. Some of those icon-type characters don't parse well when uploaded to Crowdin. Crowdin typically terminates the sentence early when it encounters one.

ganicke avatar Sep 25 '24 17:09 ganicke

@abchatra - So, I verified that the source arrives at Crowdin intact.

image

It's when the strings are presented in the editor that they get truncated at certain special characters. In some languages, the translators have fixed this by adding the icon chars back into their translations.

image

This seems to be a Crowdin issue. I could send them a bug report for this?

ganicke avatar Oct 01 '24 22:10 ganicke

Support message for this sent to Crowdin 10/16. Awaiting a response...

ganicke avatar Oct 16 '24 21:10 ganicke

@abchatra - So, I received a good response from Crowdin Support mentioning the possible use of segmentation rules to avoid breaks on the emoji/icon characters:

Hello there, 

For markdown, you can use custom segmentation rules:
https://support.crowdin.com/custom-segmentation/

We have plenty of possible custom modules (https://store.crowdin.com/tags/file-processors),
but changing a segmentation should solve this without much development work. 

In case it wouldn't help, please share with source file sample as an attachment to an email,
a screenshot of how it looks in Crowdin editor, and the project ID (or URL)

Thanks in advance, 
--
Sincerely,
Dima Yashchyshyn
Customer Success Manager

However, this requires adding a segmentation (SRX) file for EACH source file that needs custom segmentation. Alternatively, segmentation could be disabled on the source file, but then no strings would be parsed and the whole file would have to be translated as a single blob of text.

An SRX file for these chars would add a new rule to NOT break after them (using dice.md as an example):

<rule break="no">
        <beforebreak>[🎲⭐👋]</beforebreak>
        <afterbreak>\s</afterbreak>
</rule>

This doesn't seem like a practical solution at this point. I'm not sure whether it's possible to modify the default SRX so that we could set the whole range of these emojis to not break.
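To illustrate how a no-break rule like the one above takes precedence over a break rule, here is a rough Python simulation of SRX-style matching, where the first rule that matches at a position decides the outcome. It is only a sketch: `segment`, the sample string, and `BREAK_AFTER_SYMBOL` are made up for illustration, and the break rule is a hypothetical stand-in, since the rule Crowdin actually applies by default isn't known here.

```python
import re

# Each rule is (is_break, before_pattern, after_pattern), checked in order.
# The no-break rule mirrors the SRX snippet above.
NO_BREAK_AFTER_ICONS = (False, r"[🎲⭐👋]", r"\s")
# Hypothetical stand-in for a default rule that splits after symbols;
# the rule Crowdin really uses is an assumption here.
BREAK_AFTER_SYMBOL = (True, r"[^\w\s]", r"\s")


def segment(text, rules):
    """Split text wherever the first matching rule at a position is a break rule."""
    compiled = [
        (is_break, re.compile(f"(?:{before})\\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # "before" must match the text ending at pos, "after" the text starting at pos.
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins, as in SRX
    segments.append(text[start:].strip())
    return segments


SAMPLE = "Press A 🎲 to roll. Try again!"  # made-up lesson text
print(segment(SAMPLE, [BREAK_AFTER_SYMBOL]))
print(segment(SAMPLE, [NO_BREAK_AFTER_ICONS, BREAK_AFTER_SYMBOL]))
```

With only the break rule, the sample is cut right after the dice icon. With the no-break rule listed first, the icon stays inside its sentence and the only split is after the period.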

ganicke avatar Oct 18 '24 22:10 ganicke

Thanks @ganicke for investigating this. @jwunderl @thsparks FYI

abchatra avatar Oct 22 '24 01:10 abchatra

The idea is to replace the emojis with special markers in the source and swap the emojis back in at serve time.
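A minimal sketch of that idea in Python, for illustration only. The marker syntax (`{icon:dice}`), the mapping, and the function names `to_markers` and `to_icons` are all hypothetical. Nothing in this thread specifies them, and whether a given marker format passes through Crowdin segmentation unchanged would still need to be checked.

```python
import re

# Hypothetical mapping; the real icon set and marker names are not defined in this thread.
ICON_MARKERS = {
    "🎲": "{icon:dice}",
    "⭐": "{icon:star}",
    "👋": "{icon:wave}",
}
MARKER_ICONS = {marker: icon for icon, marker in ICON_MARKERS.items()}

_ICON_RE = re.compile("|".join(re.escape(icon) for icon in ICON_MARKERS))
_MARKER_RE = re.compile("|".join(re.escape(marker) for marker in MARKER_ICONS))


def to_markers(text):
    """Replace each known icon with its plain-text marker (before upload for translation)."""
    return _ICON_RE.sub(lambda m: ICON_MARKERS[m.group(0)], text)


def to_icons(text):
    """Replace each known marker with its icon (when serving the translated text)."""
    return _MARKER_RE.sub(lambda m: MARKER_ICONS[m.group(0)], text)


source = "Roll the dice 🎲 and wave 👋"  # made-up lesson text
marked = to_markers(source)
print(marked)
print(to_icons(marked))
```

The round trip only works if translators leave the markers untouched, so the marker format would need to be something the translation editor treats as a placeholder rather than translatable text.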

abchatra avatar May 06 '25 16:05 abchatra