gpy
Refactor: Improve space insertion logic for Pinyin conversion
The previous approach inserted spaces mechanically, without considering the surrounding characters. This produced unexpected spaces in the output.
This commit refactors the space insertion logic to be context-aware. It now checks whether adjacent characters belong to the unicode.Punct or unicode.Symbol categories, and inserts a space only when the neighboring character is neither punctuation nor a symbol. This eliminates the separate replacement step that used to strip the redundant spaces added by the mechanical approach.
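The context-aware check described above might look roughly like the following sketch. It is not the code from this commit: `pinyinOf`, `needsSpace`, and `toPinyin` are hypothetical names, and the small lookup table is an assumption that stands in for the real Pinyin conversion. Only `unicode.IsPunct`, `unicode.IsSymbol`, and `unicode.IsSpace` are standard library calls.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// pinyinOf is a hypothetical stand-in for a real Pinyin lookup.
var pinyinOf = map[rune]string{
	'你': "nǐ", '好': "hǎo", '世': "shì", '界': "jiè",
}

// needsSpace reports whether a space should separate a converted
// syllable from the given neighbor. No space is added next to
// punctuation, symbols, or existing whitespace.
func needsSpace(neighbor rune) bool {
	return !unicode.IsPunct(neighbor) &&
		!unicode.IsSymbol(neighbor) &&
		!unicode.IsSpace(neighbor)
}

// toPinyin converts known characters and passes everything else
// through unchanged, so no input character is dropped.
func toPinyin(text string) string {
	runes := []rune(text)
	var b strings.Builder
	for i, r := range runes {
		py, ok := pinyinOf[r]
		if !ok {
			b.WriteRune(r)
			continue
		}
		// Space before the syllable, decided by the previous character.
		if i > 0 && needsSpace(runes[i-1]) {
			b.WriteByte(' ')
		}
		b.WriteString(py)
		// Space after the syllable only when the next character is not
		// itself converted; a converted neighbor adds its own leading space.
		if i+1 < len(runes) {
			next := runes[i+1]
			if _, converted := pinyinOf[next]; !converted && needsSpace(next) {
				b.WriteByte(' ')
			}
		}
	}
	return b.String()
}

func main() {
	fmt.Println(toPinyin("你好，世界"))
	fmt.Println(toPinyin("《你好》"))
}
```

Because the decision is made at insertion time, there is no later pass that has to find and remove spaces sitting next to punctuation.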
Additionally, the "allowed characters" setting has been removed. All content from the original text now appears in the Pinyin output, so characters such as the book title marks 《》 and French characters, which the character filter previously excluded, are no longer lost.