WeasyPrint
'text-transform: capitalize' makes letters that aren't the first of each word lowercase
Just a quick bug I came across: the CSS property 'text-transform: capitalize' makes every letter in a word lowercase except the first, which is made uppercase. According to the spec, only the first letter should be modified; the other letters should remain as typed.
<!DOCTYPE html>
<html>
<head>
<style>
body {
text-transform: capitalize;
}
</style>
</head>
<body>
my UPPER text
</body>
</html>
Thanks for this bug report!
Nobody has complained since this feature was added more than 10 years ago in 6ee2bad.
The following seems to mimic the CSS property:
import re
CAPITALIZE_RE = re.compile(r'\s*(^\W*|\s\W*)(\w)', re.MULTILINE)
CAPITALIZE_RE.sub(lambda m: m.group(1) + m.group(2).upper(), text)
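As a quick check, the sketch below applies that regular expression to the sample text from the report and compares it with Python's built-in `str.title()`, which lowercases the rest of each word in the way described above. The helper name `capitalize_words` is hypothetical, and using `str.title()` as a stand-in for the old behaviour is an assumption, not a statement about WeasyPrint's code.

```python
import re

# Same pattern as above, written as a raw string.
CAPITALIZE_RE = re.compile(r'\s*(^\W*|\s\W*)(\w)', re.MULTILINE)


def capitalize_words(text):
    """Uppercase the first word character of each word; keep the rest as typed."""
    return CAPITALIZE_RE.sub(lambda m: m.group(1) + m.group(2).upper(), text)


sample = 'my UPPER text'
print(sample.title())            # My Upper Text
print(capitalize_words(sample))  # My UPPER Text
```

One thing to watch: the leading `\s*` is matched but not put back by the replacement, so a run of several spaces before a word is shortened ('a  b' becomes 'A B').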
We don’t want to change the first letter, but the first typographic letter unit. CSS is never easy…
A function already exists somewhere for the :first-letter selector; I hope that it uses the same definition.
Ah sorry for the multiple pings here. I had to make a few small changes and didn't think it would link here until I'd made the pull request.
I've modified the capitalize function to use Unicode typographic letter units (which required the regex module in order to support Unicode grapheme matching). I've run some tests and it appears to match the CSS behaviour when applying "text-transform: capitalize".
I spent some time reading through the CSS documentation regarding typesetting and what defines a "letter" in this context, and I believe this now works as expected.
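To illustrate the general idea (this is not the code from the pull request), here is a simplified sketch that uses only the standard library. It assumes a word is a run of non-whitespace characters and that leading punctuation is skipped, and it only approximates a typographic letter unit by taking the first character whose Unicode general category is a letter or a number. The function name `capitalize_first_letters` is hypothetical, and the actual change relies on the regex module's grapheme support instead.

```python
import re
import unicodedata


def capitalize_first_letters(text):
    """Titlecase the first letter or digit of each whitespace-separated word."""
    def fix_word(match):
        word = match.group(0)
        for index, char in enumerate(word):
            # Skip leading punctuation and symbols; stop at the first
            # character whose general category is a letter (L*) or number (N*).
            if unicodedata.category(char)[0] in 'LN':
                # str.title() on a single character applies its titlecase
                # mapping; everything after it is kept exactly as typed.
                return word[:index] + char.title() + word[index + 1:]
        return word

    # \S+ matches each run of non-whitespace, so spacing is left untouched.
    return re.sub(r'\S+', fix_word, text)


print(capitalize_first_letters('my UPPER text'))    # My UPPER Text
print(capitalize_first_letters('"quoted" words'))   # "Quoted" Words
```

Because only the first letter of each word is replaced, existing uppercase letters and the original spacing are left alone.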
Thanks for starting me on the right path :+1:
Thanks @VeteraNovis! It's fixed by #1703.