tabulate
Improve multi-byte character support in Windows
Description
- In *nix, tabulate uses wcswidth to compute the display width for multi-byte characters.
- In Windows, tabulate computes the number of multi-byte characters correctly, but this is not the same as the "display width", i.e., the number of columns occupied by the cell contents.
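To illustrate the gap between character count and display width, here is a minimal sketch of counting code points in a UTF-8 string. The function name count_code_points is hypothetical and is not part of tabulate; the sketch assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of tabulate.
// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (bytes of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;  // lead byte or ASCII byte starts a new code point
        }
    }
    return count;
}
```

For the two-character string "日本" (bytes E6 97 A5 E6 9C AC), size() reports 6 bytes and count_code_points reports 2, yet a typical terminal draws each of these characters two columns wide, so the cell occupies 4 columns. Padding by the count of 2 is what leaves the table misaligned.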
References
- Consider using Markus Kuhn's open-source wcswidth.c
- Output unicode strings in Windows console app
- Use GetTextExtentPoint32A to compute the width of cell contents
reference: https://github.com/nodejs/node/blob/master/lib/internal/readline/utils.js
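One platform-independent option along the lines of these references is to decode the UTF-8 bytes and look up a width per code point. The sketch below is an assumption-laden illustration, not tabulate's implementation: the names code_point_width and display_width are hypothetical, the wide ranges are an abbreviated set modeled on the ones in Markus Kuhn's wcwidth.c, and only the basic combining-mark block is treated as zero width.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: columns occupied by one code point.
// Simplified; a complete table would cover many more ranges.
int code_point_width(char32_t cp) {
    // Combining Diacritical Marks occupy no column of their own.
    if (cp >= 0x0300 && cp <= 0x036F) return 0;
    // Abbreviated East Asian wide/fullwidth ranges.
    const bool wide =
        (cp >= 0x1100 && cp <= 0x115F) ||                  // Hangul Jamo
        (cp >= 0x2E80 && cp <= 0xA4CF && cp != 0x303F) ||  // CJK ... Yi
        (cp >= 0xAC00 && cp <= 0xD7A3) ||                  // Hangul Syllables
        (cp >= 0xF900 && cp <= 0xFAFF) ||                  // CJK Compatibility Ideographs
        (cp >= 0xFE30 && cp <= 0xFE6F) ||                  // CJK Compatibility Forms
        (cp >= 0xFF00 && cp <= 0xFF60) ||                  // Fullwidth Forms
        (cp >= 0xFFE0 && cp <= 0xFFE6) ||
        (cp >= 0x20000 && cp <= 0x3FFFD);                  // Supplementary ideographs
    return wide ? 2 : 1;
}

// Hypothetical helper: display width of a UTF-8 string in terminal columns.
// Invalid lead bytes are counted as one column each.
std::size_t display_width(const std::string& utf8) {
    std::size_t width = 0;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0xFFFD;  // replacement character for invalid input
        std::size_t len = 1;
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F;
            len = 2;
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F;
            len = 3;
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07;
            len = 4;
        }
        // Fold in the continuation bytes (low 6 bits each).
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += len;
        width += static_cast<std::size_t>(code_point_width(cp));
    }
    return width;
}
```

With this sketch, display_width("日本") yields 4 while the box-drawing character "┏" yields 1, so a cell could be padded with column_width minus display_width(cell) spaces instead of relying on the character count.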
Did this cause Unicode characters to display as ? or scrambled characters for you? I couldn't get Windows to display the Unicode examples, even with multi-byte character support set to true for my table.
@TheMaverickProgrammer I don't have a Windows machine on me right now so this is hard to test. What code page are you using on your Windows console?
There is a code page change you can make with regedit to support UTF-8.
There are other accepted responses that point out that chcp 65001 is very dangerous and recommend installing a different console font that has better support.
Lastly, this Microsoft blog post suggests that if you're using the Windows 10 October 2018 Update (build 1809), your console should already support outputting UTF-8 just fine.
And as for this issue, I know the current implementation doesn't handle multi-byte characters correctly and so if you've got your console sorted out (i.e., it renders UTF-8 characters just fine), then the current behavior with tabulate is: "It renders the characters fine but my table isn't aligned."
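For the console side of the problem, the output code page can also be switched from inside the program rather than through chcp or the registry. The sketch below is only an illustration under assumptions: enable_utf8_console_output is a hypothetical helper, not part of tabulate, it uses the Win32 call SetConsoleOutputCP, and it shares the caveats raised above about code page 65001 and console fonts. On other platforms it does nothing.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper, not part of tabulate.
// On Windows, asks the attached console to interpret program output as UTF-8
// (code page 65001). Returns false if the call fails, for example when no
// console is attached. On other platforms it is a no-op that returns true.
bool enable_utf8_console_output() {
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}
```

Calling this once at startup, before printing a table, changes only how bytes are decoded by the console. It does not affect alignment, which still depends on computing the display width of each cell.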
https://dbj.org/c-windows-unicode-console-output/
Just adding onto this: while Windows does support things like printf("┏"); and renders it correctly in the console, putting that character into one of the corner format functions appears to output additional character codes after it in the corners.