How to display Chinese characters correctly in pandas profiling?

Open hengzhe-zhang opened this issue 5 years ago • 12 comments

I find that when I render non-ASCII characters, pandas profiling will not render them correctly. I have tried to modify the default rendering font of matplotlib. However, even though I can manually render a correct figure by using matplotlib, the figure rendered by the pandas profiler is still wrong. How can I solve this problem?

hengzhe-zhang avatar Sep 17 '20 09:09 hengzhe-zhang

Hello @zhenlingcn,

Matplotlib unfortunately doesn't support Chinese characters by default. This can be overcome by changing the font.

See: https://jdhao.github.io/2017/05/13/guide-on-how-to-use-chinese-with-matplotlib/ https://medium.com/@hoishing/using-chinese-characters-in-matplotlib-5c49dbb6a2f7
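
As a minimal sketch of the font change described above: the snippet below picks the first installed font from a list of candidate CJK families and moves it to the front of matplotlib's sans-serif list. The candidate names and the helpers `pick_cjk_font` and `configure_cjk_font` are assumptions for illustration, not part of matplotlib or pandas-profiling; adjust the list to the fonts available on your system.

```python
import matplotlib as mpl
from matplotlib import font_manager

# Assumed candidate families; which ones exist depends on the operating system.
CANDIDATE_FONTS = [
    "SimHei",
    "Microsoft YaHei",
    "Noto Sans CJK SC",
    "WenQuanYi Micro Hei",
    "PingFang SC",
]


def pick_cjk_font(candidates=CANDIDATE_FONTS):
    """Return the first candidate family that matplotlib knows about, or None."""
    installed = {entry.name for entry in font_manager.fontManager.ttflist}
    for family in candidates:
        if family in installed:
            return family
    return None


def configure_cjk_font(candidates=CANDIDATE_FONTS):
    """Put an installed CJK font first in the sans-serif fallback list."""
    family = pick_cjk_font(candidates)
    if family is not None:
        current = [f for f in mpl.rcParams["font.sans-serif"] if f != family]
        mpl.rcParams["font.family"] = "sans-serif"
        mpl.rcParams["font.sans-serif"] = [family] + current
        # Some CJK fonts lack the Unicode minus glyph, so use the ASCII hyphen.
        mpl.rcParams["axes.unicode_minus"] = False
    return family


chosen = configure_cjk_font()
print("Using font:", chosen)
```

If `chosen` is `None`, none of the candidate fonts is installed, and changing rcParams alone will not fix the rendering.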

If anyone is interested in adding this as a feature/config param, then a pull request is welcome.

Same as: https://github.com/pandas-profiling/pandas-profiling/issues/572

sbrugman avatar Sep 17 '20 16:09 sbrugman

I have configured matplotlib correctly. However, it seems like pandas profiling does not use the default configuration of matplotlib.
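
One possible explanation, which this thread does not confirm, is that the report's plotting code applies its own matplotlib settings after yours, so a global rcParams change is overridden while the figure is drawn. The sketch below uses only matplotlib and a hypothetical `library_plot` function to show that effect; it is not pandas-profiling's actual code.

```python
import matplotlib as mpl

# The user's global configuration, set before any plotting happens.
mpl.rcParams["font.sans-serif"] = ["SimHei"]


def library_plot():
    """Hypothetical stand-in for a library function that applies its own style."""
    with mpl.rc_context({"font.sans-serif": ["Arial"]}):
        # Anything drawn here sees the library's fonts, not the user's.
        return list(mpl.rcParams["font.sans-serif"])


fonts_during_plot = library_plot()
fonts_after_plot = list(mpl.rcParams["font.sans-serif"])

print(fonts_during_plot)  # ['Arial']
print(fonts_after_plot)   # ['SimHei']
```

If something like this is happening, the font has to be set inside the overriding context, which may be why a figure drawn directly with matplotlib looks right while the one in the report does not.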

hengzhe-zhang avatar Sep 18 '20 01:09 hengzhe-zhang

@zhenlingcn Could you share that config? Then we'll try to find a solution.

sbrugman avatar Sep 18 '20 20:09 sbrugman

Update: pandas-profiling v3 will use Altair which is not expected to have character issues.

sbrugman avatar Sep 19 '20 17:09 sbrugman

With all due respect, when will version 3 be released?

Blanket58 avatar Dec 28 '20 07:12 Blanket58

I have the same problem. I changed the config, and the text renders correctly when I use matplotlib alone, but not in pandas profiling. @sbrugman

lk137095576 avatar Aug 25 '21 13:08 lk137095576

I solved this problem on my server, but the method is crude: it hard-codes the font, so it does not suit anyone who wants the font to be configurable. If you hit this problem, modify plot.py and missing.py under pandas_profiling/visualisation in your installed Python packages, and add these two lines at the start of every plotting function:

    mpl.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese characters
    mpl.rcParams['axes.unicode_minus'] = False    # display the minus sign correctly

Then add the import from matplotlib.pylab import mpl at the top of each file. After that, the plots render Chinese correctly. It is a clumsy fix, so if there is a better way, please let me know. @lk137095576
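
The two rcParams lines only help if matplotlib can actually find the SimHei font, which may not be installed on a Linux server or a hosted notebook. A hedged sketch of registering a font file at runtime follows; `register_font` and `use_font` are hypothetical helper names, and the path in the usage comment is a placeholder. It relies on `fontManager.addfont`, which exists in recent matplotlib releases.

```python
from pathlib import Path

import matplotlib as mpl
from matplotlib import font_manager


def register_font(font_path):
    """Register a .ttf/.otf file with matplotlib and return its family name."""
    path = Path(font_path)
    if not path.is_file():
        raise FileNotFoundError(f"Font file not found: {path}")
    font_manager.fontManager.addfont(str(path))
    # Read the family name from the file itself instead of guessing it.
    return font_manager.FontProperties(fname=str(path)).get_name()


def use_font(family):
    """Put the given family first in matplotlib's sans-serif list."""
    others = [f for f in mpl.rcParams["font.sans-serif"] if f != family]
    mpl.rcParams["font.family"] = "sans-serif"
    mpl.rcParams["font.sans-serif"] = [family] + others
    mpl.rcParams["axes.unicode_minus"] = False


# Usage (placeholder path, replace with a real CJK font file):
# family = register_font("path_to_chinese_font.ttf")
# use_font(family)
```

This only makes the font available to matplotlib; whether the report picks it up still depends on where the rcParams are applied.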

gsy44355 avatar Aug 26 '21 09:08 gsy44355

Is there an update? I cannot implement this solution if using remote (cloud) resources like Google Colab.

cloudy-sfu avatar Dec 03 '21 08:12 cloudy-sfu

The workaround from @gsy44355 above does not work for me.

JonyJiang123 avatar May 18 '23 06:05 JonyJiang123

Adding a font_path argument to the WordCloud call in plot.py works for me:

    wordcloud = WordCloud(
        background_color="white",
        random_state=123,
        width=300,
        height=200,
        scale=2,
        # Replace 'path_to_chinese_font.ttf' with the path to a font that supports Chinese characters
        font_path='path_to_chinese_font.ttf',
    ).generate_from_frequencies(word_dict)
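
To avoid hard-coding the font path, one option is to ask matplotlib where an installed CJK font lives and pass that path on. This is a sketch under assumptions: `find_font_path` and `make_wordcloud` are hypothetical helpers, the candidate family names are examples, and the wordcloud package must be installed for the second function to run.

```python
from matplotlib import font_manager


def find_font_path(candidates):
    """Return the file path of the first installed candidate family, or None."""
    for family in candidates:
        try:
            # fallback_to_default=False makes findfont raise instead of
            # silently returning matplotlib's default font.
            return font_manager.findfont(family, fallback_to_default=False)
        except ValueError:
            continue
    return None


def make_wordcloud(word_freq, font_path=None):
    """Build a word cloud from a {word: frequency} dict using the given font."""
    from wordcloud import WordCloud  # third-party; imported lazily

    return WordCloud(
        background_color="white",
        random_state=123,
        width=300,
        height=200,
        scale=2,
        font_path=font_path,  # None falls back to wordcloud's bundled font
    ).generate_from_frequencies(word_freq)


cjk_font_path = find_font_path(["SimHei", "Noto Sans CJK SC", "WenQuanYi Micro Hei"])
print("CJK font path:", cjk_font_path)
```

If `cjk_font_path` is `None`, no candidate is installed, and the word cloud will still show missing glyphs for Chinese text.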

whxue0807 avatar Nov 23 '23 08:11 whxue0807