
Spidermon's FileReport does not support Chinese

Open bytebuff opened this issue 4 years ago • 2 comments

Hi, I'm creating a report using spidermon's CreateFileReport:

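from spidermon import MonitorSuite
from spidermon.contrib.actions.reports.files import CreateFileReport

# The three monitor classes below are assumed to be defined or imported elsewhere.
class SpiderCloseMonitorSuite(MonitorSuite):
    monitors = [ItemCountMonitor, ItemValidationMonitor, PeriodicJobStatsMonitor]

    monitors_finished_actions = [CreateFileReport]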

But it turns out that Chinese is not supported. I know this is caused by jinja2, so I changed the encoding in the default template from utf-8 to gb18030:

Python37\Lib\site-packages\spidermon\contrib\actions\reports\templates\reports\email\bases\report\base.jinja

<html>
    <head>
        <meta charset="gb18030">
        <style>
            {% include 'reports/email/bases/report/email.css' %}
            {% include 'reports/email/bases/report/report.css' %}
            {% block page_styles %}
            {% endblock %}
        </style>
    </head>
    <body>
        <table class="report-container" width="100%" border="0" cellspacing="0" cellpadding="0">
            <tr>
                <td align="center">
                    <div class="report {% block page_class %}{% endblock %}">
                        {% block page_content %}{% endblock %}
                    </div>
                    <div class="report-footer">
                        Report timestamp: {{ datetime.datetime.utcnow().strftime('%x %X UTC') }}
                    </div>
                </td>
            </tr>
        </table>
    </body>
</html>

Then it works.

bytebuff · Jul 03 '20 07:07

Are you sure you are writing your Chinese text in UTF-8? Any chance you can provide a minimal, reproducible example?

Gallaecio · Jul 06 '20 09:07

So I changed the encoding in the default template. utf-8 -> gb18030

It is expected that if you decide to use GB18030 (or any other non-UTF-8 encoding), you'll need to have the full chain (content, template, code, browser, etc.) configured in this specific encoding as well.

Like Scrapy and many other Python libraries, Spidermon supports only UTF-8 by default. I assume it should work properly if your Chinese content (or content in any other language) is prepared in UTF-8.

starrify · Jul 06 '20 09:07
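
The point about keeping the whole chain in one encoding can be illustrated with a minimal, standard-library-only sketch. It does not use Spidermon or Jinja2, and the helper names `write_report` and `read_as` and the sample text are hypothetical. It writes a small HTML page as UTF-8, then decodes the same bytes once as UTF-8 and once as GB18030, roughly what a browser does when the declared charset does or does not match the bytes on disk.

```python
import tempfile
from pathlib import Path

TEXT = "中文报告"  # sample Chinese content (hypothetical, for illustration only)


def write_report(path: Path, body: str, encoding: str) -> None:
    """Write a tiny HTML page whose bytes and <meta charset> use the same encoding."""
    html = (
        f'<html><head><meta charset="{encoding}"></head>'
        f"<body>{body}</body></html>"
    )
    # Pass the encoding explicitly rather than relying on the platform default.
    path.write_text(html, encoding=encoding)


def read_as(path: Path, encoding: str) -> str:
    """Decode the raw file bytes the way a viewer assuming `encoding` would."""
    return path.read_bytes().decode(encoding, errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.html"
    write_report(report, TEXT, "utf-8")

    # Bytes written as UTF-8 and read as UTF-8: the text survives.
    matched = read_as(report, "utf-8")
    # Same bytes read as GB18030: the Chinese text turns into mojibake.
    mismatched = read_as(report, "gb18030")

print("intact when encodings match:", TEXT in matched)
print("intact when encodings differ:", TEXT in mismatched)
```

If a report only renders correctly after switching the template to gb18030, one plausible explanation (an assumption, not confirmed in this thread) is that the file was written with the platform's default encoding instead of UTF-8, so the bytes on disk did not match the charset the template declared.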