
How to render JavaScript in a Flask endpoint?

Open BenjaminSchmitt opened this issue 5 years ago • 5 comments

Hi, I would like to render JavaScript inside a Flask endpoint. The problem is that in a multithreaded environment, the page is not rendered (due to nested threading, if I'm right).

Here is a small example that illustrates the problem.

import asyncio
import json

from flask import Flask, Response
from requests_html import AsyncHTMLSession

app = Flask(__name__)


async def get_pyclock():
    asession = AsyncHTMLSession()
    r = await asession.get('https://pythonclock.org/')
    await r.html.arender()
    periods = [element.text for element in r.html.find('.countdown-period')]
    amounts = [element.text for element in r.html.find('.countdown-amount')]
    countdown_data = dict(zip(periods, amounts))
    return countdown_data


@app.route("/", methods=['GET'])
def test():
    loop = asyncio.new_event_loop()
    results = loop.run_until_complete(get_pyclock())
    return Response(json.dumps(results), 200)


# app.run(host="127.0.0.1", port="8080")  # 'ValueError: signal only works in main thread'
app.run(host="127.0.0.1", port="8080", threaded=False)  # works

Is there a way to keep threading enabled in Flask and still render the page correctly?

Many thanks.

BenjaminSchmitt avatar Apr 29 '19 09:04 BenjaminSchmitt

Any help, please?

BenjaminSchmitt avatar May 14 '19 11:05 BenjaminSchmitt

I have the same issue.

jasonniebauer avatar May 20 '19 01:05 jasonniebauer

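This issue tipped me off to a possible fix: https://github.com/psf/requests-html/issues/326. The root of the error is that pyppeteer tries to install signal handlers (SIGINT, SIGTERM, SIGHUP) when it launches the browser, and Python only allows signal handlers to be set from the main thread. With threading enabled, Flask handles the request in a secondary thread, so the launch raises the ValueError. To work around this, we can stop pyppeteer from installing these handlers in the first place.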

import asyncio
import json

import pyppeteer
from flask import Flask, Response
from requests_html import AsyncHTMLSession

class AsyncHTMLSessionFixed(AsyncHTMLSession):
    """
    pip3 install websockets==6.0 --force-reinstall
    """
    def __init__(self, **kwargs):
        super(AsyncHTMLSessionFixed, self).__init__(**kwargs)
        self.__browser_args = kwargs.get("browser_args", ["--no-sandbox"])

    @property
    async def browser(self):
        if not hasattr(self, "_browser"):
            self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, handleSIGINT=False, handleSIGTERM=False, handleSIGHUP=False, args=self.__browser_args)

        return self._browser

app = Flask(__name__)

async def get_pyclock():
    asession = AsyncHTMLSessionFixed()  # use the subclass so the patched browser property applies
    r = await asession.get('https://pythonclock.org/')
    await r.html.arender()
    periods = [element.text for element in r.html.find('.countdown-period')]
    amounts = [element.text for element in r.html.find('.countdown-amount')]
    countdown_data = dict(zip(periods, amounts))
    return countdown_data

@app.route("/", methods=['GET'])
def test():
    loop = asyncio.new_event_loop()
    results = loop.run_until_complete(get_pyclock())
    return Response(json.dumps(results), 200)

app.run(host="127.0.0.1", port="8080")
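
The underlying restriction can be reproduced without Flask or pyppeteer. The following is a minimal sketch using only the standard library; the helper names `try_install_handler` and `run_in_worker_thread` are made up for illustration. It attempts to register a SIGINT handler from the current thread and then from a worker thread, which is roughly the situation a threaded Flask request is in.

```python
import signal
import threading


def try_install_handler():
    """Try to register a SIGINT handler; return the error text, or None on success."""
    try:
        # default_int_handler is Python's usual SIGINT handler, so a successful
        # call here normally leaves the process behaving as before.
        signal.signal(signal.SIGINT, signal.default_int_handler)
        return None
    except ValueError as exc:
        return str(exc)


def run_in_worker_thread(func):
    """Run func in a separate thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(func()))
    worker.start()
    worker.join()
    return result[0]


# Called from whatever thread runs this script (normally the main thread).
current_thread_error = try_install_handler()
# Called from a worker thread, as happens inside a threaded Flask request.
worker_thread_error = run_in_worker_thread(try_install_handler)

print("current thread:", current_thread_error)
print("worker thread:", worker_thread_error)
```

The worker-thread call should report a ValueError, which is why passing handleSIGINT=False, handleSIGTERM=False and handleSIGHUP=False to pyppeteer.launch avoids the problem.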

piercefreeman avatar Sep 29 '19 16:09 piercefreeman

I have the same issue: my endpoint calls a function that renders a page's JavaScript, and I get the same error. More details on my error are on Stack Overflow: link

xChapx avatar Oct 28 '19 05:10 xChapx

My solution:

  1. Find the browser function in the installed requests_html source.
  2. Replace the line "self._browser = ......" with "self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, handleSIGINT=False, handleSIGTERM=False, handleSIGHUP=False, args=self.__browser_args)". It works for me.
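
If you would rather not edit the installed library, the same change can probably be applied from your own code by wrapping the launch function. This is only a sketch: `without_signal_handlers` and `fake_launch` are hypothetical names, and `fake_launch` stands in for pyppeteer.launch so the example runs without a browser. It assumes requests_html looks up `pyppeteer.launch` at call time, which I have not verified for every version.

```python
import asyncio
import functools


def without_signal_handlers(launch):
    """Wrap a pyppeteer-style launch coroutine so the signal options are always off."""
    @functools.wraps(launch)
    async def wrapper(*args, **kwargs):
        # Force the three options off, overriding anything the caller passed.
        kwargs.update(handleSIGINT=False, handleSIGTERM=False, handleSIGHUP=False)
        return await launch(*args, **kwargs)
    return wrapper


# Hypothetical stand-in for pyppeteer.launch: it just returns the options it received.
async def fake_launch(**options):
    return options


patched = without_signal_handlers(fake_launch)
options = asyncio.run(patched(headless=True, handleSIGINT=True))
print(options)

# Assumed real usage, done once at startup before any page is rendered:
# import pyppeteer
# pyppeteer.launch = without_signal_handlers(pyppeteer.launch)
```

The upside compared with editing the source is that the change survives a reinstall of requests-html; the downside is that it affects every caller of pyppeteer.launch in the process.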

Hokage-Itachi avatar Apr 22 '21 07:04 Hokage-Itachi