NotImplemented error while running in windows.

Open desainad opened this issue 1 year ago • 5 comments

Describe the bug:

I am trying to develop a web scraper on Windows using Streamlit and ScrapeGraphAI. It gives an error:

2024-05-19 12:12:07.798 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 600, in _run_script
    exec(code, module.__dict__)
  File "D:\My Python Projects\SiteScrapers\ai_scraper.py", line 31, in <module>
    result=smart_scraper_graph.run()
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\scrapegraphai\graphs\smart_scraper_graph.py", line 109, in run
    self.final_state, self.execution_info = self.graph.execute(inputs)
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\scrapegraphai\graphs\base_graph.py", line 107, in execute
    result = current_node.execute(state)
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\scrapegraphai\nodes\fetch_node.py", line 88, in execute
    document = loader.load()
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\langchain_core\document_loaders\base.py", line 29, in load
    return list(self.lazy_load())
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\langchain_community\document_loaders\chromium.py", line 76, in lazy_load
    html_content = asyncio.run(self.ascrape_playwright(url))
  File "C:\Python310\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Python310\lib\asyncio\base_events.py", line 646, in run_until_complete
    return future.result()
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\langchain_community\document_loaders\chromium.py", line 52, in ascrape_playwright
    async with async_playwright() as p:
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\playwright\async_api\_context_manager.py", line 46, in __aenter__
    playwright = AsyncPlaywright(next(iter(done)).result())
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\playwright\_impl\_transport.py", line 120, in connect
    self._proc = await asyncio.create_subprocess_exec(
  File "C:\Python310\lib\asyncio\subprocess.py", line 218, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "C:\Python310\lib\asyncio\base_events.py", line 1667, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "C:\Python310\lib\asyncio\base_events.py", line 498, in _make_subprocess_transport
    raise NotImplementedError
NotImplementedError

To Reproduce:

Steps to reproduce the behavior:

  1. command: streamlit run ai_scraper.py

Expected behavior:

I wanted to give it an input URL (which it accepts fine) and a description of what I want extracted, and have it return the results.

Desktop (please complete the following information):

  • OS: Windows 10 Pro 22H2
  • Browser Chrome
  • Version 124.0.6367.203

Here is the simple code: file name - ai_scraper.py

import streamlit as st
from scrapegraphai.graphs import SmartScraperGraph

st.title('Web Scraping AI Assistant')
st.caption('This app allows you to scrape a website using openAI API')

# Set up the configuration for the SmartScraperGraph
graph_config = {
    "llm": {
        "model": "ollama/llama3",
        "temperature": 0,
        "format": "json",
        "base_url": "http://localhost:11434",  # set ollama url
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": "http://localhost:11434",
    },
    "verbose": True,
}

# ------ Get the url of the website to scrape ------
url = st.text_input("Enter the url of the site you want to scrape")
# ------ Get the user prompt ------
user_prompt = st.text_input("What you want the assistant to scrape from the ui?")

# ------ Create a SmartScraperGraph object ------
smart_scraper_graph = SmartScraperGraph(prompt=user_prompt, source=url, config=graph_config)
if st.button("scrape"):
    result = smart_scraper_graph.run()
    st.write(result)

desainad avatar May 19 '24 06:05 desainad

Hey, try SearchGraph and see if you still have the problem. If not, then it is an asyncio problem in SmartScraper and we will fix it.

PeriniM avatar May 19 '24 12:05 PeriniM

I have the same issue

Ravel226 avatar May 20 '24 08:05 Ravel226

I have the same issue

playwright = AsyncPlaywright(next(iter(done)).result())
raise self._exception.with_traceback(self._exception_tb)
self._proc = await asyncio.create_subprocess_exec(
transport, protocol = await loop.subprocess_exec(
transport = await self._make_subprocess_transport(
raise NotImplementedError
NotImplementedError

Edit: I switched from FastAPI to Flask and it's fixed.

Shivansh-yadav13 avatar May 20 '24 09:05 Shivansh-yadav13

> Hey, try SearchGraph and see if you still have the problem. If not, then it is an asyncio problem in SmartScraper and we will fix it.

I'm getting the same error with SearchGraph as well. How can this be resolved?

desainad avatar May 20 '24 14:05 desainad

@Shivansh-yadav13 Do you know what the problem was?

Ravel226 avatar May 20 '24 14:05 Ravel226

Hello @Ravel226, I'm not sure what exactly the issue is, but whenever I use it inside a framework I get this error. I recently tried it with the Reflex framework and faced the same issue.

When I use it in a simple Python program, it works fine.

Shivansh-yadav13 avatar May 24 '24 14:05 Shivansh-yadav13
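
The contrast between "works as a plain script" and "fails inside a framework" would be consistent with the host process choosing a different asyncio event loop on Windows. The following is a minimal diagnostic sketch using only the standard library; the function name `describe_event_loop` is made up for illustration and is not part of any library mentioned in this thread.

```python
import asyncio
import sys


def describe_event_loop() -> dict:
    """Report the platform and the class of event loop asyncio creates right now."""
    loop = asyncio.new_event_loop()  # built by whatever policy is currently installed
    try:
        return {"platform": sys.platform, "loop_class": type(loop).__name__}
    finally:
        loop.close()  # the loop was only needed for inspection


info = describe_event_loop()
print(info)
```

Run it once as a plain script and once from inside the framework (for example, at the top of the Streamlit script). On Windows, a loop class with "Selector" in its name does not support subprocesses, which Playwright needs in order to start its driver, while a "Proactor" loop does.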

I think I will build my own Streamlit app from scratch. I hope it will work.

Ravel226 avatar May 24 '24 14:05 Ravel226

Hi all,

Has anybody solved this issue? If so, please guide me on it. Thanks.

shami2017 avatar May 26 '24 14:05 shami2017

hi, the app where you can find this problem could be this one https://github.com/ScrapeGraphAI/Scrapegraph-LabLabAI-Hackathon

VinciGit00 avatar Jun 04 '24 08:06 VinciGit00

> hi, the app where you can find this problem could be this one https://github.com/ScrapeGraphAI/Scrapegraph-LabLabAI-Hackathon

I opened an issue there, but you closed that one too. I don't know where the problem is.

Ravel226 avatar Jun 04 '24 15:06 Ravel226

I'm facing the same problem. I see it's not resolved yet, or that a Python file works where a Jupyter notebook does not. But I see it as a bigger issue for experimenters.

AliHaider20 avatar Jun 16 '24 11:06 AliHaider20

I encountered this issue specifically on Windows. The root cause is discussed in detail here: "Why am I getting NotImplementedError with async and await on Windows?"

It appears to be a Windows-specific problem. Switching to ProactorEventLoop() resolved it for me. You can find more details on this workaround here: "Using Playwright with Streamlit".

Here’s the code that worked for me:

import streamlit as st
import asyncio
from playwright.async_api import async_playwright

st.write("Starting the test…")

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("http://playwright.dev")
        title = await page.title()
        st.write(title)
        await browser.close()
        return title

if __name__ == '__main__':
    # ProactorEventLoop exists only on Windows and supports the subprocess Playwright starts
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)
    title = loop.run_until_complete(main())
    print(title)

Please note that this solution is only applicable for Windows OS. If you’re running this in a cloud environment with Linux, ensure you install the required dependencies for Playwright by following these steps as outlined in the Playwright documentation:

pip install playwright
playwright install --with-deps

AndersonVD avatar Aug 14 '24 14:08 AndersonVD
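
The script above drives Playwright directly, whereas the traceback in the original report shows the library calling `asyncio.run()` internally, so the caller never gets to construct the loop. One possible variation, offered as a sketch and not as a confirmed fix for this issue, is to change the event loop policy before calling into the library. `ensure_subprocess_capable_policy` and `spawn_probe` are hypothetical helper names; `asyncio.WindowsProactorEventLoopPolicy` is a standard-library class that is only available on Windows, which is why the call is guarded.

```python
import asyncio
import sys


def ensure_subprocess_capable_policy() -> bool:
    """On Windows, install the Proactor policy so new loops can spawn subprocesses.

    Returns True if the policy was changed, False if nothing had to be done.
    """
    if sys.platform != "win32":
        return False  # Unix event loops already support subprocesses
    asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())
    return True


async def spawn_probe() -> int:
    """Spawn a trivial child process, the same kind of call that failed in the traceback."""
    proc = await asyncio.create_subprocess_exec(sys.executable, "-c", "pass")
    return await proc.wait()


ensure_subprocess_capable_policy()
# asyncio.run() builds its loop from the installed policy, as library code would
exit_code = asyncio.run(spawn_probe())
print("child exit code:", exit_code)
```

In the Streamlit script from the original report, the call to `ensure_subprocess_capable_policy()` would go before `smart_scraper_graph.run()`. Whether a given framework later resets the policy is not something this thread establishes, so treat the result as something to verify locally.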

> I'm facing the same problem. I see it's not resolved yet, or that a Python file works where a Jupyter notebook does not. But I see it as a bigger issue for experimenters.

[Windows 11; Python 3.12] Yes, there is a Jupyter Notebook issue. Running the same script as a ".py" file has no issues.

datalifenyc avatar Sep 01 '24 14:09 datalifenyc

> > I'm facing the same problem. I see it's not resolved yet, or that a Python file works where a Jupyter notebook does not. But I see it as a bigger issue for experimenters.

> [Windows 11; Python 3.12] Yes, there is a Jupyter Notebook issue. Running the same script as a ".py" file has no issues.

Same here: it works well with .py files, but notebooks are still not working.

MatheusRDG avatar Sep 25 '24 15:09 MatheusRDG

> > > I'm facing the same problem. I see it's not resolved yet, or that a Python file works where a Jupyter notebook does not. But I see it as a bigger issue for experimenters.

> > [Windows 11; Python 3.12] Yes, there is a Jupyter Notebook issue. Running the same script as a ".py" file has no issues.

> Same here: it works well with .py files, but notebooks are still not working.

Just add the following lines at the top of your Python code that uses scrapegraph-ai:

import nest_asyncio
nest_asyncio.apply()

Ravel226 avatar Sep 25 '24 20:09 Ravel226
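
For context on the notebook case: a Jupyter kernel already has an event loop running, and `asyncio.run()` refuses to start inside a running loop, which is the situation `nest_asyncio` patches around. A different, standard-library-only approach (not the one suggested above) is to make the blocking call on a worker thread, where no loop is running. The sketch below uses hypothetical names throughout: `call_outside_running_loop` is an illustrative helper, `library_call` stands in for something like `smart_scraper_graph.run()`, and `notebook_cell` imitates code executing inside the kernel's loop.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def call_outside_running_loop(blocking_fn, *args, **kwargs):
    """Run a blocking callable on a worker thread and return its result."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(blocking_fn, *args, **kwargs).result()


def library_call() -> str:
    """Stand-in for a library function that calls asyncio.run() internally."""
    async def fetch() -> str:
        await asyncio.sleep(0)
        return "scraped"

    return asyncio.run(fetch())  # would raise RuntimeError inside a running loop


async def notebook_cell() -> str:
    """Imitates a notebook cell: this code executes while a loop is already running."""
    return call_outside_running_loop(library_call)


result = asyncio.run(notebook_cell())
print(result)
```

The outer loop is blocked while the worker thread runs, which is acceptable for a one-off scrape but not for a responsive application. On Windows this only addresses the nested-loop restriction; whether the worker thread's loop can spawn subprocesses still depends on the installed event loop policy, so it may need to be combined with a Proactor-based setup.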