Scrapegraph-ai
NotImplemented error while running in windows.
Describe the bug:
I am trying to develop a web scraper on windows using streamlit and scrapegraph ai. It gives an error:
2024-05-19 12:12:07.798 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 600, in _run_script
exec(code, module.__dict__)
File "D:\My Python Projects\SiteScrapers\ai_scraper.py", line 31, in
To reproduce:
- command: streamlit run ai_scraper.py
Expected behavior:
I wanted to give it an input URL, which it accepts fine, tell it what I want it to output, and have it return the results.
Desktop (please complete the following information):
- OS: Windows 10 Pro 22H2
- Browser Chrome
- Version 124.0.6367.203
Here is the simple code: file name - ai_scraper.py
import streamlit as st
from scrapegraphai.graphs import SmartScraperGraph

st.title('Web Scraping AI Assistant')
st.caption('This app allows you to scrape a website using openAI API')

# Set up the configuration for the SmartScraperGraph
graph_config = {
    "llm": {
        "model": "ollama/llama3",
        "temperature": 0,
        "format": "json",
        "base_url": "http://localhost:11434",  # set ollama url
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": "http://localhost:11434",
    },
    "verbose": True,
}

# ------ Get the url of the website to scrape ------
url = st.text_input("Enter the url of the site you want to scrape")

# ------ Get the user prompt ------
user_prompt = st.text_input("What you want the assistant to scrape from the ui?")

# ------ Create a SmartScraperGraph object ------
smart_scraper_graph = SmartScraperGraph(prompt=user_prompt, source=url, config=graph_config)

if st.button("scrape"):
    result = smart_scraper_graph.run()
    st.write(result)
Hey, try SearchGraph and see if you still have the problem. If not, then it is an asyncio problem in SmartScraper and we will fix it.
I have the same issue
I have the same issue
playwright = AsyncPlaywright(next(iter(done)).result())
raise self._exception.with_traceback(self._exception_tb)
self._proc = await asyncio.create_subprocess_exec(
transport, protocol = await loop.subprocess_exec(
transport = await self._make_subprocess_transport(
raise NotImplementedError
NotImplementedError
Edit: I switched from FastAPI to Flask and it's fixed.
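A note on reading the traceback above: the final frames are inside asyncio's subprocess machinery, and on Windows the selector-based event loop does not implement subprocess support, while the proactor loop does. The sketch below is one way to check which loop type your framework is actually running. The helper name `describe_loop` is hypothetical and not part of scrapegraph-ai or Playwright.

```python
import asyncio
import sys


def describe_loop(loop):
    """Report the loop class and whether it can be expected to spawn subprocesses."""
    name = type(loop).__name__
    # On Windows, only the proactor loop implements subprocess transports.
    if sys.platform == "win32" and not isinstance(loop, asyncio.ProactorEventLoop):
        return f"{name}: cannot spawn a subprocess on Windows"
    return f"{name}: subprocess support expected"


async def main():
    # Inspect the loop that is running this coroutine.
    return describe_loop(asyncio.get_running_loop())


report = asyncio.run(main())
print(report)
```

Calling `describe_loop(asyncio.get_running_loop())` from inside a FastAPI handler or similar async callback should show whether the framework has installed a selector loop, which would be consistent with the error disappearing after a framework switch.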
Getting the same error with searchgraph as well. How can this be resolved?
@Shivansh-yadav13 Do you know what the problem was?
Hello @Ravel226, I'm not sure what exactly the issue is, but whenever I use it with a framework I get this error. I was recently trying it with the Reflex framework and I'm facing the same issue.
When I use it in a simple Python program it works fine.
I think I will build my own Streamlit app from scratch. I hope it will work.
Hi @all,
Has anybody solved this issue? If yes, please guide me on it. Thanks.
hi, the app where you can find this problem could be this one https://github.com/ScrapeGraphAI/Scrapegraph-LabLabAI-Hackathon
I opened an issue there, but you closed that one too. I don't know where the problem is.
I'm facing the same problem. I see it's not resolved yet, other than by running the code as a Python file instead of in a Jupyter notebook. But I see that as a bigger issue for experimenters.
I encountered this issue specifically on Windows. The root cause is discussed in detail in "Why am I getting NotImplementedError with async and await on Windows?".
It appears to be a Windows-specific problem. Switching to ProactorEventLoop() resolved it for me. More details on this workaround are in "Using Playwright with Streamlit".
Here’s the code that worked for me:
import streamlit as st
import asyncio
from playwright.async_api import async_playwright

st.write("Starting the test…")

async def main():
    # Launch a headless Chromium, load the page and report its title
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto("http://playwright.dev")
        title = await page.title()
        st.write(title)
        await browser.close()
        return title

if __name__ == '__main__':
    # ProactorEventLoop supports subprocesses on Windows (Windows only)
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)
    title = loop.run_until_complete(main())
    print(title)
Please note that this solution is only applicable for Windows OS. If you’re running this in a cloud environment with Linux, ensure you install the required dependencies for Playwright by following these steps as outlined in the Playwright documentation:
pip install playwright
playwright install --with-deps
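Since the ProactorEventLoop workaround only exists on Windows, a script that must run on both Windows and Linux could pick the loop by platform. This is a minimal sketch under that assumption. The helper `new_subprocess_capable_loop` and the `demo` coroutine are hypothetical names for illustration, and the subprocess call stands in for Playwright launching its driver.

```python
import asyncio
import sys


def new_subprocess_capable_loop():
    """Create an event loop that can spawn subprocesses on this platform."""
    if sys.platform == "win32":
        # The proactor loop is the Windows loop with subprocess support.
        return asyncio.ProactorEventLoop()
    # On Linux and macOS the default loop already supports subprocesses.
    return asyncio.new_event_loop()


async def demo():
    # Spawn a child Python process, similar in kind to what Playwright does.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('ok')",
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return out.decode().strip()


loop = new_subprocess_capable_loop()
try:
    result = loop.run_until_complete(demo())
finally:
    loop.close()
print(result)
```

In a real app, `demo()` would be replaced by the coroutine that drives Playwright or the scraper.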
[Windows 11; Python 3.12] Yes, there is a Jupyter Notebook issue. Running the same script as a ".py" file has no issues.
Same here: it works well in .py files, but notebooks are still not working.
Just add the following lines at the top of your code that uses scrapegraph-ai (in Python):
import nest_asyncio
nest_asyncio.apply()
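For context on why this can help in notebooks: a Jupyter cell already runs inside an event loop, and plain asyncio refuses to start a second one from within it. The standard-library sketch below reproduces that refusal, which is presumably the situation `nest_asyncio.apply()` patches around. The coroutine names are hypothetical.

```python
import asyncio


async def inner():
    return "done"


async def outer():
    # A loop is already running here, much like inside a notebook cell.
    coro = inner()
    try:
        asyncio.run(coro)  # trying to start a nested loop
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```

Note that this only illustrates the nested-loop restriction. It does not change which loop type is in use, so the Windows subprocess limitation discussed earlier in the thread is a separate matter.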