AGiXT
Requirements.txt Updates
Problem Description
We're being negatively impacted by other modules we depend on forcing specific versions of shared dependencies, and by modules introducing breaking changes such as renamed functions and classes.
Proposed Solution
We need to update our requirements.txt to pin versions that are known to work with our software.
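For example, the pinned entries could look something like this (the version numbers below are illustrative placeholders, not the actual tested versions, and the list would need to cover everything we import directly):

# requirements.txt -- illustrative pins only, substitute the versions we actually verified
duckduckgo-search==3.0.2
streamlit==1.22.0
selenium==4.9.1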
Alternatives Considered
Yelling at everyone else doing this or making breaking changes to their modules. Seems like a waste of time though.
Additional Context
- gpt4free forcing the Streamlit version has broken things for some people.
- A duckduckgo-search update renamed every function and class in the module, and restoring functionality required code changes on our side (a rough compatibility shim is sketched after this list).
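To illustrate the second point, here is a minimal, hedged sketch of an import-time shim that tolerates both the old and the new duckduckgo_search interfaces. The exact names (the old module-level ddg() function versus the newer DDGS class) are from memory and may not match every release, so treat it as the shape of the problem rather than a drop-in fix:

from itertools import islice

try:
    from duckduckgo_search import DDGS  # newer releases expose a DDGS class

    def web_search(query: str, max_results: int = 5):
        # DDGS.text() yields result dicts; islice avoids guessing its keyword arguments
        with DDGS() as ddgs:
            return list(islice(ddgs.text(query), max_results))
except ImportError:
    from duckduckgo_search import ddg  # older releases exposed module-level functions

    def web_search(query: str, max_results: int = 5):
        return ddg(query, max_results=max_results) or []

Pinning the version in requirements.txt is still the simpler fix; the shim just shows how disruptive the rename is without it.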
Acknowledgements
- [X] My issue title is concise, descriptive, and in title casing.
- [X] I have searched the existing issues to make sure this feature has not been requested yet.
- [X] I have provided enough information for the maintainers to understand and evaluate this request.
Chill man, you're not in production!
It unfortunately doesn't make maintaining any easier when things I tested yesterday worked but today are broken haha.
Partly, it's me. I want to run it, and it randomly breaks because of dependencies, not because of non-working code. It takes a lot of time, every single time, to figure out whether the error is in the source or not. Just trying to find a solution. Happy for guidance :)
Maybe the search and read-text functions shouldn't rely on specific web page APIs? Maybe there is a framework or ready-made application that can digest data from web pages without special APIs? That would give you more independence from API providers.
You kind of have to pick your battles when building stuff like this. There is always the possibility that I could write any of the modules that I'm using better than the people who wrote them, but there isn't the possibility of me having the time to do so. Happy to take suggestions of other modules to use. We have several search modules available to use, DuckDuckGo is just the default for the sake of privacy and it being free without requiring an API key. You can also use Google's official API or Searx, but they're just additional setup for API keys that no one wants to really do.
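For anyone weighing that trade-off, here is a rough sketch of what the official route looks like with Google's Custom Search JSON API. GOOGLE_API_KEY and SEARCH_ENGINE_ID are placeholders you would have to provision yourself in the Google Cloud and Programmable Search consoles:

import requests

GOOGLE_API_KEY = "your-api-key"             # placeholder, not a real key
SEARCH_ENGINE_ID = "your-search-engine-id"  # placeholder, not a real engine ID

def google_search(query: str):
    # Query the Custom Search JSON API and return simplified result dicts
    response = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": GOOGLE_API_KEY, "cx": SEARCH_ENGINE_ID, "q": query},
        timeout=10,
    )
    response.raise_for_status()
    return [
        {"title": item["title"], "link": item["link"], "snippet": item.get("snippet", "")}
        for item in response.json().get("items", [])
    ]

for result in google_search("AGiXT"):
    print(result["title"], result["link"])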
I was thinking something like this one ;)
https://stackoverflow.com/questions/1141136/how-can-i-programmatically-perform-a-search-without-using-an-api
To be independent of APIs.
Lots has changed on the internet since 2009; most websites have safeguards against you doing these things now.
Without an API
With Selenium, you can write code to control a web browser and interact with web pages.
This allows you to search for text on a web page, or to fill out forms and submit them.
Another way to programmatically perform a search is to use a screen scraping tool.
Screen scraping tools allow you to extract data from a web page without interacting with the browser.
This can be useful if you need to extract data from a web page that does not have an API.
Finally, you can also use a regular expression to search for text on a web page.
Regular expressions are a powerful tool for searching for text that matches a specific pattern.
--------------------------------------------------
Using Selenium:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.implicitly_wait(5)
driver.get("https://www.google.com")
# Search for "home" (the box is located by name, not id; Selenium 4 removed find_element_by_*)
search_box = driver.find_element(By.NAME, "q")
search_box.send_keys("home")
# Submit the search (more reliable than clicking the button, which the suggestion box can cover)
search_box.submit()
# Get the search results (Google wraps each result in a div with class "g")
results = driver.find_elements(By.CLASS_NAME, "g")
# Print the search results
for result in results:
    print(result.text)
driver.quit()
--------------------------------------------------
Using a screen scraping tool:
import requests
from bs4 import BeautifulSoup

# Request a results page directly (the bare homepage has no results to scrape)
url = "https://www.google.com/search?q=home"
# A browser-like User-Agent makes it less likely the request gets blocked
response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
# Parse the response as HTML
soup = BeautifulSoup(response.content, "html.parser")
# Find the search results (Google wraps each result in a div with class "g")
results = soup.find_all("div", class_="g")
# Print the search results
for result in results:
    print(result.text)
--------------------------------------------------
Using a regular expression:
import re

text = """
This is a text with the word "home" in it.
"""
# Find the word "home" in the text
match = re.search(r"home", text)
# If a match is found, print the matched word
if match:
    print(match.group())
--------------------------------------------------
Using Puppeteer to search DuckDuckGo:
# The original snippet mixed Node's Puppeteer API into Python; pyppeteer is the Python port
import asyncio
from pyppeteer import launch

async def search():
    browser = await launch()
    page = await browser.newPage()
    # The HTML-only endpoint is simpler to scrape than the JavaScript front end
    await page.goto('https://html.duckduckgo.com/html/?q=how+to+programmatically+perform+a+search')
    # Collect the text of every result snippet on the page
    snippets = await page.querySelectorAllEval(
        '.result__snippet', 'nodes => nodes.map(n => n.textContent)')
    for snippet in snippets:
        print(snippet)
    await browser.close()

asyncio.run(search())
--------------------------------------------------
Using BeautifulSoup to search Stack Overflow:
import requests
from bs4 import BeautifulSoup

url = 'https://stackoverflow.com/search?q=how+to+programmatically+perform+a+search'
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(response.content, 'html.parser')
# Note: Stack Overflow's CSS class names change over time, so these selectors may need updating
results = soup.find_all('div', class_='result-card')
for result in results:
    title_link = result.find('a', class_='result-link')
    # Result links are relative, so prepend the site root
    print(title_link.text.strip(), 'https://stackoverflow.com' + title_link['href'])
--------------------------------------------------
I found at least five tools for this.
Maybe it will be helpful...