autogen
Introduce SeleniumBrowser
Key Contributions:
- SeleniumBrowser Function: Adds a headless, fully-functional desktop web driver to enable dynamic interactions with web pages, crucial for accessing content reliant on JavaScript or other client-side scripts.
- SeleniumBrowserWrapper Class: Offers a seamless alternative to the `SimpleTextBrowser`, enhancing agent capabilities in interacting with web pages that require advanced browsing functionalities.
- WebSurferAgent Enhancements: This update enables the ability to select between `SimpleTextBrowser` or `SeleniumBrowserWrapper` based on the provided `browser_config` and opens the door to vision-based function calling and interactions, significantly broadening the scope of tasks and potential use cases for Autogen agents in web interaction scenarios going forward.
- Unit Testing: Extends unit tests to cover the enhanced `WebSurferAgent`, ensuring both functionality and reliability in the agents' operations.
- Notebooks: Added `agentchat_surfer_edge.ipynb` to demonstrate cross compatibility and new graphical functionality. Both notebooks specify `gpt-3.5-turbo` and rely on round-trip generations without follow-ons.
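The selection between the two browsers based on `browser_config` could look roughly like the sketch below. This is not the PR's code: the two classes are empty stand-ins, and the `"type"` key and its `"text"` / `"selenium"` values are assumptions made for illustration, since the description does not say which key drives the choice.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SimpleTextBrowser:
    """Stand-in for the text-only browser (stub for illustration)."""
    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SeleniumBrowserWrapper:
    """Stand-in for the Selenium-backed browser (stub for illustration)."""
    options: Dict[str, Any] = field(default_factory=dict)


# Hypothetical mapping from a "type" value to a browser class.
_BROWSERS = {"text": SimpleTextBrowser, "selenium": SeleniumBrowserWrapper}


def make_browser(browser_config: Dict[str, Any]):
    """Pick a browser class from browser_config; default to the text browser."""
    config = dict(browser_config)  # copy so the caller's dict is not mutated
    kind = config.pop("type", "text")  # "type" is an assumed key name
    try:
        browser_cls = _BROWSERS[kind]
    except KeyError:
        raise ValueError(f"Unknown browser type: {kind!r}") from None
    # Remaining keys are passed through as browser options.
    return browser_cls(options=config)
```

Defaulting to the text browser keeps existing configurations working unchanged, which matches the "non-breaking" label on this PR.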
Benefits
- Broadens the scope of web-based content that can be collected
- Facilitates future development of sophisticated agents capable of complex web-based vision tasks
- Enhances the WebSurferAgent with configurable browser behavior
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checks
- [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. Need assistance telling GitHub workflows that this PR requires the websurfer extras.
- [x] I've made sure all auto checks for `WebArchiverAgent` have passed.
- [x] I've made sure all auto checks for `WebSurferAgent` have passed.
Codecov Report
Attention: Patch coverage is 2.32172%, with 589 lines in your changes missing coverage. Please review.
Project coverage is 41.64%. Comparing base (8ec1c3e) to head (c06f6fd).
Additional details and impacted files
@@ Coverage Diff @@
## main #1733 +/- ##
==========================================
+ Coverage 37.05% 41.64% +4.59%
==========================================
Files 62 63 +1
Lines 6499 7096 +597
Branches 1438 1675 +237
==========================================
+ Hits 2408 2955 +547
- Misses 3898 3905 +7
- Partials 193 236 +43
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 41.61% <2.32%> (+4.56%) | :arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@microsoft-github-policy-service agree
On Mon, Feb 19, 2024 at 10:24 PM Codecov Comments Bot < @.***> wrote:
Attention: 396 lines in your changes are missing coverage. Please review.
Comparison is base (2750391) 39.33% compared to head (3954412) 21.53%.
| Files | Patch % | Lines |
|---|---|---|
| autogen/agentchat/contrib/content_agent.py | 0.00% | 208 Missing and 3 partials ⚠️ |
| autogen/browser_utils.py | 4.76% | 179 Missing and 1 partial ⚠️ |
| autogen/agentchat/contrib/web_surfer.py | 44.44% | 5 Missing ⚠️ |
Additional details and impacted files
@@ Coverage Diff @@
## main #1733 +/- ##
===========================================
- Coverage 39.33% 21.53% -17.81%
===========================================
Files 57 58 +1
Lines 6096 6502 +406
Branches 1365 1564 +199
===========================================
- Hits 2398 1400 -998
- Misses 3502 4932 +1430
+ Partials 196 170 -26
| Flag | Coverage Δ |
|---|---|
| unittests | 21.53% <3.17%> (-17.81%) ⬇️ |
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. 📢 Have feedback on the report? Share it here.
This PR expands the websurfer extras requirements to include selenium. I've also updated the workflow for python-package.yml to include the webdrivers for Edge, Firefox, and Chrome.
All pre-commit tests in the dev docker instance pass, with one exception due to no-commit-to-branch. I realized after the fact that committing to a separate branch would have been preferred, and I was able to commit using `SKIP=no-commit-to-branch git commit ...`.
Finally, regarding the build workflow, there are two pending issues:
- It is still unclear how to inform the system that it should install the `websurfer` extra requirements for this PR.
- We're getting numerous `PytestUnknownMarkWarning: Unknown pytest.mark.asyncio` warnings; however, that mark is used throughout, and I wonder if it's an issue with the installed version of pytest.

As a side note, in the dev docker I had to upgrade pydantic to 2.5.3 when it couldn't find `field_validator` in a file unrelated to this PR.
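For context on the `PytestUnknownMarkWarning`: pytest emits it for any mark that no plugin or configuration has registered, which for `asyncio` usually points to the `pytest-asyncio` plugin not being installed in that environment rather than to the pytest version. A minimal sketch of registering the mark by hand in a `conftest.py` is shown below; whether this repository needs it is an assumption, and registering the mark only silences the warning, it does not make async tests run.

```python
# conftest.py (sketch; not part of this PR)


def pytest_configure(config):
    """Register the 'asyncio' mark so pytest no longer reports it as unknown."""
    config.addinivalue_line(
        "markers", "asyncio: mark a test as an asyncio coroutine test"
    )
```

Installing `pytest-asyncio` in the workflow is the more complete fix, since the plugin both registers the mark and executes the coroutine tests.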
EDIT: In the process of building a notebook to demonstrate the elements in this PR and will commit. This has illustrated the need for a few fixes including missing dependencies such as arxiv and requests. Will update as soon as possible.
Updates
- Updated workflows to include the browsers:
  - `contrib-openai.yml` and `contrib-tests.yml`
- Updated `setup.py` so that `websurfer` includes `selenium`, `requests`, and `arxiv` packages
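As a rough illustration of the `setup.py` change, the fragment below shows only the three packages named in this update. Any packages the `websurfer` extra already listed before this PR are not mentioned in the thread and are deliberately left out, so this is a sketch of the addition, not the full entry.

```python
# Packages this PR adds to the "websurfer" extra, per the update above.
websurfer_additions = ["selenium", "requests", "arxiv"]

# Sketch of the relevant setup.py fragment. Entries the extra contained
# before this PR are not listed in the thread and are omitted here.
extras_require = {
    "websurfer": [*websurfer_additions],
}
```

With an entry like this, installing the package with the `[websurfer]` extra pulls in the browser dependencies, which is also what the CI workflows would need to do for the new tests.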
Improved
- Added additional typing and docstrings to `browser_utils.py`
- Updates to `SeleniumWrapper` ensuring 100% coverage and functionality seen in `SimpleTextBrowser` and `WebSurferAgent`
Additional Checks
- Confirmed functionality with Bing API
Notebooks
Two new notebooks for our generous reviewers; they may be discarded to avoid repo clutter if preferred.
- `notebook/agentchat_content_agent.ipynb` - demonstrating core functionality
- `notebook/agentchat_surfer_edge.ipynb` - demonstrating cross compatibility as well as new graphical functions
- Note: `gpt-3.5-turbo` is specified, and calls have 8-token generation limits used for classification of relevant content
Unexpected addition, also can discard
- pre-commit updated the unrelated notebook `notebook/agentchat_lmm_gpt-4v.ipynb` and added `notebook/agentchat_custom_model.ipynb`, to my surprise; the changes are the removal or addition of empty line(s)
Workflow Builds PASSED
Thank you for the PR! In my opinion, it would be good to include some code examples in the notebook to demonstrate the advantages of the ContentAgent. From what I understand, the ContentAgent is a method for aggregating web information onto disk. However, I’m not sure whether the original assistant and userproxy agents could achieve the same goal through a meticulously designed prompt.
> I’m not sure whether the original assistant and userproxy agents could achieve the same goal through a meticulously designed prompt.
Yes, I think you are correct that it could conceptually be handled strictly using existing agents and prompts. I did it this way for two reasons. First, this agent works as part of a larger pipeline that will require the archived content. Second, there is the energy conservation perspective. Hopefully with recent developments it won't be a concern, but for now we ask whether it's responsible to take the tractor (LLM inference) to pick up more cat food, or whether we should just ride our bike (a scripted series of functions to archive the components). That said, an LLM, when done right, could potentially handle the corner cases.
The WebSurfer upgrades are a byproduct from the requirements of the WebArchivalAgent (thanks @skzhang1), which is a dependency of my next PR. I could technically remove the Content / WebArchivalAgent components from this PR. I'm open to all ideas, especially what is best for Autogen as a whole.
Made some last updates based on the generous feedback from @skzhang1. The only two other things I can think to include would be an update to the docs for the graphical WebSurferAgent and additional tests to test/test_browser_utils.py.
@sonichi thank you for running the OAI tests. I noticed test_web_surfer.py failed with a NameError about WebSurferAgent being undefined. That test file is unchanged in this PR, but the agent has small changes. Looking into it we do see the agent class imported here with a catch to disable all tests in the case of failure. Your thoughts?
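One way a guarded import can still end in a `NameError` is sketched below. It is an illustration of the general pattern, not the contents of `test_web_surfer.py`: the module name is hypothetical and chosen so the import fails, and `skip_all`, `guarded_test`, `unguarded_test`, and `run` are names invented for this example.

```python
# Guarded import: if the optional dependency is missing, record that fact
# instead of failing at collection time. The module name is hypothetical
# and chosen so that the import fails on purpose.
try:
    from hypothetical_websurfer_extra_not_installed import WebSurferAgent
except ImportError:
    skip_all = True
else:
    skip_all = False


def guarded_test() -> str:
    """Checks the flag first, so the missing name is never touched."""
    if skip_all:
        return "skipped"
    return WebSurferAgent.__name__


def unguarded_test() -> str:
    """Forgets the flag; fails with NameError when the import did not bind."""
    return WebSurferAgent.__name__


def run(test) -> str:
    """Run a test function and report the outcome as a short string."""
    try:
        return test()
    except NameError as exc:
        return f"NameError: {exc}"
```

If the import fails because a newly required package (for example one of the added extras) is absent, any test whose skip condition does not include the import flag would hit the unbound name in this way.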
I'd like to get @afourney 's opinion on the structure of the changes, e.g., whether it's ok to directly modify the current web surfer agent or should a separate agent be created. And whether a separate optional dependency is desired. My other comments may not apply if a structural change is needed.
Hi @afourney, I saw in an issue that you were thinking of passing a browser object to the agent. Would you prefer that to the method I have configured? I could expose an option to provide a browser object while maintaining the current behavior, where the browser type is specified in the config, to keep things simple for the average end user. Your thoughts?
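The two construction paths discussed here, an injected browser object versus a config-driven default, could coexist along the lines of the sketch below. `WebSurferSketch` and its `browser` parameter are hypothetical names for illustration, the browser class is a stub, and rejecting both arguments together is just one possible design choice.

```python
from typing import Any, Dict, Optional


class SimpleTextBrowser:
    """Stub for illustration only."""

    def __init__(self, **options: Any) -> None:
        self.options = options


class WebSurferSketch:
    """Hypothetical agent shell showing the two construction paths."""

    def __init__(
        self,
        browser: Optional[Any] = None,
        browser_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        if browser is not None and browser_config is not None:
            raise ValueError("Pass either browser or browser_config, not both")
        if browser is not None:
            # Advanced path: the caller supplies a ready-made browser object.
            self.browser = browser
        else:
            # Simple path: build the default browser from the config dict.
            self.browser = SimpleTextBrowser(**(browser_config or {}))
```

Usage would be `WebSurferSketch(browser=my_browser)` for callers who manage their own browser, and `WebSurferSketch(browser_config={...})` for everyone else.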
⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.
Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.
Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard. Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.
🔎 Detected hardcoded secret in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 10404662 | Triggered | Generic CLI Secret | 841ed315e2f19d79a6b86ed587eb6e0fc4a0c0da | .github/workflows/dotnet-release.yml | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
@signalprime I would like you to take a look at my PR. Maybe we can adopt one design, as I’m working on a similar part.
This PR is against AutoGen 0.2. AutoGen 0.2 has been moved to the 0.2 branch. Please rebase your PR on the 0.2 branch or update it to work with the new AutoGen 0.4 that is now in main.
@signalprime unfortunately after rebase there are still some open conflicts. If you are still interested in bringing this one forward please see if you can get those resolved and green the latest CI.
Closing as stale; please reopen if you would like to update.