Introduce SeleniumBrowser

Open signalprime opened this issue 1 year ago • 10 comments

Key Contributions:

  • SeleniumBrowser Function: Adds a headless, fully-functional desktop web driver to enable dynamic interactions with web pages, crucial for accessing content reliant on JavaScript or other client-side scripts.

  • SeleniumBrowserWrapper Class: Offers a seamless alternative to the SimpleTextBrowser, enhancing agent capabilities in interacting with web pages that require advanced browsing functionalities.

  • WebSurferAgent Enhancements: The agent can now select between SimpleTextBrowser and SeleniumBrowserWrapper based on the provided browser_config. This opens the door to vision-based function calling and interactions, broadening the range of tasks and use cases for Autogen agents in web interaction scenarios.

  • Unit Testing: Extends unit tests to cover the enhanced WebSurferAgent, ensuring both functionality and reliability in the agents' operations.

  • Notebooks: Added agentchat_surfer_edge.ipynb to demonstrate cross compatibility and new graphical functionality. Both notebooks specify gpt-3.5-turbo and rely on round-trip generations without follow-ons.
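
As an illustration of the config-driven selection described for WebSurferAgent, here is a minimal sketch. It is not the code in this PR: the two classes are stand-ins, and the "type" key, the BROWSER_TYPES mapping, and make_browser are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SimpleTextBrowser:
    """Stand-in for the text-only browser (not the real class)."""

    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SeleniumBrowserWrapper:
    """Stand-in for the Selenium-backed browser (not the real class)."""

    options: Dict[str, Any] = field(default_factory=dict)


# Hypothetical mapping from a "type" value in browser_config to a browser class.
BROWSER_TYPES = {"text": SimpleTextBrowser, "selenium": SeleniumBrowserWrapper}


def make_browser(browser_config: Dict[str, Any]):
    """Build a browser from browser_config, defaulting to the text browser."""
    config = dict(browser_config)  # copy so the caller's dict is not mutated
    kind = config.pop("type", "text")
    try:
        browser_cls = BROWSER_TYPES[kind]
    except KeyError:
        raise ValueError(f"Unknown browser type: {kind!r}") from None
    # Remaining keys are passed through as browser options.
    return browser_cls(options=config)
```

Defaulting to the text browser when no type is given would keep existing configurations working unchanged, which matches the non-breaking intent stated below.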

Benefits

  • Broadens the scope of web-based content that can be collected
  • Facilitates future development of sophisticated agents capable of complex web-based vision tasks
  • Enhances the WebSurferAgent with configurable browser behavior

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checks

  • [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
  • [x] I've added tests (if relevant) corresponding to the changes introduced in this PR. need assistance to tell GitHub workflows that this PR requires websurfer extras
  • [x] I've made sure all auto checks for WebArchiverAgent have passed.
  • [x] I've made sure all auto checks for WebSurferAgent have passed.

signalprime avatar Feb 20 '24 04:02 signalprime

Codecov Report

Attention: Patch coverage is 2.32172%, with 589 lines in your changes missing coverage. Please review.

Project coverage is 41.64%. Comparing base (8ec1c3e) to head (c06f6fd).

Files                                            Patch %  Lines
autogen/browser_utils.py                         2.90%    333 Missing and 1 partial :warning:
autogen/agentchat/contrib/web_archiver_agent.py  0.00%    215 Missing and 3 partials :warning:
autogen/agentchat/contrib/web_surfer.py          9.75%    37 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1733      +/-   ##
==========================================
+ Coverage   37.05%   41.64%   +4.59%     
==========================================
  Files          62       63       +1     
  Lines        6499     7096     +597     
  Branches     1438     1675     +237     
==========================================
+ Hits         2408     2955     +547     
- Misses       3898     3905       +7     
- Partials      193      236      +43     
Flag Coverage Δ
unittests 41.61% <2.32%> (+4.56%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Feb 20 '24 04:02 codecov-commenter

@microsoft-github-policy-service agree

signalprime avatar Feb 20 '24 04:02 signalprime

This PR expands the websurfer extras requirements to include selenium. I've also updated the workflow for python-package.yml to include the webdrivers for Edge, Firefox, and Chrome.

All pre-commit tests in the dev docker instance pass, with one exception due to no-commit-to-branch. I realized after the fact that committing to a separate branch would have been preferred, and was able to commit using SKIP=no-commit-to-branch git commit ....

Finally, regarding the build workflow, there are two pending issues:

  1. It is still unclear how to inform the system that it should install the websurfer extra requirements for this PR
  2. We're getting numerous PytestUnknownMarkWarning: Unknown pytest.mark.asyncio warnings; however, that marker is used throughout, and I wonder if it's an issue with the installed version of pytest. As a side note, in the dev docker I had to upgrade pydantic to 2.5.3 when it couldn't find field_validator in a file unrelated to this PR
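
One thing worth checking for the second item (a possible cause, not a confirmed diagnosis) is whether the pytest-asyncio plugin is present in the workflow environment, since that plugin is what registers the asyncio marker. Similarly, field_validator belongs to the pydantic 2.x API, so an older pydantic would explain the import failure. The helper names below are hypothetical.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

Calling is_installed("pytest_asyncio") and installed_version("pydantic") inside the failing environment would show whether either of these explanations applies.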

EDIT: I am in the process of building a notebook to demonstrate the elements in this PR and will commit it. This has revealed the need for a few fixes, including missing dependencies such as arxiv and requests. Will update as soon as possible.

signalprime avatar Feb 20 '24 09:02 signalprime

Updates

  • Updated workflows to include the browsers: contrib-openai.yml and contrib-tests.yml
  • Updated setup.py so that websurfer includes selenium, requests, and arxiv packages
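
For readers unfamiliar with extras, the setup.py change has roughly the following shape. This is only a sketch: it lists just the three packages named above, and any entries the websurfer extra already had are omitted.

```python
# Sketch of an extras_require mapping as it could appear in setup.py.
# Only the packages named in this comment are shown; existing entries are omitted.
extras_require = {
    "websurfer": ["selenium", "requests", "arxiv"],
}
```

With an entry like this, a local checkout can pull in the optional dependencies with pip install -e ".[websurfer]".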

Improved

  • Added additional typing and docstrings to browser_utils.py
  • Updated SeleniumWrapper so that it covers 100% of the functionality seen in SimpleTextBrowser and WebSurferAgent

Additional Checks

  • Confirmed functionality with Bing API

Notebooks

Two new notebooks for our generous reviewers; these may be discarded to avoid repo clutter if preferred

  • notebook/agentchat_content_agent.ipynb - demonstrating core functionality
  • notebook/agentchat_surfer_edge.ipynb - demonstrating cross compatibility as well as new graphical functions
  • Note: gpt-3.5-turbo is specified, and the calls used for classifying relevant content are limited to 8 generated tokens

Unexpected addition, also can discard

  • To my surprise, pre-commit updated the unrelated notebook notebook/agentchat_lmm_gpt-4v.ipynb and added notebook/agentchat_custom_model.ipynb; the changes are only the removal or addition of empty lines

Workflow Builds PASSED

signalprime avatar Feb 22 '24 06:02 signalprime

Thank you for the PR! In my opinion, it would be good to include some code examples in the notebook to demonstrate the advantages of the ContentAgent. From what I understand, the ContentAgent is a method for aggregating web information onto disk. However, I’m not sure whether the original assistant and userproxy agents could achieve the same goal through a meticulously designed prompt.

skzhang1 avatar Feb 24 '24 22:02 skzhang1

I’m not sure whether the original assistant and userproxy agents could achieve the same goal through a meticulously designed prompt.

Yes, I think you are correct that it could conceptually be handled strictly using existing agents and prompts. I did it this way for two reasons. The first is that this agent works as part of a larger pipeline that will require the archived content, and the second is energy conservation. Hopefully with recent developments it won't be a concern, but for now we ask whether it's responsible to take the tractor (LLM inference) to pick up more cat food or just ride our bike (a scripted series of functions to archive the components). That said, an LLM, when done right, could potentially handle the corner cases.

The WebSurfer upgrades are a byproduct from the requirements of the WebArchivalAgent (thanks @skzhang1), which is a dependency of my next PR. I could technically remove the Content / WebArchivalAgent components from this PR. I'm open to all ideas, especially what is best for Autogen as a whole.

signalprime avatar Feb 24 '24 23:02 signalprime

Made some last updates based on the generous feedback from @skzhang1. The only two other things I can think to include would be an update to the docs for the graphical WebSurferAgent and additional tests to test/test_browser_utils.py.

@sonichi thank you for running the OAI tests. I noticed test_web_surfer.py failed with a NameError about WebSurferAgent being undefined. That test file is unchanged in this PR, but the agent has small changes. Looking into it, we see the agent class imported here with a catch that disables all tests if the import fails. Your thoughts?
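
The failure mode can be reproduced in isolation. The sketch below shows the general guarded-import pattern, not the contents of test_web_surfer.py; the module name missing_pkg_for_demo, the skip_all flag, and build_agent are hypothetical.

```python
import importlib

try:
    # Hypothetical module name, chosen so that the import fails in this sketch.
    WebSurferAgent = importlib.import_module("missing_pkg_for_demo").WebSurferAgent
except ImportError:
    # The name WebSurferAgent is never bound when the import fails.
    skip_all = True
else:
    skip_all = False


def build_agent():
    """Any code path not gated by skip_all hits a NameError here."""
    return WebSurferAgent(name="web_surfer")
```

If the real test file follows this pattern, a NameError would suggest that the import itself failed (for example because of a missing optional dependency) and that at least one test is not covered by the skip condition.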

signalprime avatar Feb 25 '24 21:02 signalprime

I'd like to get @afourney 's opinion on the structure of the changes, e.g., whether it's ok to directly modify the current web surfer agent or should a separate agent be created. And whether a separate optional dependency is desired. My other comments may not apply if a structural change is needed.

Hi @afourney, I saw in an issue that you were thinking of passing a browser object to the agent. Would you prefer that to the method I have configured? I could expose an option to provide a browser object while maintaining the current behavior, where the browser type is specified in the config, to keep things simple for the average end user. Your thoughts?
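
A rough sketch of how both options could coexist, assuming a minimal visit_page interface. Browser, TextBrowser, and SurferSketch are hypothetical stand-ins and not the classes in this PR.

```python
from typing import Any, Dict, Optional, Protocol


class Browser(Protocol):
    """Minimal interface assumed for this sketch."""

    def visit_page(self, url: str) -> str: ...


class TextBrowser:
    """Stand-in default browser; not the real SimpleTextBrowser."""

    def __init__(self, **options: Any) -> None:
        self.options = options

    def visit_page(self, url: str) -> str:
        return f"text view of {url}"


class SurferSketch:
    """Hypothetical agent shell accepting either a browser object or a config."""

    def __init__(
        self,
        browser: Optional[Browser] = None,
        browser_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        if browser is not None and browser_config is not None:
            raise ValueError("Pass either browser or browser_config, not both")
        if browser is not None:
            # An injected object is used as-is.
            self.browser: Browser = browser
        else:
            # Otherwise the default browser is built from the config.
            self.browser = TextBrowser(**(browser_config or {}))
```

Rejecting the case where both arguments are given avoids guessing which one the caller intended.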

signalprime avatar Feb 29 '24 05:02 signalprime

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard. Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secret in your pull request
GitGuardian id  GitGuardian status  Secret              Commit                                    Filename
10404662        Triggered           Generic CLI Secret  841ed315e2f19d79a6b86ed587eb6e0fc4a0c0da  .github/workflows/dotnet-release.yml
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

gitguardian[bot] avatar Jul 20 '24 21:07 gitguardian[bot]

@signalprime I would like you to take a look at my PR. Maybe we can adopt one design, as I'm working on a similar part.

MohammedNagdy avatar Aug 01 '24 11:08 MohammedNagdy

This PR is against AutoGen 0.2. AutoGen 0.2 has been moved to the 0.2 branch. Please rebase your PR on the 0.2 branch or update it to work with the new AutoGen 0.4 that is now in main.

rysweet avatar Oct 10 '24 21:10 rysweet

@signalprime unfortunately after rebase there are still some open conflicts. If you are still interested in bringing this one forward please see if you can get those resolved and green the latest CI.

rysweet avatar Oct 11 '24 22:10 rysweet

closing as stale, please reopen if you would like to update

rysweet avatar Oct 18 '24 18:10 rysweet