composio icon indicating copy to clipboard operation
composio copied to clipboard

Fix SWE accuracy

Open kaavee315 opened this issue 1 year ago • 3 comments

PR Type

Enhancement, Bug fix, Documentation


Description

  • Fixed indexing issue and improved methods in file manager.
  • Enhanced directory change handling with additional checks.
  • Clarified and validated file edit requests.
  • Simplified empty patch response handling.
  • Fixed total lines calculation in file open response.
  • Added unique ID to scroll requests.
  • Simplified agent tool initialization.
  • Fixed syntax issue in benchmark function.
  • Corrected log message typo.
  • Updated log directory path.
  • Enhanced workspace test with logging and assertions.
  • Added installation of composio-core with all extras in dev mode.

Changes walkthrough 📝

Relevant files
Enhancement
8 files
file.py
Improve file handling and linting in file manager               

python/composio/tools/env/filemanager/file.py

  • Fixed indexing issue in read method.
  • Improved total_lines method to use file context manager.
  • Enhanced edit method to handle text replacement and linting.
  • Added detailed lint error parsing and logging.
  • +32/-15 
    manager.py
    Enhance directory change handling in file manager               

    python/composio/tools/env/filemanager/manager.py

  • Enhanced chdir method to handle various path types.
  • Added permission checks for directory navigation.
  • +13/-1   
    chwdir.py
    Improve directory change request handling                               

    python/composio/tools/local/filetool/actions/chwdir.py

  • Updated ChwdirRequest to include detailed path description.
  • Added OSError handling in execute_on_file_manager.
  • +5/-1     
    edit.py
    Clarify and validate file edit requests                                   

    python/composio/tools/local/filetool/actions/edit.py

  • Updated EditFileRequest to clarify line inclusivity.
  • Added file existence check in execute_on_file_manager.
  • +4/-2     
    git_patch.py
    Simplify empty patch response handling                                     

    python/composio/tools/local/filetool/actions/git_patch.py

    • Simplified empty patch response.
    +1/-1     
    scroll.py
    Add unique ID to scroll requests                                                 

    python/composio/tools/local/filetool/actions/scroll.py

    • Added scroll_id to ScrollRequest for unique identification.
    +6/-0     
    agent.py
    Simplify agent tool initialization                                             

    python/swe/examples/crewai_agent/agent.py

  • Removed calculate_operation action.
  • Simplified tool initialization.
  • +6/-32   
    utils.py
    Update log directory path                                                               

    python/swe/swekit/benchmark/utils.py

    • Updated log directory path in main function.
    +1/-1     
    Bug fix
    3 files
    open.py
    Fix total lines calculation in file open response               

    python/composio/tools/local/filetool/actions/open.py

    • Corrected total_lines calculation in OpenFileResponse.
    +1/-1     
    benchmark.py
    Fix syntax issue in benchmark function                                     

    python/swe/examples/crewai_agent/benchmark.py

    • Fixed syntax issue in bench function.
    +1/-1     
    run_evaluation.py
    Fix typo in log message                                                                   

    python/swe/swekit/benchmark/run_evaluation.py

    • Corrected log message typo.
    +1/-1     
    Tests
    1 files
    test_workspace.py
    Enhance workspace test with logging and assertions             

    python/tests/test_tools/test_local/test_workspace.py

  • Unskipped test_workspace.
  • Added logging and assertions for file edit action.
  • +11/-2   
    Configuration changes
    2 files
    entrypoint.sh
    Install composio-core with extras in dev mode                       

    python/dockerfiles/entrypoint.sh

    • Added installation of composio-core with all extras in dev mode.
    +1/-1     
    Dockerfile.dev
    Install composio with extras from source                                 

    python/dockerfiles/Dockerfile.dev

    • Added installation of composio with all extras.
    +1/-1     

    💡 PR-Agent usage: Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    kaavee315 avatar Jul 24 '24 13:07 kaavee315

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Possible Bug
    The modification in the edit method to always append a newline to the text might introduce unwanted newlines in scenarios where the text should not end with a newline. This could affect file formatting and data integrity.

    Performance Issue
    The edit method reads the entire file into memory which could lead to performance issues with very large files. Consider processing the file in chunks or using more efficient file manipulation techniques.

    Security Concern
    The method chdir now includes logic to resolve paths which might inadvertently allow directory traversal if not properly sanitized, especially since it interprets relative paths and parent directory navigations.

    Redundant Code
    The execute_on_file_manager method in chwdir.py has multiple return statements for similar exceptions which could be streamlined into a single exception handler to reduce redundancy and improve maintainability.

    qodo-code-review[bot] avatar Jul 24 '24 13:07 qodo-code-review[bot]

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Validate that end_line is greater than or equal to start_line to avoid negative range edits

    Add a check to ensure that the end_line is not less than the start_line in the edit
    request, which could lead to negative or zero-length edits and potential runtime
    errors.

    python/composio/tools/local/filetool/actions/edit.py [98-99]

     if file is None:
         raise FileNotFoundError(f"File not found: {request_data.file_path}")
    +if request_data.end_line < request_data.start_line:
    +    raise ValueError("end_line must be greater than or equal to start_line")
     
    
    • [ ] Apply this suggestion
    Suggestion importance[1-10]: 10

    Why: This suggestion addresses a potential runtime error by ensuring that the end_line is not less than the start_line, which is crucial for preventing negative or zero-length edits.

    10
    Prevent adding unnecessary newlines to the text

    Replace the direct addition of a newline character to the text variable with a more
    robust check to ensure that text does not already end with a newline. This prevents
    adding unnecessary newlines which could lead to formatting issues in the file.

    python/composio/tools/env/filemanager/file.py [245]

    -text = text + "\n"
    +if not text.endswith('\n'):
    +    text += '\n'
     
    
    • [ ] Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: This suggestion prevents adding unnecessary newlines, which can lead to formatting issues. It is a good practice to ensure that text does not already end with a newline before appending one.

    8
    Possible bug
    Improve error handling in the lint error parsing to prevent index errors

    Modify the error handling in the parse_lint_error function to ensure that it
    correctly handles cases where the error string does not contain enough parts to
    split. This will prevent potential index errors.

    python/composio/tools/env/filemanager/file.py [339-342]

    -error_code = parts[3].split()[0]
    -error_message = ":".join(parts[3:]).strip()
    +if len(parts) > 3:
    +    error_code = parts[3].split()[0]
    +    error_message = ":".join(parts[3:]).strip()
    +else:
    +    error_code = ""
    +    error_message = "Unknown error"
     
    
    • [ ] Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: This suggestion improves the robustness of the error handling by ensuring that the code does not attempt to access parts of the error string that do not exist, thus preventing potential index errors.

    9
    Enhancement
    Ensure the directory is not only present but also accessible

    Add a check to ensure that new_dir is not only a directory but also accessible
    before attempting operations. This adds an additional layer of validation for
    directory accessibility.

    python/composio/tools/env/filemanager/manager.py [105-106]

    -if not new_dir.is_dir():
    -    raise FileNotFoundError(f"'{new_dir}' is not a valid directory.")
    +if not new_dir.is_dir() or not os.access(new_dir, os.R_OK):
    +    raise FileNotFoundError(f"'{new_dir}' is not a valid directory or is not accessible.")
     
    
    • [ ] Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: This suggestion adds an additional layer of validation to ensure that the directory is accessible, which enhances the robustness of the code.

    7

    qodo-code-review[bot] avatar Jul 24 '24 13:07 qodo-code-review[bot]

    CI Failure Feedback 🧐

    (Checks updated until commit https://github.com/ComposioHQ/composio/commit/11ca5e35e18a9bd7e68835c6901d09f17986bb06)

    Action: test (ubuntu-latest, 3.9)

    Failed stage: Unittests [❌]

    Failed test name: tests/test_tools/test_local/test_workspace.py

    Failure summary:

    The action failed because there was an error during the collection of tests.

  • The specific error occurred in the file tests/test_tools/test_local/test_workspace.py.
  • The error was related to a deprecation warning in the paramiko and pydantic libraries.
  • The TripleDES algorithm in paramiko is deprecated and will be removed in a future version.
  • The config class in pydantic is deprecated and should be replaced with ConfigDict.
  • Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    503:  * [new branch]        fix/rag-agent                            -> origin/fix/rag-agent
    504:  * [new branch]        fix/readme                               -> origin/fix/readme
    505:  * [new branch]        fix/readme-logo                          -> origin/fix/readme-logo
    506:  * [new branch]        fix/swe-agent                            -> origin/fix/swe-agent
    507:  * [new branch]        ft-add-better-help-text                  -> origin/ft-add-better-help-text
    508:  * [new branch]        ft-apps-id                               -> origin/ft-apps-id
    509:  * [new branch]        ft-bring-back-core-sdk                   -> origin/ft-bring-back-core-sdk
    510:  * [new branch]        ft-did-you-mean                          -> origin/ft-did-you-mean
    511:  * [new branch]        ft-error-tracking                        -> origin/ft-error-tracking
    ...
    
    911:  ⚠️ Actions does not require update
    912:  ⚠️ Triggers does not require update
    913:  unittests: commands[1]> pytest -vvv -rfE --doctest-modules composio/ tests/ --cov=composio --cov=examples --cov-report=html --cov-report=xml --cov-report=term --cov-report=term-missing --cov-config=.coveragerc
    914:  ============================= test session starts ==============================
    915:  platform linux -- Python 3.9.19, pytest-7.4.2, pluggy-1.5.0 -- /home/runner/work/composio/composio/python/.tox/unittests/bin/python
    916:  cachedir: .tox/unittests/.pytest_cache
    917:  rootdir: /home/runner/work/composio/composio/python
    918:  plugins: codecov-0.5.1, anyio-4.4.0, cov-5.0.0
    919:  collecting ... collected 44 items / 1 error / 1 skipped
    920:  ==================================== ERRORS ====================================
    921:  ________ ERROR collecting tests/test_tools/test_local/test_workspace.py ________
    ...
    
    927:  .tox/unittests/lib/python3.9/site-packages/paramiko/pkey.py:100
    928:  /home/runner/work/composio/composio/python/.tox/unittests/lib/python3.9/site-packages/paramiko/pkey.py:100: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
    929:  "cipher": algorithms.TripleDES,
    930:  .tox/unittests/lib/python3.9/site-packages/paramiko/transport.py:259
    931:  /home/runner/work/composio/composio/python/.tox/unittests/lib/python3.9/site-packages/paramiko/transport.py:259: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from this module in 48.0.0.
    932:  "class": algorithms.TripleDES,
    933:  .tox/unittests/lib/python3.9/site-packages/pydantic/_internal/_config.py:291
    934:  .tox/unittests/lib/python3.9/site-packages/pydantic/_internal/_config.py:291
    935:  /home/runner/work/composio/composio/python/.tox/unittests/lib/python3.9/site-packages/pydantic/_internal/_config.py:291: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.8/migration/
    ...
    
    1145:  examples/Podcast_summarizer_Agents/Tools/composio_slack.py                           4      4     0%   1-5
    1146:  examples/Podcast_summarizer_Agents/__init__.py                                       0      0   100%
    1147:  examples/Podcast_summarizer_Agents/main.py                                          15     15     0%   1-21
    1148:  --------------------------------------------------------------------------------------------------------------
    1149:  TOTAL                                                                            10748   3588    67%
    1150:  Coverage HTML written to dir htmlcov
    1151:  Coverage XML written to file coverage.xml
    1152:  =========================== short test summary info ============================
    1153:  ERROR tests/test_tools/test_local/test_workspace.py
    1154:  !!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
    1155:  =================== 1 skipped, 5 warnings, 1 error in 7.24s ====================
    1156:  unittests: exit 2 (11.71 seconds) /home/runner/work/composio/composio/python> pytest -vvv -rfE --doctest-modules composio/ tests/ --cov=composio --cov=examples --cov-report=html --cov-report=xml --cov-report=term --cov-report=term-missing --cov-config=.coveragerc pid=5558
    1157:  .pkg: _exit> python /opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
    1158:  unittests: FAIL code 2 (41.50=setup[24.43]+cmd[5.36,11.71] seconds)
    1159:  evaluation failed :( (42.17 seconds)
    1160:  ##[error]Process completed with exit code 2.
    
    

    ✨ CI feedback usage guide:

    The CI feedback tool (/checks) automatically triggers when a PR has a failed check. The tool analyzes the failed checks and provides several feedbacks:

    • Failed stage
    • Failed test name
    • Failure summary
    • Relevant error logs

    In addition to being automatically triggered, the tool can also be invoked manually by commenting on a PR:

    /checks "https://github.com/{repo_name}/actions/runs/{run_number}/job/{job_number}"
    

    where {repo_name} is the name of the repository, {run_number} is the run number of the failed check, and {job_number} is the job number of the failed check.

    Configuration options

    • enable_auto_checks_feedback - if set to true, the tool will automatically provide feedback when a check is failed. Default is true.
    • excluded_checks_list - a list of checks to exclude from the feedback, for example: ["check1", "check2"]. Default is an empty list.
    • enable_help_text - if set to true, the tool will provide a help message with the feedback. Default is true.
    • persistent_comment - if set to true, the tool will overwrite a previous checks comment with the new feedback. Default is true.
    • final_update_message - if persistent_comment is true and updating a previous checks message, the tool will also create a new message: "Persistent checks updated to latest commit". Default is true.

    See more information about the checks tool in the docs.

    qodo-code-review[bot] avatar Jul 24 '24 13:07 qodo-code-review[bot]