autogen Implement User Defined Functions for Local CLI Executor

Why are these changes needed?

User-defined functions let users write functions in Python and make them available to a code execution environment. The intent is that AutoGen Studio's skills can ultimately build on this feature, so that we get on the same page about enhancing code execution with user-defined functions.

TL;DR

import os
from pathlib import Path

from autogen.coding import LocalCommandLineCodeExecutor
from autogen.coding.func_with_reqs import with_requirements
from autogen import ConversableAgent

import pandas

@with_requirements(python_packages=["pandas"], global_imports=["pandas"])
def load_data() -> pandas.DataFrame:
    """Load some sample data.

    Returns:
        pandas.DataFrame: A DataFrame with the following columns: name(str), location(str), age(int)
    """
    data = {
        "name": ["John", "Anna", "Peter", "Linda"],
        "location": ["New York", "Paris", "Berlin", "London"],
        "age": [24, 13, 53, 33],
    }
    return pandas.DataFrame(data)

work_dir = Path("coding")
work_dir.mkdir(exist_ok=True)

executor = LocalCommandLineCodeExecutor(work_dir=work_dir, functions=[load_data])

nlnl = "\n\n"
code_writer_system_message = """
You have been given coding capability to solve tasks using Python code.
In the following cases, suggest python code (in a python coding block) or shell script (in a sh coding block) for the user to execute.
    1. When you need to collect info, use the code to output the info you need, for example, browse or search the web, download/read a file, print the content of a webpage or a file, get the current date/time, check the operating system. After sufficient info is printed and the task is ready to be solved based on your language skill, you can solve the task by yourself.
    2. When you need to perform some task with code, use the code to perform the task and output the result. Finish the task smartly.
Solve the task step by step if you need to. If a plan is not provided, explain your plan first. Be clear which step uses code, and which step uses your language skill.
When using code, you must indicate the script type in the code block. The user cannot provide any other feedback or perform any other action beyond executing the code you suggest. The user can't modify your code. So do not suggest incomplete code which requires users to modify. Don't use a code block if it's not intended to be executed by the user.
If you want the user to save the code in a file before executing it, put # filename: <filename> inside the code block as the first line. Don't include multiple code blocks in one response. Do not ask users to copy and paste the result. Instead, use 'print' function for the output when relevant. Check the execution result returned by the user.
"""

# Add on the new functions
code_writer_system_message += executor.format_functions_for_prompt()

code_writer_agent = ConversableAgent(
    "code_writer",
    system_message=code_writer_system_message,
    llm_config={"config_list": [{"model": "gpt-4", "api_key": os.environ["OPENAI_API_KEY"]}]},
    code_execution_config=False,  # Turn off code execution for this agent.
    max_consecutive_auto_reply=2,
    human_input_mode="NEVER",
)

code_executor_agent = ConversableAgent(
    name="code_executor_agent",
    llm_config=False,
    code_execution_config={
        "executor": executor,
    },
    human_input_mode="NEVER",
)

chat_result = code_executor_agent.initiate_chat(
    code_writer_agent,
    message="Please use the load_data function to load the data and please calculate the average age of all people."
)

code_executor_agent (to code_writer):

Please use the load_data function to load the data and please calculate the average age of all people.

--------------------------------------------------------------------------------
code_writer (to code_executor_agent):

Below is the python code to load the data using the `load_data()` function and calculate the average age of all people. 

```python
# python code
from functions import load_data
import numpy as np

# Load the data
df = load_data()

# Calculate the average age
avg_age = np.mean(df['age'])

print("The average age is", avg_age)
```

This code starts by importing the `load_data()` function. It then uses this function to load the data into a variable `df`. Afterwards, it calculates the average (mean) of the 'age' column in the DataFrame, before printing the result.

--------------------------------------------------------------------------------

>>>>>>>> EXECUTING CODE BLOCK (inferred language is python)...
code_executor_agent (to code_writer):

exitcode: 0 (execution succeeded)
Code output: The average age is 30.75


--------------------------------------------------------------------------------
code_writer (to code_executor_agent):

Great! The code worked fine. So, the average age of all people in the dataset is 30.75 years.

Related issue number

#1421

Checks

[ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
[ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
[ ] I've made sure all auto checks have passed.

Mar 20 '24 19:03 jackgerrits

Codecov Report

Attention: Patch coverage is 86.84211%, with 15 lines in your changes missing coverage. Please review.

Project coverage is 56.52%. Comparing base (95c0118) to head (147530a). Report is 1 commit behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| autogen/coding/func_with_reqs.py | 82.60% | 10 Missing and 2 partials :warning: |
| autogen/coding/local_commandline_code_executor.py | 93.33% | 3 Missing :warning: |

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2102       +/-   ##
===========================================
+ Coverage   37.28%   56.52%   +19.24%     
===========================================
  Files          74       75        +1     
  Lines        7480     7589      +109     
  Branches     1617     1770      +153     
===========================================
+ Hits         2789     4290     +1501     
+ Misses       4450     2944     -1506     
- Partials      241      355      +114

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `56.43% <86.84%> (+19.16%)` | :arrow_up: |

Mar 20 '24 19:03 codecov-commenter

Could the RHS also be a method we provide, while still letting the user edit the agent's system message (e.g., by appending)?

code_writer_system_message += f"""You have access to the following user defined functions. They can be accessed from the module called `{LocalCommandLineCodeExecutor.FUNCTIONS_MODULE}` by their function names.

For example, if there was a function called `foo` you could import it by writing `from {LocalCommandLineCodeExecutor.FUNCTIONS_MODULE} import foo`

{nlnl.join([to_stub(func) for func in executor.functions])}
"""
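
For illustration, a stub renderer along the lines of `to_stub` above could be built from the standard `inspect` module. This is only a sketch: `to_stub_sketch` and the `add` example are hypothetical names, and the actual helper in this PR may format its output differently.

```python
import inspect


def to_stub_sketch(func) -> str:
    """Render a function as its signature plus docstring, with the body elided.

    Hypothetical helper for illustration; not the PR's actual `to_stub`.
    """
    signature = inspect.signature(func)
    stub = f"def {func.__name__}{signature}:\n"
    doc = inspect.getdoc(func)
    if doc:
        # Re-indent continuation lines so the docstring sits inside the function body.
        indented = doc.replace("\n", "\n    ")
        stub += f'    """{indented}"""\n'
    stub += "    ...\n"
    return stub


def add(a: int, b: int = 1) -> int:
    """Add two integers."""
    return a + b


stub_text = to_stub_sketch(add)
print(stub_text)
```

Showing the model only the signature and docstring keeps the prompt short while still telling it how to call each function.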

Mar 20 '24 19:03 gagb

Awesome!

Mar 20 '24 20:03 bassilkhilo

@victordibia can you please take a quick look to see if this can be used to implement skills in AutoGen Studio?

Mar 21 '24 00:03 ekzhu

@ekzhu feedback addressed, should be good to go now

Mar 26 '24 19:03 jackgerrits

@jackgerrits is there a way functions can take a string implementation of the function as input instead of a callable? Two UIs need this feature.

Mar 27 '24 07:03 gagb

> @jackgerrits is there a way functions can take a string implementation of the function as input instead of a callable? Two UIs need this feature.

Yeah, I plan on adding this in a follow-up PR.
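
As a rough sketch of what string-based functions could look like (not the design of the follow-up PR): the source text is parsed with `ast` to recover the function name, written to a `functions.py` module in the working directory, and imported by a script run in a separate interpreter. `FunctionFromString`, `average`, and the file names are assumptions made for this example.

```python
import ast
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FunctionFromString:
    """Hypothetical container for a function supplied as source text."""

    source: str

    @property
    def name(self) -> str:
        """Parse the source and return the name of its single top-level function."""
        tree = ast.parse(self.source)
        funcs = [node for node in tree.body if isinstance(node, ast.FunctionDef)]
        if len(funcs) != 1:
            raise ValueError("expected exactly one top-level function definition")
        return funcs[0].name


SOURCE = '''
def average(values):
    """Return the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)
'''

func = FunctionFromString(SOURCE)

with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)
    # Expose the function as a module in the working directory.
    (work_dir / "functions.py").write_text(func.source)
    # Simulate model-written code that imports and calls the function.
    script = work_dir / "script.py"
    script.write_text(
        f"from functions import {func.name}\n"
        f"print({func.name}([24, 13, 53, 33]))\n"
    )
    result = subprocess.run(
        [sys.executable, str(script)],
        cwd=work_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    output = result.stdout.strip()

print(output)  # 30.75
```

Because the function never has to exist as a live callable in the host process, a UI can store and edit it as plain text.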

Mar 27 '24 11:03 jackgerrits

This is looking great @jackgerrits, thanks! For AGS, two things that would be helpful:

- ability to pass in string versions of the function (similar to @gagb 's request above)
- perhaps a callback that helps track when the function was executed. This is helpful in building monitoring/debugging tools. This might be a future PR too.

Mar 27 '24 14:03 victordibia

> ability to pass in string versions of the function (similar to @gagb 's request above)

100% agree.

> perhaps a callback that helps track when the function was executed. This is helpful in building monitoring/debugging tools. This might be a future PR too.

This one is interesting. One approach I can think of right now is for the executor to decorate the UDFs and record an execution log in the working directory that can be read later.
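
A minimal sketch of that idea, assuming a JSON-lines log file: the decorator `log_calls`, the file name `udf_calls.jsonl`, and the record fields are all hypothetical and only illustrate the shape such a log could take.

```python
import functools
import json
import tempfile
import time
from pathlib import Path


def log_calls(log_path: Path):
    """Return a decorator that appends one JSON line per call to log_path."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            status = "error"  # overwritten only if the call returns normally
            try:
                result = func(*args, **kwargs)
                status = "ok"
                return result
            finally:
                record = {
                    "function": func.__name__,
                    "status": status,
                    "started_at": start,
                    "duration_s": time.time() - start,
                }
                with log_path.open("a", encoding="utf-8") as f:
                    f.write(json.dumps(record) + "\n")

        return wrapper

    return decorator


# Stand-in for the executor's working directory.
work_dir = Path(tempfile.mkdtemp())
log_file = work_dir / "udf_calls.jsonl"


@log_calls(log_file)
def double(x):
    return 2 * x


value = double(21)
records = [json.loads(line) for line in log_file.read_text().splitlines()]
print(records[0]["function"], records[0]["status"])
```

Writing to a file rather than invoking an in-process callback matters here because the generated code runs in a separate process, so the host can only observe calls through something persisted in the working directory.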

Mar 27 '24 16:03 jackgerrits