autogen icon indicating copy to clipboard operation
autogen copied to clipboard

Inconsistent dictionary structure in autogen.ChatCompletion.logged_history

Open victordibia opened this issue 2 years ago • 5 comments

#179 added much needed support for tracking token count and cost (thanks @kevin666aa ). However, there is some unexpected/inconsistent structure in the dictionary returned.

Currently autogen.ChatCompletion.start_logging(compact=True) is used to start a logging session and ends with autogen.ChatCompletion.stop_logging(). Next the logs can be accessed via autogen.ChatCompletion.logged_history.

Unexpected Structure in autogen.ChatCompletion.logged_history when compact=True

When compact is set to True, logged_history is a dictionary. However, the key is the entire chat history


{
    """
    [
        {
            'role': 'system',
            'content': system_message,
        }, ...
    ]""": {
        "created_at": [0, 1],
        "cost": [0.1, 0.2],
    }
}

This makes it very challenging to reuse this data structure in apps. It might be valuable to have output with some structured keys.

{
   "messages": [..]
   "created_at": []
   "cost": ...
}

Further more, the structure of logged_history is significantly different when compact=False

{0: {'request': {'messages': [{'content': 'Y..  ',
     'role': 'system'},
    ],
   'model': 'gpt-4',
   'temperature': 0,
   'api_key': '..'},
  'response': {'id': 'chatcmpl-...Y',
   'object': 'chat.completion',
   'created': 1698203546,
   'model': 'gpt-4-0613',
   'choices': [{'index': 0,
     'message': {'role': 'assistant',
      'content': 'Yes,..'},
     'finish_reason': 'stop'}],
   'usage': {'prompt_tokens': 689,
    'completion_tokens': 65,
    'total_tokens': 754},
   'cost': 0.024569999999999998}}}

Potential action items

  • Improve the key value for logged history compact=True
  • Unify the data data structure across both compact=True and compact=False.

Happy to get more thoughts here @gagb @afourney @pcdeadeasy

Related .

Documentation here may need an update.

victordibia avatar Oct 25 '23 03:10 victordibia

These are great observations. I've mainly only been using compact=False in the Testbed, since the intention is to log as much as is possible.

Even in verbose logging, it's weird to have a dictionary instead of a list, unless we're expecting it to be sparse at some point? Basically it means we need to be a little careful when iterating through the items... there's no guarantee they will be sequential in this structure.

afourney avatar Oct 25 '23 03:10 afourney

@victordibia, all your observations are true. It will be good to define/generate a JSON Schema for logging messages and use it consistently for messages. Having the whole message as a key is indeed an unusual choice. So, I agree with your proposal to make this consistent irrespective of the flag.

pcdeadeasy avatar Oct 25 '23 04:10 pcdeadeasy

Yes, the dictionary returned with compact = True or False is quite different, and I did spend some time to make them consistent to and the func print_usage_summary. It would be great if there is a consistent and easy-to-use structure.

yiranwu0 avatar Oct 25 '23 04:10 yiranwu0

Thanks @kevin666aa . It looks like the new changes driven by updates to the openai lib will be relevant here #203, #7 . I will revisit this when #203 is complete.

victordibia avatar Oct 26 '23 16:10 victordibia

I managed to find a work around that seems to work for me, e.g.:

    conversations = dict()
    autogen.ChatCompletion.start_logging(conversations)

    # ... do AutoGen stuff...

    # extract content from an entry (in the conversations list):
    get_content_list = lambda conversations: list(conversations)[0]
    content_list = get_content_list(conversations)
   
    content_list_json = json.loads(content_list)
    
    # get content of all conversations:
    conversations_content = "\n".join(conversation.get('content') for conversation in content_list_json)

Quite crazy having to do this kind of stuff... I hope it will be fixed soon!

mclassen avatar Nov 25 '23 23:11 mclassen

seems super stale and a workaround is posted. closing won't fix

rysweet avatar Oct 12 '24 02:10 rysweet