Andy
Hey @sanchit-gandhi. Due to other commitments, I currently don't have the bandwidth to continue this, and the timeline for me to get to it is unknown right now. If someone else wants...
What manual tests need to be done to confirm the correctness of this?
This logic has not been tested for skipping entire benchmarks. I only tested whether it skips the Server and Offline scenarios.
Yes, that would be great! I don't have permission to create; would you be able to assist? Something like an `adas` label should be fine for now, I think.
Thanks!
> If the bucket exists, the script will continue and use the existing ones IIRC

Yes, but only if the current `USER` is the original creator/owner of the bucket, right? A...
For example, if mlperf_log_summary.txt reports 1 sample/sec and 100 tokens/sec for Server, and reports 1.1 sample/sec and 110 tokens/sec for Offline, the submission checker will produce summary.csv with: - Server...
Can you give an example of how/why the agent gets stuck in a loop? Is it when the LLM's response always contains a `FunctionCall` in the content of its results?...
This PR is only enabling the o1 model for AutoGen 0.2, right? The counterpart for AutoGen 0.4 is this one https://github.com/microsoft/autogen/issues/3884, right? (which I imagine is a bit more complicated...
Do we know if this PR will get merged soon? Maybe in tomorrow's meeting? Otherwise, it will delay people from completing their compliance runs in a timely manner.