[Sample PR] Add Tools for Managing Agent Traces in Hugging Face Datasets

reference: #53

Summary

This PR introduces a foundational implementation for managing and uploading agent traces to Hugging Face datasets. It provides tools for adding traces, maintaining an index dataset for easy retrieval, and enforcing whitelist-based constraints for legal compliance.

Key Features

1. Hugging Face Dataset Structure

Index Dataset: Stores metadata for each trace, allowing easy querying based on attributes.
Trace Dataset: Contains actual zipped trace files, which can be retrieved via pointers from the index.
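
As a rough illustration of the two-dataset layout, the sketch below models the index as a list of metadata rows and the trace dataset as a mapping from pointer to zipped bytes. The in-memory containers, the pointer format, the sample values, and the helper names `make_zip`, `query_index`, and `fetch_trace` are assumptions for illustration, not the code in this PR; only the field names come from this description.

```python
import io
import zipfile


def make_zip(files):
    """Zip a {filename: text} mapping into bytes, standing in for a zipped trace."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return buf.getvalue()


# Trace dataset stand-in: pointer -> zipped trace content (hypothetical pointer format).
trace_store = {
    "traces/exp_001.zip": make_zip({"steps.json": "[]"}),
    "traces/exp_002.zip": make_zip({"steps.json": "[1]"}),
}

# Index dataset stand-in: one metadata row per trace (sample values are made up).
index_rows = [
    {"study_name": "study_a", "llm": "model_x", "benchmark": "bench_1",
     "license": "mit", "trace_pointer": "traces/exp_001.zip"},
    {"study_name": "study_a", "llm": "model_y", "benchmark": "bench_1",
     "license": "mit", "trace_pointer": "traces/exp_002.zip"},
]


def query_index(rows, **filters):
    """Return the index rows whose attributes match every given filter."""
    return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


def fetch_trace(row, store):
    """Follow a row's trace_pointer and return the unzipped files as {name: text}."""
    with zipfile.ZipFile(io.BytesIO(store[row["trace_pointer"]])) as zf:
        return {name: zf.read(name).decode("utf-8") for name in zf.namelist()}
```

With this shape, a query only touches the small index; the zipped content is read once a matching row has been found.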

2. Upload System

Functionality to upload traces one by one.
Automated grouping of traces by study.
Metadata generation, including:
- study_name, llm, benchmark, and license.
- A reference (trace_pointer) to the actual trace file.
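
The metadata generation and grouping steps above could be sketched roughly as follows. The function names `build_metadata` and `group_by_study`, and the choice to reject rows with missing fields, are assumptions made for illustration rather than the behavior of this PR's code.

```python
from collections import defaultdict

# Metadata fields listed in this PR description.
REQUIRED_FIELDS = ("study_name", "llm", "benchmark", "license")


def build_metadata(trace_pointer, **fields):
    """Build one index row; raises ValueError if a required field is absent (assumed policy)."""
    missing = [f for f in REQUIRED_FIELDS if f not in fields]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")
    row = {f: fields[f] for f in REQUIRED_FIELDS}
    row["trace_pointer"] = trace_pointer
    return row


def group_by_study(rows):
    """Group index rows by their study_name, preserving input order within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["study_name"]].append(row)
    return dict(groups)
```

Uploading traces one by one would then amount to calling `build_metadata` once per trace and appending the resulting row to the index.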

Notes

This is a sample template and can be expanded upon.
Future work may include better versioning, enhanced querying capabilities, and automated dataset updates.

Checklist

[x] Upload functionality
[x] Query functionality
[ ] Legal compliance checks
[ ] Documentation

Feb 10 '25 08:02 RohitP2005

Hello @RohitP2005, this looks very interesting, thank you! Aside from the previous comments, there is a design aspect that needs to be changed. At the moment you have 2 tables: one that holds experiment metadata, and one that holds the corresponding zipped experiment content that the metadata points to.

Ideally, we would have a third table on top of this, with one entry per study (as in the reproducibility_journal.csv file), each identified by a key. The entries in the experiment metadata table would point to that key. This way we could query per llm/benchmark like you did, but also, very importantly, per study.
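
A minimal sketch of that three-level layout, to make the idea concrete. The class names, field names such as `study_key`, the sample rows, and both helper functions are hypothetical; the zipped content table is represented only by the `trace_pointer` string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyRow:
    study_key: str    # unique key, one row per study
    study_name: str


@dataclass(frozen=True)
class ExperimentRow:
    study_key: str      # points to a StudyRow
    llm: str
    benchmark: str
    trace_pointer: str  # points to the zipped experiment content


# Made-up sample data for illustration.
studies = {
    "s1": StudyRow("s1", "study_a"),
    "s2": StudyRow("s2", "study_b"),
}
experiments = [
    ExperimentRow("s1", "model_x", "bench_1", "traces/exp_001.zip"),
    ExperimentRow("s1", "model_y", "bench_1", "traces/exp_002.zip"),
    ExperimentRow("s2", "model_x", "bench_2", "traces/exp_003.zip"),
]


def experiments_in_study(study_key, experiments):
    """Per-study query: all experiment rows that point to the given study key."""
    return [e for e in experiments if e.study_key == study_key]


def studies_using(experiments, studies, llm=None, benchmark=None):
    """Per-llm/benchmark query, resolved back to the studies that contain a match."""
    keys = {
        e.study_key
        for e in experiments
        if (llm is None or e.llm == llm)
        and (benchmark is None or e.benchmark == benchmark)
    }
    return sorted((studies[k] for k in keys), key=lambda s: s.study_key)
```

Keeping the study key on each experiment row means both kinds of query work from the same metadata table, without opening any zipped content.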

Feb 18 '25 18:02 TLSDC

Yeah, understood. I will look into it as soon as possible.

Feb 27 '25 17:02 RohitP2005

Hey @TLSDC @recursix

I thought of restructuring the trace uploads and creating classes for Study and Experiments, with methods within them for their functionality. The functions are implemented in the utils files. Also, query functionality has been added.

Kindly refer to Discord for a detailed description.

Feb 28 '25 21:02 RohitP2005