Schema Parser for AutoEDA Agent
Implement a module to parse a schema file (JSON or YAML) for AutoEDA, extracting:
- Column names
- Declared data types
- Metadata: primary keys, time indices, target variables
Input:
- Schema file (JSON or YAML) describing dataset structure and types
Output:
- Python dictionary with:
  - columns: list of column names
  - data_types: dict mapping columns to types
  - metadata: dict with keys for primary keys, time indices, target variables
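
To make the expected output concrete, here is one possible shape for the returned dictionary, sketched with `TypedDict`. The top-level keys `columns`, `data_types`, and `metadata` come from the description above; the class names and the metadata key spellings (`primary_keys`, `time_indices`, `target_variables`) are assumptions for illustration, as are the sample column names.

```python
from typing import Dict, List, TypedDict


class SchemaMetadata(TypedDict):
    # Key names are assumed; the issue only says "primary keys, time indices, target variables".
    primary_keys: List[str]
    time_indices: List[str]
    target_variables: List[str]


class ParsedSchema(TypedDict):
    columns: List[str]            # column names, in declaration order
    data_types: Dict[str, str]    # column name -> declared type
    metadata: SchemaMetadata


# Hypothetical example of what the parser could return for a small orders table.
example: ParsedSchema = {
    "columns": ["order_id", "order_date", "amount"],
    "data_types": {"order_id": "int", "order_date": "datetime", "amount": "float"},
    "metadata": {
        "primary_keys": ["order_id"],
        "time_indices": ["order_date"],
        "target_variables": ["amount"],
    },
}
```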
Requirements:
- Validate presence of required fields and handle missing/ambiguous cases with clear error messages.
- Keep the implementation modular for integration with the agentic loop and notebook generation using LangGraph.
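
A minimal sketch of how the requirements above might be met, not a definitive implementation. The issue does not fix the input layout, so this assumes a schema with a `columns` list of `{name, type}` entries plus optional top-level `primary_keys`, `time_indices`, and `target_variables`. `SchemaError`, `load_raw_schema`, and `parse_schema` are hypothetical names. YAML support assumes PyYAML is installed and is imported lazily so JSON-only use needs no extra dependency.

```python
import json
from pathlib import Path
from typing import Any, Dict


class SchemaError(ValueError):
    """Raised when a schema is missing required fields or is ambiguous."""


def load_raw_schema(path: str) -> Dict[str, Any]:
    """Read a JSON or YAML schema file into a plain dict."""
    suffix = Path(path).suffix.lower()
    text = Path(path).read_text(encoding="utf-8")
    if suffix == ".json":
        raw = json.loads(text)
    elif suffix in (".yaml", ".yml"):
        import yaml  # PyYAML (third-party), only needed for YAML input
        raw = yaml.safe_load(text)
    else:
        raise SchemaError(
            f"Unsupported schema extension '{suffix}' (expected .json, .yaml, or .yml)"
        )
    if not isinstance(raw, dict):
        raise SchemaError("Schema root must be a mapping")
    return raw


def parse_schema(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate the assumed layout and return columns, data_types, and metadata."""
    if not raw.get("columns"):
        raise SchemaError("Schema is missing the required 'columns' field")

    columns, data_types = [], {}
    for i, col in enumerate(raw["columns"]):
        if not isinstance(col, dict) or "name" not in col:
            raise SchemaError(f"Column entry {i} has no 'name'")
        name = col["name"]
        if name in data_types:
            raise SchemaError(f"Column '{name}' is declared more than once")
        if "type" not in col:
            raise SchemaError(f"Column '{name}' has no declared 'type'")
        columns.append(name)
        data_types[name] = col["type"]

    metadata = {}
    for key in ("primary_keys", "time_indices", "target_variables"):
        refs = raw.get(key, [])
        if isinstance(refs, str):
            refs = [refs]  # accept a single column name as shorthand
        unknown = [r for r in refs if r not in data_types]
        if unknown:
            raise SchemaError(f"'{key}' references undeclared columns: {unknown}")
        metadata[key] = list(refs)

    return {"columns": columns, "data_types": data_types, "metadata": metadata}
```

Keeping file loading separate from `parse_schema` means the validation step takes and returns plain dictionaries, which should make it straightforward to call from an agent step or a notebook cell without touching the filesystem.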
@gpsaggese @tkpratardan