Suggested enhancements post-restructure
As we work through #229, it will be useful to keep track of high-ROI, quality-of-life enhancements.
Problem 1:
We have too many run.py-style files scattered across the codebase. They all do very similar things (parse args, load configs, spin up multiprocessing, initialize resources), each with minor tweaks. Multiple entrypoints will not stay maintainable under a package approach unless planned carefully.
- Duplicated code across 10+ files
- More effort to update shared logic
- Inconsistent interfaces for doing nearly the same things
Proposed fix:
Replace all scattered run.py files with a single unified CLI:
python -m fle.run eval --algorithm independent --config config.json
python -m fle.run server --transport stdio
python -m fle.run data --operation trace --version 330
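A thin dispatcher for such a CLI could be built on argparse subcommands. A minimal sketch: the subcommand and flag names come from the examples above; everything else (defaults, help strings) is illustrative, not settled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """One parser, one subcommand per former run.py entrypoint."""
    parser = argparse.ArgumentParser(prog="fle.run", description="Unified FLE entrypoint")
    sub = parser.add_subparsers(dest="command", required=True)

    eval_p = sub.add_parser("eval", help="run an evaluation")
    eval_p.add_argument("--algorithm", default="independent")
    eval_p.add_argument("--config", default="config.json")

    server_p = sub.add_parser("server", help="start the server")
    server_p.add_argument("--transport", default="stdio")

    data_p = sub.add_parser("data", help="data operations")
    data_p.add_argument("--operation", required=True)
    data_p.add_argument("--version", type=int)

    return parser


# Usage: args = build_parser().parse_args(["eval", "--algorithm", "independent"])
```

Each subcommand would then hand its parsed namespace to the matching runner class, so argument handling lives in exactly one place.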
Also, abstract the shared patterns
Introduce clean base classes to avoid repeating ourselves:
- BaseRunner: shared runner structure
- AsyncRunner, MultiprocessRunner: handle async and multiprocessing setups
- ConfigRunner: handles JSON config parsing
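One possible shape for these base classes, as a minimal sketch: the class responsibilities come from the list above, but the method names (`setup`, `run`, `execute`, `from_json`, `run_async`) are illustrative. MultiprocessRunner would follow the same pattern around a process pool and is omitted for brevity.

```python
import abc
import asyncio
import json
from pathlib import Path


class BaseRunner(abc.ABC):
    """Shared structure: hold config, set up resources, run, tear down."""

    def __init__(self, config: dict):
        self.config = config

    def setup(self) -> None:  # override to init resources
        pass

    def teardown(self) -> None:  # override to release resources
        pass

    @abc.abstractmethod
    def run(self):
        """Subclass-specific work."""

    def execute(self):
        """Template method: setup -> run -> guaranteed teardown."""
        self.setup()
        try:
            return self.run()
        finally:
            self.teardown()


class ConfigRunner(BaseRunner):
    """Adds JSON config loading on top of BaseRunner."""

    @classmethod
    def from_json(cls, path: str) -> "ConfigRunner":
        return cls(json.loads(Path(path).read_text()))


class AsyncRunner(BaseRunner):
    """Bridges an async entrypoint into the synchronous execute() flow."""

    async def run_async(self):
        raise NotImplementedError

    def run(self):
        return asyncio.run(self.run_async())
```

Entrypoints then shrink to a subclass that overrides `run` (or `run_async`), and the arg parsing / resource lifecycle stays in one place.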
Problem 2:
The config's source of truth is scattered across multiple files: configuration and path management are inconsistent and duplicated throughout the codebase, with no centralized approach to discovering paths or managing environment variables.
Proposed fix:
- ConfigManager (fle/config.py): single source of truth for all paths and settings
- Environment discovery: automatic path resolution with env var overrides and sensible defaults
This gives us:
- Clear interprocess communication and spawning: all processes point to the same, or deterministically hierarchical, configs
- Better logging: passed configs provide context for log records
- CI/CD readiness: cleaner separation between dev and prod configs
Is this what you are thinking for config.py? https://github.com/JackHopkins/factorio-learning-environment/wiki/c
```python
from pydantic import BaseModel, Field, validator
from typing import Optional, List


class FLEConfig(BaseModel):
    # Database
    db_type: str = Field(default="sqlite", description="Database type: sqlite or postgres")
    sqlite_db_file: Optional[str] = Field(default=".fle/data.db")

    # API keys (optional for basic usage)
    openai_api_key: Optional[str] = None
    anthropic_api_key: Optional[str] = None

    # Docker/Cluster
    cluster_name: Optional[str] = None

    @validator('db_type')
    def validate_db_type(cls, v):
        if v not in ['sqlite', 'postgres']:
            raise ValueError('db_type must be "sqlite" or "postgres"')
        return v

    @classmethod
    def from_env_file(cls, env_path: str = ".env"):
        """Load config from a .env file."""
        from dotenv import load_dotenv
        import os

        load_dotenv(env_path)
        return cls(
            db_type=os.getenv("FLE_DB_TYPE", "sqlite"),
            sqlite_db_file=os.getenv("SQLITE_DB_FILE", ".fle/data.db"),
            openai_api_key=os.getenv("OPENAI_API_KEY"),
            anthropic_api_key=os.getenv("ANTHROPIC_API_KEY"),
            cluster_name=os.getenv("CLUSTER_NAME"),
        )

    def validate_for_experiment(self) -> List[str]:
        """Return a list of missing required fields for experiments."""
        missing = []
        # Experiments need at least one API key
        if not self.openai_api_key and not self.anthropic_api_key:
            missing.append("At least one API key (OPENAI_API_KEY or ANTHROPIC_API_KEY)")
        return missing
```
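On the path side of Problem 2, the env-var-override-with-sensible-default pattern is small enough to centralize in a single helper. A stdlib-only sketch, where `FLE_DATA_DIR` and the `.fle` default location are placeholder names, not settled choices:

```python
import os
from pathlib import Path


def resolve_path(env_var: str, default: str) -> Path:
    """Resolve a directory: an env var override wins, otherwise fall back
    to a default under the user's home; create it so callers can rely on
    the path existing."""
    raw = os.environ.get(env_var)
    path = Path(raw).expanduser() if raw else Path.home() / default
    path.mkdir(parents=True, exist_ok=True)
    return path


# e.g. data_dir = resolve_path("FLE_DATA_DIR", ".fle")  # hypothetical env var name
```

Every process spawned by the unified CLI would then call the same resolver, which is what makes the "same or deterministically hierarchical configs" property above actually hold.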