Suggested enhancements post-restructure
As we work through #229, it will be useful to keep track of high-ROI, quality-of-life enhancements.
Problem 1:
We have too many run.py-style files scattered across the codebase. They all do very similar things (parse args, load configs, spin up multiprocessing, initialize resources), each with minor tweaks. Multiple entrypoints will not stay maintainable under a package approach unless planned carefully.
- Duplicated code across 10+ files
- More effort to update shared logic
- Inconsistent interfaces for doing nearly the same things
Proposed fix:
Replace all scattered run.py files with a single unified CLI:
python -m fle.run eval --algorithm independent --config config.json
python -m fle.run server --transport stdio
python -m fle.run data --operation trace --version 330
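A thin dispatcher for such a CLI could be built on argparse subcommands. A minimal sketch: the subcommand and flag names come from the examples above; everything else (defaults, help strings) is illustrative, not settled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """One parser, one subcommand per former run.py entrypoint."""
    parser = argparse.ArgumentParser(prog="fle.run", description="Unified FLE entrypoint")
    sub = parser.add_subparsers(dest="command", required=True)

    eval_p = sub.add_parser("eval", help="run an evaluation")
    eval_p.add_argument("--algorithm", default="independent")
    eval_p.add_argument("--config", default="config.json")

    server_p = sub.add_parser("server", help="start the server")
    server_p.add_argument("--transport", default="stdio")

    data_p = sub.add_parser("data", help="data operations")
    data_p.add_argument("--operation", required=True)
    data_p.add_argument("--version", type=int)

    return parser


# Usage: args = build_parser().parse_args(["eval", "--algorithm", "independent"])
```

Each subcommand would then hand its parsed namespace to the matching runner class, so argument handling lives in exactly one place.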
Also, abstract the shared patterns
Introduce clean base classes to avoid repeating ourselves:
- BaseRunner: shared runner structure
- AsyncRunner, MultiprocessRunner: handle async and multiprocessing setups
- ConfigRunner: handles JSON config parsing
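One possible shape for these base classes, as a minimal sketch: the class responsibilities come from the list above, but the method names (`setup`, `run`, `execute`, `from_json`, `run_async`) are illustrative. MultiprocessRunner would follow the same pattern around a process pool and is omitted for brevity.

```python
import abc
import asyncio
import json
from pathlib import Path


class BaseRunner(abc.ABC):
    """Shared structure: hold config, set up resources, run, tear down."""

    def __init__(self, config: dict):
        self.config = config

    def setup(self) -> None:  # override to init resources
        pass

    def teardown(self) -> None:  # override to release resources
        pass

    @abc.abstractmethod
    def run(self):
        """Subclass-specific work."""

    def execute(self):
        """Template method: setup -> run -> guaranteed teardown."""
        self.setup()
        try:
            return self.run()
        finally:
            self.teardown()


class ConfigRunner(BaseRunner):
    """Adds JSON config loading on top of BaseRunner."""

    @classmethod
    def from_json(cls, path: str) -> "ConfigRunner":
        return cls(json.loads(Path(path).read_text()))


class AsyncRunner(BaseRunner):
    """Bridges an async entrypoint into the synchronous execute() flow."""

    async def run_async(self):
        raise NotImplementedError

    def run(self):
        return asyncio.run(self.run_async())
```

Entrypoints then shrink to a subclass that overrides `run` (or `run_async`), and the arg parsing / resource lifecycle stays in one place.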
Problem 2:
The config's source of truth is scattered across multiple files: configuration and path management are inconsistent and duplicated throughout the codebase, with no centralized approach to discovering paths or managing environment variables.
Proposed fix:
- ConfigManager (fle/config.py): single source of truth for all paths and settings
- Environment discovery: automatic path resolution with env var overrides and sensible defaults
This gives us:
- Clear interprocess communication and spawning: all processes point to the same, or deterministically hierarchical, configs
- Better logging: passed configs provide context for log records
- CI/CD readiness: cleaner separation between dev and prod configs
Is this what you are thinking for config.py? https://github.com/JackHopkins/factorio-learning-environment/wiki/c
```python
from pydantic import BaseModel, Field, validator
from typing import Optional, List


class FLEConfig(BaseModel):
    # Database
    db_type: str = Field(default="sqlite", description="Database type: sqlite or postgres")
    sqlite_db_file: Optional[str] = Field(default=".fle/data.db")

    # API keys (optional for basic usage)
    openai_api_key: Optional[str] = None
    anthropic_api_key: Optional[str] = None

    # Docker/Cluster
    cluster_name: Optional[str] = None

    @validator('db_type')
    def validate_db_type(cls, v):
        if v not in ['sqlite', 'postgres']:
            raise ValueError('db_type must be "sqlite" or "postgres"')
        return v

    @classmethod
    def from_env_file(cls, env_path: str = ".env"):
        """Load config from a .env file."""
        from dotenv import load_dotenv
        import os

        load_dotenv(env_path)
        return cls(
            db_type=os.getenv("FLE_DB_TYPE", "sqlite"),
            sqlite_db_file=os.getenv("SQLITE_DB_FILE", ".fle/data.db"),
            openai_api_key=os.getenv("OPENAI_API_KEY"),
            anthropic_api_key=os.getenv("ANTHROPIC_API_KEY"),
            cluster_name=os.getenv("CLUSTER_NAME"),
        )

    def validate_for_experiment(self) -> List[str]:
        """Return a list of missing required fields for experiments."""
        missing = []
        # Experiments need at least one API key
        if not self.openai_api_key and not self.anthropic_api_key:
            missing.append("At least one API key (OPENAI_API_KEY or ANTHROPIC_API_KEY)")
        return missing
```
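On the path side of Problem 2, the env-var-override-with-sensible-default pattern is small enough to centralize in a single helper. A stdlib-only sketch, where `FLE_DATA_DIR` and the `.fle` default location are placeholder names, not settled choices:

```python
import os
from pathlib import Path


def resolve_path(env_var: str, default: str) -> Path:
    """Resolve a directory: an env var override wins, otherwise fall back
    to a default under the user's home; create it so callers can rely on
    the path existing."""
    raw = os.environ.get(env_var)
    path = Path(raw).expanduser() if raw else Path.home() / default
    path.mkdir(parents=True, exist_ok=True)
    return path


# e.g. data_dir = resolve_path("FLE_DATA_DIR", ".fle")  # hypothetical env var name
```

Every process spawned by the unified CLI would then call the same resolver, which is what makes the "same or deterministically hierarchical configs" property above actually hold.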