[Template] Add pydantic-settings to handle configuration from different sources (.env, YAML, environment variables, ...)

Open • pdamoune opened this issue 11 months ago • 1 comment

Use case

While developing a connector, the configuration can come from different sources, for example:

  • Mostly in dev: config.yml file
  • Mostly in prod: environment variables

Current Workaround

We have a config file that extracts and sets all the configuration from environment variables or config.yml.

Today's implementation is:

import os
from pathlib import Path

import yaml
from pycti import get_config_variable


class ConfigConnector:
    def __init__(self):
        """
        Initialize the connector with necessary configurations
        """

        # Load configuration file
        self.load = self._load_config()
        self._initialize_configurations()

    @staticmethod
    def _load_config() -> dict:
        """
        Load the configuration from the YAML file
        :return: Configuration dictionary
        """
        config_file_path = Path(__file__).parents[1].joinpath("config.yml")
        config = (
            yaml.load(open(config_file_path), Loader=yaml.FullLoader)
            if os.path.isfile(config_file_path)
            else {}
        )

        return config

    def _initialize_configurations(self) -> None:
        """
        Connector configuration variables
        :return: None
        """
        # OpenCTI configurations
        self.duration_period = get_config_variable(
            "CONNECTOR_DURATION_PERIOD",
            ["connector", "duration_period"],
            self.load,
        )

        # Connector extra parameters
        self.api_base_url = get_config_variable(
            "CONNECTOR_TEMPLATE_API_BASE_URL",
            ["connector_template", "api_base_url"],
            self.load,
        )

        self.api_key = get_config_variable(
            "CONNECTOR_TEMPLATE_API_KEY",
            ["connector_template", "api_key"],
            self.load,
        )
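
The lookup order that the class above relies on can be sketched without pycti. This is a simplified stand-in, not pycti's actual code: the name `lookup` is hypothetical, and it assumes that the environment variable is checked first, then the nested YAML path, then a default.

```python
import os
from typing import Any


def lookup(env_var: str, yaml_path: list[str], config: dict, default: Any = None) -> Any:
    """Hypothetical stand-in: environment variable, then nested config, then default."""
    value = os.environ.get(env_var)
    if value is not None:
        return value
    node: Any = config
    for key in yaml_path:
        # Stop as soon as one level of the path is missing.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Stands in for the parsed content of config.yml.
config = {"connector_template": {"api_key": "from-yaml"}}
path = ["connector_template", "api_key"]

os.environ.pop("CONNECTOR_TEMPLATE_API_KEY", None)
from_yaml = lookup("CONNECTOR_TEMPLATE_API_KEY", path, config)

os.environ["CONNECTOR_TEMPLATE_API_KEY"] = "from-env"
from_env = lookup("CONNECTOR_TEMPLATE_API_KEY", path, config)
os.environ.pop("CONNECTOR_TEMPLATE_API_KEY", None)

missing = lookup("CONNECTOR_TEMPLATE_API_BASE_URL", ["connector_template", "api_base_url"], config)
print(from_yaml, from_env, missing)
```

Every value comes back untyped (a string from the environment, whatever YAML parsed otherwise), which is one reason typed settings models are attractive here.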

Proposed Solution

To make configuration easier to use, consistent across projects, and more flexible, we implemented it using pydantic-settings.

The implementation would result in:

from datetime import timedelta

from pydantic import Field, HttpUrl, SecretStr
from pydantic_settings import BaseSettings

from config.base_config import BaseConfig


class MyConnectorConfig(BaseSettings):
    duration_period: timedelta = Field(description="Duration between connector runs")
    api_base_url: HttpUrl = Field(description="API base URL.")
    api_key: SecretStr = Field(description="API key.")


class Config(BaseConfig):
    my_connector: MyConnectorConfig

BaseConfig is a shared class that holds all the possible parameters for the helper:

from pydantic import HttpUrl, SecretStr
from pydantic_settings import (
    BaseSettings,
    PydanticBaseSettingsSource,
    SettingsConfigDict,
    YamlConfigSettingsSource,
)

"""
    These classes should live in pycti and be used by the OpenCTIHelper.

    All the commented-out variables have default values in the OpenCTI helper.
    - Remove the old configuration from the OpenCTI helper.
    - Implement the new configuration classes in the OpenCTI helper.
    - Properly type and set all the properties of these classes.

    Then, all the variables of these classes will be customizable through
    .env, config.yml and/or environment variables.

    If a variable is set in several places, the first source that provides it wins, in this order:
        1. Secret files
        2. YAML file
        3. .env file
        4. Initial settings
        5. Environment variables
        6. Default value
"""


class _BaseSettings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_nested_delimiter="_",  # FIXME: Should be "__"
        env_nested_max_split=1,  # FIXME: Must find another way
        yaml_file="config.yml",
    )

    @classmethod
    def settings_customise_sources(
        cls,
        settings_cls: type[BaseSettings],
        init_settings: PydanticBaseSettingsSource,
        env_settings: PydanticBaseSettingsSource,
        dotenv_settings: PydanticBaseSettingsSource,
        file_secret_settings: PydanticBaseSettingsSource,
    ) -> tuple[PydanticBaseSettingsSource, ...]:
        """
        Define the sources and their order for loading the settings values.
        """
        return (
            file_secret_settings,  # First: secret files
            YamlConfigSettingsSource(settings_cls),  # Optional: fallback YAML file
            dotenv_settings,  # Optional: fallback to .env file
            init_settings,  # Optional: fallback to initial settings
            env_settings,  # Optional: environment variables
        )


class _OpenCTIConfig(BaseSettings):

    url: HttpUrl
    token: SecretStr

    # json_logging: bool
    # ssl_verify: bool


class _ConnectorConfig(BaseSettings):
    # TODO : Enforce typing (Literal, etc.)

    id: str
    name: str
    type: str
    scope: list[str]

    # log_level: str
    # duration_period: datetime.timedelta
    # auto: bool
    # expose_metrics: bool
    # metrics_port: int
    # only_contextual: bool
    # run_and_terminate: bool
    # validate_before_import: bool
    # queue_protocol: str
    # queue_threshold: int

    # listen_protocol: str
    # listen_protocol_api_port: int
    # listen_protocol_api_path: str
    # listen_protocol_api_ssl: bool
    # listen_protocol_api_uri: str

    # live_stream_id: str
    # live_stream_listen_delete: bool
    # live_stream_no_dependencies: bool
    # live_stream_with_inferences: bool
    # live_stream_recover_iso_date: datetime.datetime
    # live_stream_start_timestamp: datetime.datetime

    # send_to_queue: bool
    # send_to_directory: bool
    # send_to_directory_path: str
    # send_to_directory_retention: int


class BaseConfig(_BaseSettings):
    opencti: _OpenCTIConfig
    connector: _ConnectorConfig

Additional Information

Would you be willing to submit a PR?

We strongly encourage you to submit a PR if you want and whenever you want. If your issue concerns a "Community-support" connector, your PR will probably be accepted after some review. If the connector is "Partner-support" or "Filigran-support", a dev team may take over but will base its work on your PR, speeding up the process. It will be much appreciated.

pdamoune • Apr 01 '25 14:04

To help with the composer, we need to be sure that this approach lets us:

  • Stay backward compatible with the old environment variables for connectors that will be migrated.
  • Express, for each attribute: a description, whether it is required, whether it is secret, a default value, and an enum when the possible values are restricted.
  • Generate a JSON schema contract from this configuration.
  • Set up connector ecosystem information (image URIs, tags, verified or not, ...).

richard-julien • May 25 '25 19:05