
[Enhancement] Pydantic models for IBind!

Open weklund opened this issue 7 months ago • 2 comments

Describe Enhancement

Introduce Pydantic models for all request inputs and response objects within the IBind library. This includes adding response types to all mixin functions to validate that we're enumerating the IBKR API correctly, since currently we just dump the response back to the user. This will also cover any @dataclass that we use.

Context

Currently, interacting with the IBind client requires extensive manual type checking and validation, leading to verbose and error-prone code. For instance, when executing market orders, developers must handle various type cases and validate responses manually. Here's an example of what I needed to do for a market order:

import datetime
from typing import Optional, Tuple


def place_market_order(
    client, # IbkrClient
    account_id: str,
    symbol: str,
    side: str,
    size: int,
    logger
) -> Optional[Tuple[str, float, int]]:
    """Places a market order and waits for fill.

    Args:
        client: IbkrClient instance.
        account_id: Account ID string.
        symbol: Symbol string (e.g., 'MNQ').
        side: 'BUY' or 'SELL'.
        size: Order quantity.
        logger: Logger instance.

    Returns:
        Tuple (order_tag, fill_price, conid) if successful, else None.

    """
    logger.info(f"Attempting to place a {side} market order for {size} contract(s) of {symbol}...")
    conid = None
    order_tag = None
    fill_price = None

    try:
        logger.info(f"Looking up active front-month contract for {symbol}...")
        conid = get_active_front_month_contract(client, symbol)
        if not conid:
            logger.error(f"Could not find active contract for {symbol}")
            return None
        logger.info(f"Found active contract conid: {conid}")

        order_tag = f'{ORDER_TAG_PREFIX}_{symbol}_{side}_{size}_{datetime.datetime.now().strftime("%Y%m%d_%H%M%S")}'
        order_request = OrderRequest(
            conid=conid,
            side=side.upper(),
            quantity=size,
            order_type='MKT',
            tif='DAY',
            acct_id=account_id,
            coid=order_tag
        )
        logger.info(f"Created Order Request: {order_request}")

        answers = {
            QuestionType.PRICE_PERCENTAGE_CONSTRAINT: True,
            QuestionType.ORDER_VALUE_LIMIT: True,
            "You are submitting an order without market data...": True, # Abbreviated for brevity
            'The following order .* size exceeds the Size Limit .*': True,
            "You are about to submit a stop order. Please be aware of the various stop order types available and the risks associated with each one.Are you sure you want to submit this order?": True,
            "Unforeseen new question": True,
        }
        logger.info("Defined answers for potential confirmation prompts.")

        logger.info(f"Submitting {order_request.side} order for {order_request.quantity} of conid {order_request.conid} (Tag: {order_tag})...")
        placement_result = client.place_order(order_request, answers, account_id)

        if not placement_result or (hasattr(placement_result, 'data') and not placement_result.data):
            logger.warning(f"Order placement for {symbol} might have failed or requires confirmation. Result: {placement_result}")
        else:
            logger.info(f"Order placement submitted. Result data (if any): {getattr(placement_result, 'data', 'N/A')}")

        filled_order_details = wait_for_order_fill(client, account_id, order_tag, logger)

        if filled_order_details:
            fill_price_str = filled_order_details.get('avgPrice')
            if fill_price_str is not None:
                try:
                    fill_price = float(fill_price_str)
                    logger.info(f"Market order {order_tag} filled at average price: {fill_price}")
                    return order_tag, fill_price, conid
                except ValueError:
                    logger.error(f"Could not convert fill price '{fill_price_str}' to float for order {order_tag}.")
                    return None
            else:
                logger.error(f"Filled order {order_tag} details received, but 'avgPrice' is missing: {filled_order_details}")
                return None
        else:
            logger.error(f"Market order {order_tag} did not fill within timeout.")
            # Attempt cancellation (best effort)
            logger.warning(f"Attempting to cancel potentially unfilled market order {order_tag}...")
            try:
                live_orders_result = client.live_orders(account_id=account_id)
                order_to_cancel = None
                if live_orders_result and isinstance(live_orders_result.data, dict) and 'orders' in live_orders_result.data:
                    orders_list = live_orders_result.data.get('orders', [])
                    if isinstance(orders_list, list):
                        for order in orders_list:
                            if isinstance(order, dict) and order.get('order_ref') == order_tag:
                                ib_order_id = order.get('orderId')
                                if ib_order_id and order.get('status') not in ['Filled', 'Cancelled', 'Expired', 'Inactive']:
                                    order_to_cancel = ib_order_id
                                    break
                if order_to_cancel:
                    cancel_result = client.cancel_order(order_id=order_to_cancel, account_id=account_id)
                    logger.info(f"Cancellation attempt result for IB order ID {order_to_cancel} (tag {order_tag}): {cancel_result}")
                else:
                    logger.warning(f"Could not find active, non-filled IB order ID for tag {order_tag} to cancel.")
            except Exception as cancel_err:
                logger.exception(f"Failed to attempt cancellation for order tag {order_tag}: {cancel_err}")
            return None

    except Exception as e:
        logger.exception(f"An error occurred during market order placement/monitoring for {symbol}: {e}")
        if order_tag:
            logger.error(f"An error occurred after potentially submitting order {order_tag}. State uncertain.")
        return None

The lack of structured data models necessitates repetitive and defensive programming. For example, after placing an order, I must verify the structure and types of the response data manually before proceeding. This approach is not only time-consuming but also increases the risk of runtime errors.

By adopting Pydantic models, we can:

  • Ensure data integrity through automatic validation.
  • Provide clear and informative error messages.
  • Enhance developer experience with IDE support and type hints.
  • Reduce boilerplate code and simplify unit testing.

Automatic validation

Pydantic validates types at runtime, catching issues early (e.g. string instead of float, invalid enum values). With dataclass, incorrect types silently pass through unless manually validated.

from typing import Literal, Optional

from pydantic import BaseModel

class OrderRequestModel(BaseModel):
    conid: int
    side: Literal["BUY", "SELL"]
    quantity: float
    order_type: str
    acct_id: str
    price: Optional[float] = None

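Passing quantity="abc" or side="HOLD" would raise a ValidationError immediately, unlike a dataclass. Note that in Pydantic's default (lax) mode a numeric string such as quantity="100" is coerced to 100.0 rather than rejected; strict mode is needed to reject it.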
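
A minimal sketch of that behaviour, assuming Pydantic v2. `OrderRequestDataclass` is a hypothetical counterpart added only for comparison, and the account ID `DU123` is a placeholder:

```python
from dataclasses import dataclass
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


@dataclass
class OrderRequestDataclass:  # hypothetical dataclass counterpart, for comparison only
    conid: int
    side: str
    quantity: float


class OrderRequestModel(BaseModel):
    conid: int
    side: Literal["BUY", "SELL"]
    quantity: float
    order_type: str
    acct_id: str
    price: Optional[float] = None


# A dataclass stores whatever it is given; annotations are not enforced.
unchecked = OrderRequestDataclass(conid=1, side="HOLD", quantity="abc")

# Pydantic's default (lax) mode coerces a numeric string to float.
coerced = OrderRequestModel(
    conid=1, side="BUY", quantity="100", order_type="MKT", acct_id="DU123"
)

# A non-numeric quantity and an unknown side are both reported in one error.
failed_fields = []
try:
    OrderRequestModel(
        conid=1, side="HOLD", quantity="abc", order_type="MKT", acct_id="DU123"
    )
except ValidationError as exc:
    failed_fields = sorted(err["loc"][0] for err in exc.errors())

# Strict mode rejects the numeric string instead of coercing it.
try:
    OrderRequestModel.model_validate(
        {"conid": 1, "side": "BUY", "quantity": "100", "order_type": "MKT", "acct_id": "DU123"},
        strict=True,
    )
    strict_rejected = False
except ValidationError:
    strict_rejected = True
```

Whether IBind would want lax or strict validation is an open design choice; lax mode is more forgiving of string-typed numbers in responses, while strict mode surfaces them.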

Reusable Schemas for Response Validation

Many IBKR responses are opaque or inconsistent. Pydantic allows us to strictly define and parse those, ensuring downstream safety.

class OrderResponseModel(BaseModel):
    order_id: int
    status: Literal["Submitted", "Filled", "Cancelled"]
    filled_quantity: float
    avg_fill_price: float

IDE support

Pydantic models are fully type-annotated, so editors like VS Code and IntelliJ can offer IntelliSense and autocomplete on them, improving DX and reducing error rates in larger projects.

Possible Implementation

I can make a test PR that tries out Pydantic on a mixin that maybe isn't used as much? Or maybe go straight into something like Orders? I'm open.

We can do a few things in this test PR:

  • Create BaseModel subclasses for types of a particular mixin
  • Use Pydantic’s alias feature to handle camelCase transformation
  • Add .model_validate() and .model_dump(by_alias=True) usage in client logic
  • Gradually roll out to a few objects.
  • Create some sort of test to see how it works in validation error scenarios to see a working example.
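
The alias bullet above could look roughly like this. It is a sketch assuming Pydantic v2 and its `pydantic.alias_generators.to_camel` helper; `CamelModel` and `OrderRequestSketch` are hypothetical names, and `DU123` is a placeholder account ID:

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class CamelModel(BaseModel):
    """Hypothetical shared base: snake_case in Python, camelCase on the wire."""

    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)


class OrderRequestSketch(CamelModel):
    conid: int
    side: Literal["BUY", "SELL"]
    quantity: int
    order_type: str
    acct_id: str


# Construct with Python field names...
request = OrderRequestSketch(
    conid=1, side="BUY", quantity=2, order_type="MKT", acct_id="DU123"
)
# ...serialise with camelCase keys for the outgoing payload...
payload = request.model_dump(by_alias=True)
# ...and parse camelCase input back into the same model.
parsed = OrderRequestSketch.model_validate(payload)
```

A shared base class keeps the alias rule in one place, so individual models would only need explicit `Field(alias=...)` where a key does not follow the camelCase pattern.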

Once we align on a pattern for a single small PR with a working example, we could even triage the other mixins to speed up implementation.

For illustration purposes, here's what a model could look like:

# --- IBindOrderRequestModel ---
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, Field

class IBindOrderRequestModel(BaseModel):
    conid: int
    
    # Using Literal for fields with a fixed set of string values
    side: Literal['BUY', 'SELL']
    quantity: int
    # Declared once: Literal validation plus the camelCase alias
    order_type: Literal['MKT', 'LMT', 'STP', 'TRAIL', 'REL', 'MIDPRICE'] = Field(alias="orderType") # Add all valid order types from IBKR/ibind
    tif: Literal['DAY', 'GTC', 'IOC', 'FOK', 'OPG'] # Add all valid Time-In-Force values
    
    coid: str  # Client Order ID (tag)

    # Named acct_id to match the keyword used when constructing the model
    acct_id: str = Field(alias="acctId")
    
    # Price is optional, typically used for LMT, STP orders
    price: Optional[float] = None
    
    # populate_by_name lets callers pass field names (order_type, acct_id) as well as aliases
    model_config = ConfigDict(extra='ignore', populate_by_name=True)

    # You could add validators here if needed, for example, to ensure
    # 'price' is provided for 'LMT' or 'STP' order types.
    # from pydantic import model_validator
    #
    # @model_validator(mode='after')
    # def check_price_for_relevant_order_types(self) -> 'IBindOrderRequestModel':
    #     if self.order_type in ['LMT', 'STP'] and self.price is None:
    #         raise ValueError(f"Price must be provided for order type {self.order_type}")
    #     if self.order_type == 'MKT' and self.price is not None:
    #         # Or just log a warning, as IB might ignore it for MKT orders
    #         # For strictness, you might want to ensure it's None
    #         # logger.warning("Price was provided for a MKT order and will likely be ignored.")
    #         pass # MKT orders usually don't take price, but API might allow it
    #     return self
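
To make the commented-out validator concrete, here is a minimal runnable sketch, assuming Pydantic v2. `PricedOrderSketch` is a hypothetical, trimmed-down model used only for illustration:

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError, model_validator


class PricedOrderSketch(BaseModel):  # hypothetical, trimmed-down model
    order_type: Literal["MKT", "LMT", "STP"]
    price: Optional[float] = None

    @model_validator(mode="after")
    def check_price_for_relevant_order_types(self) -> "PricedOrderSketch":
        # Runs after field validation, so both fields are already typed.
        if self.order_type in ("LMT", "STP") and self.price is None:
            raise ValueError(f"Price must be provided for order type {self.order_type}")
        return self


# A market order without a price passes.
market = PricedOrderSketch(order_type="MKT")

# A limit order without a price is rejected with the custom message.
try:
    PricedOrderSketch(order_type="LMT")
    message = ""
except ValidationError as exc:
    message = exc.errors()[0]["msg"]
```
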

I think this is an ideal case, but here's what my new market order code could look like:


    try:
        logger_param.info(f"Looking up active front-month contract for {symbol}...")
        # Assumption: helper returns a dict suitable for ContractDetailModel, or None if not found.
        raw_contract_data = get_active_front_month_contract(client, symbol)

        if not raw_contract_data: # Semantic check: contract not found
            logger_param.error(f"No contract data returned by helper for {symbol}.")
            return None
        
        # Pydantic parses raw_contract_data. Expects dict.
        # TypeError if not dict-like, ValidationError if fields mismatch.
        contract_detail = ContractDetailModel(**raw_contract_data)
        conid_to_return = contract_detail.conid # conid is mandatory in ContractDetailModel
        logger_param.info(f"Found active contract conid: {conid_to_return}")

        order_tag = f'{ORDER_TAG_PREFIX}_{symbol}_{side}_{size}_{datetime.datetime.now().strftime("%Y%m%d_%H%M%S")}'
        
        # Pydantic validates 'side', 'order_type', 'tif' against Literals here.
        order_request_model = IBindOrderRequestModel(
            conid=conid_to_return, side=side.upper(), quantity=size,
            order_type='MKT', tif='DAY', acct_id=account_id, coid=order_tag
        )
        logger_param.info(f"Created Pydantic Order Request: {order_request_model.model_dump_json(indent=2)}")

        answers_dict = {
            QuestionType.PRICE_PERCENTAGE_CONSTRAINT: True,
            QuestionType.ORDER_VALUE_LIMIT: True,
            "You are submitting an order without market data...": True, # Abbreviated
            "Unforeseen new question": True, # Catch-all for new questions
        } # Simplified for brevity

        logger_param.info(f"Submitting {order_request_model.side} order (Tag: {order_tag})...")
        
        # Assumption: client.place_order().data exists. Its content structure is consistent
        # (e.g., always a list of confirmations, or OrderPlacementResponseModel handles variations).
        placement_result_raw = client.place_order(order_request_model.model_dump(by_alias=True), answers_dict, account_id)

[...]
        

I'm not sure I explained this as well as I could, but this article gives good additional context: https://dev.to/jamesbmour/part-3-pydantic-data-models-4gnb

I'd like to bring this up for discussion now, as I think there would be throwaway work if we started on unit tests and didn't include Pydantic models in them as well. Thanks for reading!

weklund • May 14 '25 16:05

@weklund thanks for this detailed proposal! I'm open to introducing Pydantic, but as always - there seem to be arguments on both sides so... let's discuss 🚀

A couple of points to start with:

  • I did a bit of reading on Pydantic, articles that advocate it and ones that advocate against it, along with the one you linked. I also reviewed other issues where Pydantic was mentioned to understand what we could be fixing by introducing it.
  • I understand that it is considered a good standard for an open source library, especially for the type of library IBind is - that is a solid point for me

I think it would help me to understand it a little more if you could think of a different example to demonstrate its benefits, as:

  • The second code block looks shorter at a glance, but upon inspecting I notice that it's simply because some lines that have nothing to do with Pydantic are removed - like logging, Question/Answer pairs, or breaking out function calls into several lines. Let's compare apples to apples.
  • In the second code block you skip all the code that happens after place_order is used, which is where - if I understand the argument for Pydantic correctly - I think the difference would be most meaningful.
  • The original code doesn't demonstrate the pain points directly. It contains a bunch of redundant validation (eg. I think that hasattr(placement_result, 'data') is always True), and most of it is just more logic - like waiting for fills, getting live orders and cancelling an order. That logic would remain there if we used Pydantic, wouldn't it? Or if it wouldn't, then it's not clear to me, as currently the example just says [...].

I assume what you wanted me to focus on was the output validation, with things like

if fill_price_str is not None:
if isinstance(orders_list, list):
if isinstance(order, dict) and order.get('order_ref') == order_tag:

Right? If so, as I said, another example that doesn't contain that much complex logic - logic that is present irrespective of whether Pydantic is used or not - but focuses on output validation could be a better basis for discussion. Output validation sounds totally reasonable to me, but I think we need better examples to focus on - where both code samples demonstrate how it's done, not only the non-Pydantic one.


Maybe we can talk concretely already? How about you try introducing a simple PR with Pydantic model/schema applied to an endpoint with complex output (eg. historical market data and/or live orders, as both have loads of output fields, possibly even nested, and may require pre-flights). This way we could show: "this is all the validation code each user needs to write without Pydantic, and boom - here is how clean it looks if we do it for them" and have the benefits nicely laid out.
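
As a rough sketch of what such a PR might demonstrate for live orders, assuming Pydantic v2: the models below are hypothetical and only cover the keys that appear in the code above (`orders`, `orderId`, `status`, `order_ref`, `avgPrice`); the sample payload is made up, not a captured IBKR response.

```python
from typing import List, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class LiveOrderSketch(BaseModel):  # hypothetical model for one live order
    model_config = ConfigDict(extra="ignore")  # tolerate fields not modelled here

    order_id: int = Field(alias="orderId")
    status: str
    order_ref: Optional[str] = None
    avg_price: Optional[float] = Field(default=None, alias="avgPrice")


class LiveOrdersSketch(BaseModel):  # hypothetical wrapper for the 'orders' list
    orders: List[LiveOrderSketch] = Field(default_factory=list)


# Made-up payload for illustration only.
raw = {
    "orders": [
        {"orderId": 101, "status": "Submitted", "order_ref": "tag_1", "avgPrice": "18250.25"},
        {"orderId": 102, "status": "Filled", "order_ref": "tag_2", "unmodelled": True},
    ]
}

# One call replaces the isinstance / .get() chain.
live = LiveOrdersSketch.model_validate(raw)
open_ids = [o.order_id for o in live.orders if o.order_ref == "tag_1" and o.status != "Filled"]

# A malformed shape fails loudly instead of being skipped silently.
try:
    LiveOrdersSketch.model_validate({"orders": "not a list"})
    rejected = False
except ValidationError:
    rejected = True
```

The business logic (matching the tag, deciding what to cancel) stays with the user; only the shape and type checks move into the models.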

Having this in a PR would also be useful to test one common argument I've seen against Pydantic - speed cost. If it's negligible for IBind's use, then I'd love to just test it once and not worry about it again.
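
One way the speed cost could be measured is sketched below with the standard-library `timeit`, assuming Pydantic v2. The models, field names and synthetic 1000-bar payload are all placeholders rather than IBKR's historical-data format, and the timings will vary by machine, so no figures are quoted here.

```python
import timeit
from typing import List

from pydantic import BaseModel


class BarSketch(BaseModel):  # hypothetical bar; field names are placeholders
    high: float
    low: float
    close: float
    volume: float
    timestamp: int


class HistorySketch(BaseModel):  # hypothetical response wrapper
    symbol: str
    data: List[BarSketch]


# Synthetic payload with 1000 bars.
raw = {
    "symbol": "MNQ",
    "data": [
        {"high": 2.0, "low": 0.5, "close": 1.5, "volume": 100.0, "timestamp": i}
        for i in range(1000)
    ],
}


def access_raw() -> float:
    # Baseline: read straight from the dictionaries.
    return sum(bar["close"] for bar in raw["data"])


def validate_then_access() -> float:
    # Same work, but every bar is validated first.
    return sum(bar.close for bar in HistorySketch.model_validate(raw).data)


runs = 20
raw_seconds = timeit.timeit(access_raw, number=runs) / runs
validated_seconds = timeit.timeit(validate_then_access, number=runs) / runs
print(f"raw: {raw_seconds * 1e3:.3f} ms, validated: {validated_seconds * 1e3:.3f} ms per 1000 bars")
```

Comparing the two numbers against the latency of the HTTP request itself would show whether the overhead matters for IBind's use.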


Finally, I have a couple of questions:

  1. Can we make Pydantic optional without making too much of a mess? Could a user decide whether to receive raw data or validated with Pydantic?
  2. Can we use Pydantic only for some endpoints? Would that even make sense? I haven't used it, so just trying to think if this would be confusing or if there's a clear way to do this.

If the answers are more 'yes' than 'no', it would be easier to introduce Pydantic slowly and see what the reception is. Otherwise, it sounds like a major release that will need more thoroughness when introducing.
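
For what it's worth, one possible shape for an opt-in approach is sketched below. It is only an illustration, not IBind's API: `parse_response` and `OrderStatusSketch` are hypothetical names, the payload is made up, and the validated path assumes Pydantic v2.

```python
from typing import Any, Dict, Optional

try:  # Pydantic is treated as an optional extra in this sketch
    from pydantic import BaseModel
except ImportError:  # raw dictionaries still work without it
    BaseModel = None


def parse_response(raw: Dict[str, Any], model: Optional[type] = None) -> Any:
    """Hypothetical helper: return raw data unless the caller passes a model class."""
    if model is None:
        return raw  # current behaviour, nothing validated
    if BaseModel is None:
        raise RuntimeError("Install pydantic to request validated responses")
    return model.model_validate(raw)


raw = {"orderId": 101, "status": "Submitted"}  # made-up payload

# Default path: the caller gets exactly what the endpoint returned.
untouched = parse_response(raw)

if BaseModel is not None:

    class OrderStatusSketch(BaseModel):  # hypothetical model for a single endpoint
        orderId: int
        status: str

    # Opt-in path: only endpoints that have a model can be validated.
    validated = parse_response(raw, OrderStatusSketch)
```

Because the model is supplied per call, endpoints without a model would simply keep returning raw data, which would allow a gradual rollout.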


All in all, fantastic to have you suggest Pydantic and lay it out in so much detail 🙌 I hope we can find a clean way to understand it and introduce it

Voyz • May 17 '25 11:05

@weklund just wanted to bump this to see if we could discuss it further 👍

Voyz • Jun 23 '25 11:06