
Add Daily & Hour with extended market hours

Open alexschwantes opened this issue 5 months ago • 4 comments

Expected Behavior

self.history[TradeBar](self._symbol, 30, Resolution.DAILY, extended_market_hours=False) should return data only within market hours

Actual Behavior

It instead returns extended_market_hours=True data

Although the test below is configured for IB data, it can also be run in the cloud, where you can compare the Actual values. They are identical for extended_market_hours True and False, whereas I would expect them to differ, i.e.:

extended_market_hours = False
********** TEST FAILED: 2024-08-06 (MH) **********
Expected: NTES: O: 86.44 H: 87.44 L: 86.01 C: 86.65 V: 1126405
 Actual : NTES: O: 83.96141 H: 85.04083 L: 83.6405 C: 84.28232 V: 1733373

extended_market_hours = True
********** TEST FAILED: 2024-08-06 (EMH) **********
Expected: NTES: O: 86.5 H: 87.44 L: 86 C: 86.67 V: 1342268
 Actual : NTES: O: 83.96141 H: 85.04083 L: 83.6405 C: 84.28232 V: 1733373

Potential Solution

Reproducing the Problem

from datetime import date
from AlgorithmImports import *

"""
Test History extended_market_hours.
Result: `extended_market_hours = False` fails and instead returns the data for `extended_market_hours = True`.
Test date for data was 2024-08-05 and 2024-08-06 for NTES from IB.
"""


class BUG_HISTORY_EMH(QCAlgorithm):
    def initialize(self):
        self.set_start_date(2024, 9, 1)
        self.set_end_date(2024, 10, 1)
        self.set_cash(100000)

        # Run this test with extended_market_hours set to True or False
        # Change the value of extended_market_hours to test both scenarios
        # self.extended_market_hours = True
        self.extended_market_hours = False

        # below can be run in HOUR or DAILY resolution, the result is the same
        self._symbol = self.add_equity(
            "NTES", Resolution.HOUR, extended_market_hours=self.extended_market_hours
        ).symbol

        # Test data for NTES run locally with IB data. Replace with QuantConnect data if needed.
        # Contains data for EMH and non-EMH (MH)
        self.test_data = {
            "2024-08-05": {
                "MH": TradeBar(date(2024, 8, 5), self._symbol, 86.5, 88.5, 86.5, 87.91, 1483903),
                "EMH": TradeBar(date(2024, 8, 5), self._symbol, 89.0, 89.0, 86.5, 88.8, 1539853),
            },
            "2024-08-06": {
                "MH": TradeBar(date(2024, 8, 6), self._symbol, 86.44, 87.44, 86.01, 86.65, 1126405),
                "EMH": TradeBar(date(2024, 8, 6), self._symbol, 86.5, 87.44, 86.0, 86.67, 1342268),
            },
        }
        self.tests_executed = []

        # Manual Warm up
        self.count = 1
        history = list(
            self.history[TradeBar](
                self._symbol,
                30,
                Resolution.DAILY,
                extended_market_hours=self.extended_market_hours,
            )
        )
        for bar in history:
            self.log(f"{bar.end_time}: count:{self.count} History Bar: {bar}")
            self.count += 1
            self.assert_data(bar)

    def on_data(self, data: Slice):
        pass

    def assert_data(self, bar: TradeBar) -> None:
        """
        Assert that the bar data matches the expected values stored in self.test_data
        """

        # Get the expected data for the current date
        date_str = bar.end_time.strftime("%Y-%m-%d")
        if date_str not in self.test_data:
            return  # No test data for this date

        expected_data = self.test_data[date_str]

        # Check if the bar is for extended market hours or not
        if self.extended_market_hours:
            key = "EMH"
        else:
            key = "MH"

        if key not in expected_data:
            raise ValueError(f"No expected data for {key} on date {date_str}")

        expected_bar: TradeBar = expected_data[key]

        # Do a soft assert of the expected values
        if not all(
            [
                bar.open == expected_bar.open,
                bar.high == expected_bar.high,
                bar.low == expected_bar.low,
                bar.close == expected_bar.close,
                bar.volume == expected_bar.volume,
            ]
        ):
            self.log(
                f"\n{'*' * 10} TEST FAILED: {date_str} ({key}) {'*' * 10}"
                f"\nExpected: {expected_bar}"
                f"\n Actual : {bar}"
            )
            self.tests_executed += [False]
            return

        self.log(
            f"\n{'*' * 10} TEST PASSED: {date_str} ({key}) {'*' * 10}"
            f"\nData matches: {bar.open}, {bar.high}, {bar.low}, {bar.close}, {bar.volume}"
        )
        self.tests_executed += [True]

    def on_end_of_algorithm(self):
        self.log(f"{'=' * 25}")
        self.log("Test Configuration:")
        self.log(f"Extended Market Hours: {self.extended_market_hours}")
        self.log(f"{'=' * 25}")
        self.log("Test Results:")

        if len(self.tests_executed) == 2:
            self.log("✅ 2 Tests Executed")
        else:
            self.log(f"❌ Expected 2 tests but only executed: {len(self.tests_executed)}")

        if all(self.tests_executed):
            self.log("✅ All tests passed!")
        else:
            self.log(f"❌ Tests Failed: {len(self.tests_executed) - sum(self.tests_executed)}")

        self.log(f"{'=' * 25}")

System Information

Windows 10, lean 1.0.218 locally; also ran in the cloud

Checklist

  • [x] I have completely filled out this template
  • [x] I have confirmed that this issue exists on the current master branch
  • [x] I have confirmed that this is not a duplicate issue by searching issues
  • [x] I have provided detailed steps to reproduce the issue

alexschwantes, Jul 30 '25 03:07

Hey @alexschwantes! Our daily and hourly data bars are already created on disk and do not contain extended market hours data at the moment. As a workaround, if you want daily or hourly bars with extended market hours, the algorithm would need to consolidate them manually.

Martin-Molinero, Jul 30 '25 13:07

Thanks @Martin-Molinero

For my tests, I imported data into the local data directory from Interactive Brokers using the Lean CLI. Unfortunately I don't have access to QuantConnect data, so I can't compare against what is stored on your end.

From the imported data, I can see that the hourly bars do contain extended market hours data, which is where I got the expected results in the test above, i.e.:

date     time  o      h      l      c      v
20240806 06:00 865000 865000 865000 865000 250
20240806 07:00 863100 863700 861800 863700 600
20240806 08:00 863800 863800 860000 863000 1400
20240806 09:00 864400 868200 860100 863300 180719
20240806 10:00 863100 874400 862500 869200 198497
20240806 11:00 869100 872200 866200 867800 197586
20240806 12:00 867900 868900 866500 868500 75491
20240806 13:00 868300 870400 868100 869500 75879
20240806 14:00 869800 871000 868700 868800 121381
20240806 15:00 868500 869200 865000 866500 276852
20240806 16:00 866700 866700 866700 866700 213613
20240806 17:00 866700 866700 866700 866700 0
20240806 18:00 866700 866700 866700 866700 0
20240806 19:00 866700 866700 866700 866700 0

Looking at the Daily data file I can see that the daily data does reflect the extended market hours.

date     time  o      h      l      c      v
20240805 00:00 890000 890000 865000 888000 1539853
20240806 00:00 865000 874400 860000 866700 1342268

So, at least with my local data, a history request with Resolution.DAILY always returns daily bars that include extended market hours, because that is what my local daily data file from IB contains. Since only one bar per day is stored in the daily data, I can see that, from a technical point of view, extended_market_hours has no effect on history requests at DAILY resolution, though it would at HOUR (or finer) resolution.
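
As a cross-check of the two listings above, here is a minimal plain-Python sketch (not Lean code) that rebuilds the 2024-08-06 daily bar from the hourly rows. The `Bar` class and the `aggregate` and `regular_session` helpers are hypothetical names used only for illustration, and the 09:00 to 15:00 window for regular hours is an assumption about how these bars are stamped. Prices are kept exactly as stored in the file, which appears to be the price multiplied by 10,000.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    hour: int    # hour of the bar's timestamp, as listed in the hourly file
    open: int    # prices as stored in the file (apparently price * 10,000)
    high: int
    low: int
    close: int
    volume: int


# Hourly rows for 2024-08-06, copied from the listing above.
HOURLY = [
    Bar(6, 865000, 865000, 865000, 865000, 250),
    Bar(7, 863100, 863700, 861800, 863700, 600),
    Bar(8, 863800, 863800, 860000, 863000, 1400),
    Bar(9, 864400, 868200, 860100, 863300, 180719),
    Bar(10, 863100, 874400, 862500, 869200, 198497),
    Bar(11, 869100, 872200, 866200, 867800, 197586),
    Bar(12, 867900, 868900, 866500, 868500, 75491),
    Bar(13, 868300, 870400, 868100, 869500, 75879),
    Bar(14, 869800, 871000, 868700, 868800, 121381),
    Bar(15, 868500, 869200, 865000, 866500, 276852),
    Bar(16, 866700, 866700, 866700, 866700, 213613),
    Bar(17, 866700, 866700, 866700, 866700, 0),
    Bar(18, 866700, 866700, 866700, 866700, 0),
    Bar(19, 866700, 866700, 866700, 866700, 0),
]


def aggregate(bars):
    """Collapse a time-ordered list of bars into one (o, h, l, c, v) tuple."""
    return (
        bars[0].open,                  # open of the first bar
        max(b.high for b in bars),     # highest high
        min(b.low for b in bars),      # lowest low
        bars[-1].close,                # close of the last bar
        sum(b.volume for b in bars),   # total volume
    )


def regular_session(bars, first_hour=9, last_hour=15):
    """Assumed filter: keep only bars stamped 09:00 through 15:00."""
    return [b for b in bars if first_hour <= b.hour <= last_hour]


emh_daily = aggregate(HOURLY)                   # all hours, extended included
mh_daily = aggregate(regular_session(HOURLY))   # assumed regular session only

print("EMH:", emh_daily)
print("MH: ", mh_daily)
```

With every row included, the result equals the 20240806 row of the daily file. Restricted to the assumed regular session, it equals the MH values expected in the test (86.44, 87.44, 86.01, 86.65, volume 1126405), which suggests the local daily file was built from all hours rather than from market hours only.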

Q: Is this behaviour different in a live cloud environment, i.e. does DAILY resolution there exclude extended market hours? Does the choice of data provider, such as QuantConnect or IB, affect this in live trading? (If so, I would need to do as you suggest and use a consolidator on HOUR data to get consistent results between local and live environments.)

alexschwantes, Jul 31 '25 01:07

Hey @alexschwantes! You found a bug 👍. I've adjusted the behavior in https://github.com/QuantConnect/Lean/pull/8906: daily and hour resolution data downloaded through brokerages shouldn't include extended market hours, so that it matches Lean. Thanks for the report! Pulling the latest Lean image will make the fix available to you.

Martin-Molinero, Jul 31 '25 23:07

I suspect this change may cause a performance regression: run any minute-resolution option backtest and you will see a 3x slowdown. More details at https://github.com/QuantConnect/Lean/issues/8921

varistar, Aug 09 '25 07:08