Add Daily & Hour with extended market hours
Expected Behavior
`self.history[TradeBar](self._symbol, 30, Resolution.DAILY, extended_market_hours=False)` should return only regular market hours data.
Actual Behavior
It instead returns the same data as `extended_market_hours=True`.
While the following test is configured for IB data, it can also be run in the cloud, where you can compare the Actual values. They are identical for extended_market_hours True and False, whereas I would expect them to differ, e.g.:
extended_market_hours = False
********** TEST FAILED: 2024-08-06 (MH) **********
Expected: NTES: O: 86.44 H: 87.44 L: 86.01 C: 86.65 V: 1126405
 Actual : NTES: O: 83.96141 H: 85.04083 L: 83.6405 C: 84.28232 V: 1733373
extended_market_hours = True
********** TEST FAILED: 2024-08-06 (EMH) **********
Expected: NTES: O: 86.5 H: 87.44 L: 86 C: 86.67 V: 1342268
 Actual : NTES: O: 83.96141 H: 85.04083 L: 83.6405 C: 84.28232 V: 1733373
Potential Solution
Reproducing the Problem
from datetime import date

from AlgorithmImports import *

"""
Test History extended_market_hours.
Result: `extended_market_hours = False` fails and instead returns the data for `extended_market_hours = True`.
Test date for data was 2024-08-05 and 2024-08-06 for NTES from IB.
"""


class BUG_HISTORY_EMH(QCAlgorithm):
    def initialize(self):
        self.set_start_date(2024, 9, 1)
        self.set_end_date(2024, 10, 1)
        self.set_cash(100000)

        # Run this test with extended_market_hours set to True or False
        # Change the value of extended_market_hours to test both scenarios
        # self.extended_market_hours = True
        self.extended_market_hours = False

        # below can be run in HOUR or DAILY resolution, the result is the same
        self._symbol = self.add_equity(
            "NTES", Resolution.HOUR, extended_market_hours=self.extended_market_hours
        ).symbol

        # Test data for NTES run locally with IB data. Replace with QuantConnect data if needed.
        # Contains data for EMH and non-EMH (MH)
        self.test_data = {
            "2024-08-05": {
                "MH": TradeBar(date(2024, 8, 5), self._symbol, 86.5, 88.5, 86.5, 87.91, 1483903),
                "EMH": TradeBar(date(2024, 8, 5), self._symbol, 89.0, 89.0, 86.5, 88.8, 1539853),
            },
            "2024-08-06": {
                "MH": TradeBar(date(2024, 8, 6), self._symbol, 86.44, 87.44, 86.01, 86.65, 1126405),
                "EMH": TradeBar(date(2024, 8, 6), self._symbol, 86.5, 87.44, 86.0, 86.67, 1342268),
            },
        }
        self.tests_executed = []

        # Manual Warm up
        self.count = 1
        history = list(
            self.history[TradeBar](
                self._symbol,
                30,
                Resolution.DAILY,
                extended_market_hours=self.extended_market_hours,
            )
        )
        for bar in history:
            self.log(f"{bar.end_time}: count:{self.count} History Bar: {bar}")
            self.count += 1
            self.assert_data(bar)

    def on_data(self, data: Slice):
        pass

    def assert_data(self, bar: TradeBar) -> None:
        """
        Assert that the bar data matches the expected values stored in self.test_data
        """
        # Get the expected data for the current date
        date_str = bar.end_time.strftime("%Y-%m-%d")
        if date_str not in self.test_data:
            return  # No test data for this date
        expected_data = self.test_data[date_str]

        # Check if the bar is for extended market hours or not
        if self.extended_market_hours:
            key = "EMH"
        else:
            key = "MH"
        if key not in expected_data:
            raise ValueError(f"No expected data for {key} on date {date_str}")
        expected_bar: TradeBar = expected_data[key]

        # Do a soft assert of the expected values
        if not all(
            [
                bar.open == expected_bar.open,
                bar.high == expected_bar.high,
                bar.low == expected_bar.low,
                bar.close == expected_bar.close,
                bar.volume == expected_bar.volume,
            ]
        ):
            self.log(
                f"\n{'*' * 10} TEST FAILED: {date_str} ({key}) {'*' * 10}"
                f"\nExpected: {expected_bar}"
                f"\n Actual : {bar}"
            )
            self.tests_executed += [False]
            return
        self.log(
            f"\n{'*' * 10} TEST PASSED: {date_str} ({key}) {'*' * 10}"
            f"\nData matches: {bar.open}, {bar.high}, {bar.low}, {bar.close}, {bar.volume}"
        )
        self.tests_executed += [True]

    def on_end_of_algorithm(self):
        self.log(f"{'=' * 25}")
        self.log("Test Configuration:")
        self.log(f"Extended Market Hours: {self.extended_market_hours}")
        self.log(f"{'=' * 25}")
        self.log("Test Results:")
        if len(self.tests_executed) == 2:
            self.log("✅ 2 Tests Executed")
        else:
            self.log(f"❌ Expected 2 tests but only executed: {len(self.tests_executed)}")
        if all(self.tests_executed):
            self.log("✅ All tests passed!")
        else:
            self.log(f"❌ Tests Failed: {len(self.tests_executed) - sum(self.tests_executed)}")
        self.log(f"{'=' * 25}")
System Information
Windows 10, lean 1.0.218 locally; also ran in the cloud
Checklist
- [x] I have completely filled out this template
- [x] I have confirmed that this issue exists on the current `master` branch
- [x] I have confirmed that this is not a duplicate issue by searching issues
- [x] I have provided detailed steps to reproduce the issue
Hey @alexschwantes! Our daily and hourly data bars are already created on disk and do not contain extended market hours data at the moment. As a workaround, if you want daily/hour bars with extended market hours, the algorithm would need to consolidate them manually.
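To illustrate what "consolidate them manually" amounts to, here is a minimal, Lean-independent sketch of a streaming daily consolidator. The names `Bar` and `DailyConsolidator` are hypothetical and are not part of the Lean API; in a real algorithm you would presumably subscribe at a finer resolution with `extended_market_hours=True` and feed those bars to Lean's own consolidators instead.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass
class Bar:
    """Hypothetical OHLCV bar (not a Lean type)."""
    time: datetime
    open: float
    high: float
    low: float
    close: float
    volume: int


class DailyConsolidator:
    """Hypothetical consolidator: merges intraday bars into one bar per calendar day."""

    def __init__(self, on_daily_bar: Callable[[Bar], None]) -> None:
        self._on_daily_bar = on_daily_bar
        self._working: Optional[Bar] = None

    def update(self, bar: Bar) -> None:
        # A bar from a new calendar day closes the bar currently being built.
        if self._working is not None and bar.time.date() != self._working.time.date():
            self.flush()
        if self._working is None:
            # Start a new daily bar stamped at midnight of the bar's date.
            day_start = datetime.combine(bar.time.date(), datetime.min.time())
            self._working = Bar(day_start, bar.open, bar.high, bar.low, bar.close, bar.volume)
        else:
            working = self._working
            working.high = max(working.high, bar.high)
            working.low = min(working.low, bar.low)
            working.close = bar.close
            working.volume += bar.volume

    def flush(self) -> None:
        # Emit the bar in progress, if any (call once after the last input bar).
        if self._working is not None:
            self._on_daily_bar(self._working)
            self._working = None
```

Whether the resulting daily bar includes extended hours then depends only on which intraday bars are passed to `update`.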
Thanks @Martin-Molinero
For my tests, I imported data to the local data directory from Interactive Brokers using the Lean CLI. Unfortunately I don't have access to QuantConnect data, so I can't compare what is stored on your end.
From the imported data, I can see that the hourly bars do contain extended market hours data, which is how I derived the expected results in the test above, i.e.:
| time | o | h | l | c | v |
|---|---|---|---|---|---|
| 20240806 06:00 | 865000 | 865000 | 865000 | 865000 | 250 |
| 20240806 07:00 | 863100 | 863700 | 861800 | 863700 | 600 |
| 20240806 08:00 | 863800 | 863800 | 860000 | 863000 | 1400 |
| 20240806 09:00 | 864400 | 868200 | 860100 | 863300 | 180719 |
| 20240806 10:00 | 863100 | 874400 | 862500 | 869200 | 198497 |
| 20240806 11:00 | 869100 | 872200 | 866200 | 867800 | 197586 |
| 20240806 12:00 | 867900 | 868900 | 866500 | 868500 | 75491 |
| 20240806 13:00 | 868300 | 870400 | 868100 | 869500 | 75879 |
| 20240806 14:00 | 869800 | 871000 | 868700 | 868800 | 121381 |
| 20240806 15:00 | 868500 | 869200 | 865000 | 866500 | 276852 |
| 20240806 16:00 | 866700 | 866700 | 866700 | 866700 | 213613 |
| 20240806 17:00 | 866700 | 866700 | 866700 | 866700 | 0 |
| 20240806 18:00 | 866700 | 866700 | 866700 | 866700 | 0 |
| 20240806 19:00 | 866700 | 866700 | 866700 | 866700 | 0 |
Looking at the Daily data file I can see that the daily data does reflect the extended market hours.
| time | o | h | l | c | v |
|---|---|---|---|---|---|
| 20240805 00:00 | 890000 | 890000 | 865000 | 888000 | 1539853 |
| 20240806 00:00 | 865000 | 874400 | 860000 | 866700 | 1342268 |
So, at least with my local data, a history request with Resolution.DAILY always returns daily bars that include extended market hours, because that is what is in my local daily data file from IB. Since only one bar per day is stored in the daily file, I can see that, from a technical point of view, extended_market_hours has no effect on history requests at DAILY resolution, but it would at HOUR (or finer) resolution.
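As a sanity check on these numbers, here is a minimal sketch that aggregates the hourly rows for 2024-08-06 from the table above into a single daily bar, once over all hours and once over the regular session only. The helper `aggregate` is hypothetical, and treating the bars stamped 09:00 through 15:00 as the regular session is an assumption about how the IB hourly bars are timestamped.

```python
from datetime import datetime

PRICE_SCALE = 10_000  # on-disk prices are scaled: 865000 -> 86.50

# Hourly rows for 2024-08-06, copied from the table above: (time, o, h, l, c, v).
HOURLY_ROWS = [
    ("20240806 06:00", 865000, 865000, 865000, 865000, 250),
    ("20240806 07:00", 863100, 863700, 861800, 863700, 600),
    ("20240806 08:00", 863800, 863800, 860000, 863000, 1400),
    ("20240806 09:00", 864400, 868200, 860100, 863300, 180719),
    ("20240806 10:00", 863100, 874400, 862500, 869200, 198497),
    ("20240806 11:00", 869100, 872200, 866200, 867800, 197586),
    ("20240806 12:00", 867900, 868900, 866500, 868500, 75491),
    ("20240806 13:00", 868300, 870400, 868100, 869500, 75879),
    ("20240806 14:00", 869800, 871000, 868700, 868800, 121381),
    ("20240806 15:00", 868500, 869200, 865000, 866500, 276852),
    ("20240806 16:00", 866700, 866700, 866700, 866700, 213613),
    ("20240806 17:00", 866700, 866700, 866700, 866700, 0),
    ("20240806 18:00", 866700, 866700, 866700, 866700, 0),
    ("20240806 19:00", 866700, 866700, 866700, 866700, 0),
]


def aggregate(rows, first_hour=0, last_hour=23):
    """Merge hourly rows whose start hour lies in [first_hour, last_hour] into one OHLCV tuple."""
    selected = [
        row for row in rows
        if first_hour <= datetime.strptime(row[0], "%Y%m%d %H:%M").hour <= last_hour
    ]
    if not selected:
        raise ValueError("no rows in the requested hour range")
    return (
        selected[0][1] / PRICE_SCALE,                   # open of the first bar
        max(row[2] for row in selected) / PRICE_SCALE,  # highest high
        min(row[3] for row in selected) / PRICE_SCALE,  # lowest low
        selected[-1][4] / PRICE_SCALE,                  # close of the last bar
        sum(row[5] for row in selected),                # total volume
    )


# Assumption: the bars stamped 09:00 through 15:00 make up the regular session.
regular = aggregate(HOURLY_ROWS, first_hour=9, last_hour=15)
extended = aggregate(HOURLY_ROWS)

print(regular)   # (86.44, 87.44, 86.01, 86.65, 1126405)
print(extended)  # (86.5, 87.44, 86.0, 86.67, 1342268)
```

Under that assumption, the two tuples match the MH and EMH values expected for 2024-08-06 in the reproduction script, and the extended tuple matches the 2024-08-06 row of the daily file.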
Q: Is this behaviour then different in a live cloud environment, i.e. does DAILY resolution there exclude extended market hours? Does the choice of data provider, such as QuantConnect or IB, affect this in live? (If so, I would need to do as you suggest and use a consolidator on HOUR data to obtain consistent results between local and live environments.)
Hey @alexschwantes! You found a bug 👍. I've adjusted the behavior at https://github.com/QuantConnect/Lean/pull/8906: daily and hour resolutions downloaded through brokerages shouldn't include extended market hours, so that they match Lean. Thanks for the report! Just pulling the latest Lean image will make the fix available to you.
I suspect this change may cause a performance regression: please run any minute-resolution option backtest and you will see a 3x slowdown. See https://github.com/QuantConnect/Lean/issues/8921 for more details.