OpenBB
Added forecast extension
Description
Hello Everyone,
I hope this message finds you well. Thank you for creating and maintaining such a fantastic package. I am excited to submit this pull request for the forecast extension using SDK-v4.
In this pull request, I have included the following models for forecasting:
Regression Models:
- Linear Regression
- Exponential Smoothing
Statistical Models:
- Quantile Anomaly Detection
- AutoARIMA
- AutoCES
- AutoETS
- MSTL
Torch Models:
- BRNN (Block Recurrent Neural Network)
I have followed the conventions set by the technical analysis (ta) extension to ensure consistency and maintainability. This approach aligns with the existing standard data model and REST API.
How has this been tested?
- I have followed the same test structure as the technical analysis (ta) extension.
- All the tests (API and Python) are included in the integration folder.
- [x] Make sure affected commands still run in terminal
- [x] Ensure the SDK still works
- [x] Check any related reports
Checklist:
- [x] I have adhered to the GitFlow naming convention, and my branch name is in the format of `feature/feature-name` or `hotfix/hotfix-name`.
- [x] Update our documentation following these guidelines. Update any user guides that are affected by the changes.
- [ ] Update our tests following these guidelines.
- [x] Make sure you are following our CONTRIBUTING guidelines.
- [x] If a feature was added, make sure to add it to the corresponding integration test script.
Others
- [x] I have performed a self-review of my own code.
- [x] I have commented on my code, particularly in hard-to-understand areas.
⚠️ Disclaimer: This is a rewrite of the original openbb_terminal feature in v4 SDK. Credit goes to its original authors.
This is going to need some thought on dependencies. I cannot install this extension the same way as others due to all the stuff pulled in by darts.
Thanks a lot for taking the time to submit this!
I was able to install it, though lightgbm took forever to install. GCC as a requirement does complicate things for a simple install, and making all the packages play together nicely across system configurations will be a challenge. IME, SKLearn can introduce different versions of packages that might be Intel-optimized. This can cause conflicts in environments with ARM bindings.
I do get output though. I ran the linear_regression model on AAPL, and I got a couple of warnings. First one seems fairly straightforward, but I'm not familiar enough with Scikit-Learn to know the implications of the second.
pydantic/_internal/_fields.py:128: UserWarning: Field "model_name" has conflict with protected namespace "model_".
Warning_(category='UserWarning', message='`sklearn.utils.parallel.delayed` should be used with `sklearn.utils.parallel.Parallel` to make it possible to propagate the scikit-learn configuration of the current thread to the joblib workers.')
For the output, what I'm noticing is that the index information is lost: there are no corresponding date values to reconstruct the time series. I am looking for a way to reconcile the differences in length between the input series and the output of ticker_series, which has 200+ more rows. The first and last values are the same, so are the extra ones representing weekends/holidays?
In [71]: df.results.ticker_series[-1]
Out[71]: Data(close=184.8000030518)
In [72]: data.iloc[-1]["close"]
Out[72]: 184.8
In [73]: data.iloc[0]["close"]
Out[73]: 77.62
In [74]: df.results.ticker_series[0]
Out[74]: Data(close=77.6200027466)
@jmaslek and @deeleeramone thank you for your input!
I understand that the installation process for this extension is posing challenges due to the dependencies. Given that, what is the best course of action? Any input is highly appreciated. I can add more instructions to the README.md file.
@deeleeramone, from the Darts documentation:
TimeSeries are guaranteed to:
1. Have a monotonically increasing time index, without holes (without missing dates)
2. ....
So, when we convert the Pandas dataframe to a TimeSeries object, we are filling the missing dates.
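To illustrate why the output ends up longer than the input, here is a minimal sketch that uses only the standard library rather than Darts itself. The dates, prices, and the helper name `fill_calendar_gaps` are hypothetical, and forward-filling is only an assumption about how the gaps get filled.

```python
from datetime import date, timedelta


def fill_calendar_gaps(series):
    """Forward-fill a {date: value} mapping onto every calendar day in its range."""
    days = sorted(series)
    filled = {}
    current = days[0]
    last_value = series[current]
    while current <= days[-1]:
        # Use the observed value when one exists, otherwise repeat the last one.
        last_value = series.get(current, last_value)
        filled[current] = last_value
        current += timedelta(days=1)
    return filled


# Hypothetical weekday-only closes from Mon 2023-10-02 through Fri 2023-10-13.
start = date(2023, 10, 2)
calendar_days = [start + timedelta(days=i) for i in range(12)]
trading_days = [d for d in calendar_days if d.weekday() < 5]
closes = {d: 100.0 + i for i, d in enumerate(trading_days)}

filled = fill_calendar_gaps(closes)
print(len(closes), len(filled))  # weekday-only rows vs. gap-filled rows
```

The first and last values are unchanged, and the extra rows are the weekend days carrying the previous close. Keeping the dates as keys also makes it possible to match the filled rows back to the original ones.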
I installed the extension in a clean conda env (using dev_install.py) with the updated .toml file without hiccups. Any feedback or verification on this is highly appreciated. Thank you!
[tool.poetry]
name = "openbb-forecast"
version = "0.1.0a4"
description = "Forecast Model for OpenBB"
authors = ["OpenBB Team <[email protected]>"]
readme = "README.md"
packages = [{ include = "openbb_forecast" }]
[tool.poetry.dependencies]
python = ">=3.8,<3.12"
scipy = "^1.10.1"
statsmodels = "^0.14.0"
scikit-learn = "^1.3.1"
u8darts = { extras = ["torch"], version = "^0.23.0" }
tensorboard = "^2.2.0"
openbb-core = "^1.0.0b0"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.poetry.plugins."openbb_core_extension"]
forecast = "openbb_forecast.forecast_router:router"
> So, when we are converting the Pandas dataframe to Timeseries object, we are filling the missing dates.
For Crypto assets, 24/7 is a safe assumption. But, not for other financial assets. FX trades 24/5 and stocks much less. How can we narrow the focus to market days? With a Mon-Fri schedule (mostly...), introducing weekends by copying the price values brings in unwanted, and incorrect, pieces of data. This will be especially true for intraday data, where market hours are only 6.5 hours out of the day.
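One possible direction, sketched here under stated assumptions, is to build the forecast index from market days only instead of from a continuous calendar. The helper name `next_market_days` and the holiday set are hypothetical. A real implementation would need an exchange calendar, and intraday data would also need session hours, which this sketch does not cover.

```python
from datetime import date, timedelta


def next_market_days(last_observed, horizon, holidays=frozenset()):
    """Return the next `horizon` dates after `last_observed`, skipping
    Saturdays, Sundays, and any dates listed in `holidays`."""
    days = []
    current = last_observed
    while len(days) < horizon:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            days.append(current)
    return days


# Hypothetical example: last close on Friday 2023-12-22, with 2023-12-25 as a holiday.
forecast_index = next_market_days(
    date(2023, 12, 22), 3, holidays={date(2023, 12, 25)}
)
print(forecast_index)
```

With an index like this, forecast values can be labeled with dates on which the market actually trades, so no weekend rows have to be copied in.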
Currently, the same behavior is present in the openbb_terminal forecast module.
> Currently, the same behavior is present in openbb_terminal forecast module.
Yes, but that was applied as a band-aid, so it wasn't intended as a permanent solution.
> I installed the extension in a clean conda env (using dev_install.py) with the updated .toml file without hiccups. Any feedback or verification on this is highly appreciated. Thank you!
You will already have a C++ compiler installed; however, a lot of people do not and this can be a considerable pain point for many.
One item to bear in mind is that the pip install is not "smart" - it will not manage your environment for you and it can easily build the wrong wheels when things are installed in the "wrong" order. This is one of the major challenges we have with the current incarnation.
> Currently, the same behavior is present in openbb_terminal forecast module.
> Yes, but that was applied as a band-aid, so it wasn't intended as a permanent solution.
Ah understood! I will try to address this.
> One item to bear in mind is that the `pip install` is not "smart" - it will not manage your environment for you and it can easily build the wrong wheels when things are installed in the "wrong" order. This is one of the major challenges we have with the current incarnation.
Yeah, I also ran into the wrong-order problem a few times.
Just an update here. This PR is blocked by our work on the package builder that should make extensions with custom models integrate nicely. Until then this is on hold. @HemuManju can I kindly ask you to update the base branch of this PR to develop and open your fork to contributions, please?
Thanks for the update! I have changed the base branch of this PR to develop.
Edit: My git branch is messed up. So I will create a new PR once the package builder is updated.
Hey @HemuManju, any plans to revive this work? Let me know and I can help out. Also, @mmistroni was interested in it and suggested he could help out.
P.S. It would probably be better to create the extension in a separate repo.
@piiq, I can reopen this PR. I need to update the code based on the changes made to the OpenBB platform.
Thanks @HemuManju 🙏, I'll have a look and create a PR into your fork with some suggestions
Thank you @piiq 🙏
Hey @HemuManju I've taken a look and fired it up. It actually launches with no changes to the code both on the python and on the cli side. That's good news. I haven't actually managed to produce any outputs, but I haven't looked beyond dependency management yet.
Here are some thoughts that should help us move forward:
- There are some UX bumps when installing the extension on macOS, related to lightgbm still not having Apple silicon wheels. Lightgbm can be installed from conda to make it work, but installing this extension would not be a simple `pip install` for Mac users.
- To me it would make more sense if the extension lives in a distinct repo and not in the main openbb repo, both because of this dependency issue and to highlight that extensions can be created and distributed independently from the source code of the platform. I am fine with this repo being owned and maintained by either you or openbb; let me know your preference.
Additional notes on things I've spotted and what we can look at after we smooth out the dependency/installation flow:
- We would need to invest a bit more into better exception handling (at least on the cli) AND to make this work with charts. For both of these topics we would require @hjoaquim to chip in.
- The second part (charts) would require extra effort from our side because the current implementation of the charting extension is not itself extendable, and we're still working out a good way to allow extensions to have their own charts.
Hello @piiq, since this is a rewrite of the original openbb_terminal feature (credit goes to its original authors), the repo can live in openbb, and I can maintain it.
Hey @HemuManju
I've created a new repo and moved the code from this PR there. Here it is https://github.com/OpenBB-finance/openbb-forecast/pull/1
I've updated some dependency settings and added instructions on how to set the extension up on macOS.
Would you like to connect in a call for me to onboard you?
Hello @piiq, thank you for the update. Shall I set up a Zoom meeting?