BAMT
BAMT 2.0.0 - new features, refactoring, architecture refresh
The current BAMT architecture has a number of disadvantages, some clunky code, and other limitations, so it was decided to do a full refactoring. This refresh will include not only refactored code and a new API but also new features (such as vectorized sampling and other operations, new algorithms for structure learning, new score functions, etc.). For now, here is a checklist of modules that should be implemented in the 2.0.0 architecture:
- [ ] core
  - [ ] Graph
    - [ ] DAG
  - [ ] Nodes
    - [ ] root nodes
    - [ ] child nodes
  - [ ] Node models
    - [ ] Prediction models
      - [ ] Classifier
      - [ ] Regressor
    - [x] Distribution models
- [ ] Parameter estimators module
  - [ ] MLE
- [ ] Models
  - [ ] PGM
    - [ ] BNs
      - [ ] Continuous BN
      - [ ] Discrete BN
      - [ ] Hybrid BN
      - [ ] Composite BN
Development of BAMT 2.0.0 takes place in the 2.0.0 branch of the repository. If you would like to implement a module or submodule, please reply to this issue, create a separate issue for it, and add that issue to the milestone and project.
Another goal of these changes is a scikit-learn-like interface, so that a typical pipeline would look like this:
import pandas as pd

# read data
data = pd.read_csv("data.csv")
# define optimizers and score functions
dag_score_function = DAGScoreFunction(**parameters)
dag_optimizer = DAG_optimizer(**parameters)
# get a structure, maybe in networkx format?
G = dag_optimizer.optimize(data, **parameters)
# define parameters estimator and BN
parameters_estimator = ParametersEstimator(**parameters)
bn = ContinuousBayesianNetwork(**parameters)
# fit the bn, passing the estimator instance created above
bn.fit(data, parameters_estimator, **parameters)
# draw 1000 samples from the fitted network
bn.sample(1000)
# predict with col1 and col2 removed from the input
bn.predict(data.drop(columns=["col1", "col2"]))
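To make the intended fit / sample / predict flow more concrete, here is a minimal, self-contained sketch of a toy discrete network with maximum likelihood estimation of its conditional probability tables. It is only an illustration under stated assumptions: `MLEEstimator`, `DiscreteBayesianNetwork`, their method signatures, and the `structure` dictionary format are hypothetical and are not the BAMT 2.0.0 API. The structure is given by hand rather than learned, and `predict` simply returns the most probable value of a node given its observed parents instead of running full inference.

```python
import random
from collections import Counter, defaultdict


class MLEEstimator:
    """Hypothetical estimator: builds conditional probability tables from counts."""

    def estimate(self, rows, node, parents):
        # Count values of `node` for every observed parent configuration.
        counts = defaultdict(Counter)
        for row in rows:
            key = tuple(row[p] for p in parents)
            counts[key][row[node]] += 1
        # Normalize counts to relative frequencies (the MLE for a categorical).
        return {
            key: {value: n / sum(c.values()) for value, n in c.items()}
            for key, c in counts.items()
        }


class DiscreteBayesianNetwork:
    """Toy discrete BN with a scikit-learn-style interface (illustrative only)."""

    def __init__(self, structure):
        # Assumed format: {node: [parents]}, with nodes in topological order.
        self.structure = structure
        self.cpts_ = {}

    def fit(self, rows, estimator):
        for node, parents in self.structure.items():
            self.cpts_[node] = estimator.estimate(rows, node, parents)
        return self

    def sample(self, n, seed=None):
        # Ancestral sampling: parents are always drawn before their children.
        # Parent configurations never seen in training raise KeyError here;
        # a real implementation would need smoothing or a fallback.
        rng = random.Random(seed)
        samples = []
        for _ in range(n):
            row = {}
            for node, parents in self.structure.items():
                dist = self.cpts_[node][tuple(row[p] for p in parents)]
                values = list(dist)
                weights = [dist[v] for v in values]
                row[node] = rng.choices(values, weights=weights)[0]
            samples.append(row)
        return samples

    def predict(self, rows, target):
        # Most probable value of `target` given its parents in each row.
        parents = self.structure[target]
        predictions = []
        for row in rows:
            dist = self.cpts_[target][tuple(row[p] for p in parents)]
            predictions.append(max(dist, key=dist.get))
        return predictions


rows = [
    {"rain": "yes", "wet": "yes"},
    {"rain": "yes", "wet": "yes"},
    {"rain": "yes", "wet": "no"},
    {"rain": "no", "wet": "no"},
    {"rain": "no", "wet": "no"},
    {"rain": "no", "wet": "no"},
]
structure = {"rain": [], "wet": ["rain"]}

bn = DiscreteBayesianNetwork(structure).fit(rows, MLEEstimator())
samples = bn.sample(5, seed=0)
predictions = bn.predict([{"rain": "yes"}, {"rain": "no"}], target="wet")
print(predictions)  # ['yes', 'no']
```

Passing the estimator to `fit` as an object keeps parameter learning separate from the network class, which mirrors the split between the "Parameter estimators module" and "Models" items in the checklist.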