BAMT 2.0.0 - new features, refactoring, architecture refreshment

Open jrzkaminski opened this issue 10 months ago • 1 comments

The current BAMT architecture has a number of disadvantages, including clunky code and other limitations, so it was decided to do a full refactoring. The refresh will include not only refactored code and a new API but also new features (such as vectorized sampling and other operations, new algorithms for structure learning, and new score functions). For now, here is a checklist of modules that should be implemented in the 2.0.0 architecture:

- [ ] core
  - [ ] Graph
    - [ ] DAG
  - [ ] Nodes
    - [ ] root nodes
      - [ ] discrete root node
      - [ ] continuous root node
    - [ ] child nodes
      - [ ] conditional continuous node
      - [ ] conditional discrete node
  - [ ] Node models
    - [ ] Prediction models
      - [ ] Classifier
      - [ ] Regressor
    - [x] Distribution models
      - [x] Empirical distribution model
      - [x] Continuous distribution model
- [ ] DAG-optimizers module
- [ ] Score-functions module
  - [ ] K2
  - [ ] MI
  - [ ] BIC/AIC
- [ ] Parameter estimators module
  - [ ] MLE
- [ ] Models
  - [ ] PGM
    - [ ] BNs
      - [ ] Continuous BN
      - [ ] Discrete BN
      - [ ] Hybrid BN
      - [ ] Composite BN
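
To make the score-functions item more concrete, here is a minimal sketch of how a K2 score for a single discrete node could be computed. It uses only the standard library. The function name `k2_log_score`, its signature, and the list-of-dicts data format are assumptions made for illustration; they are not part of the existing or planned BAMT API.

```python
import math
from collections import Counter, defaultdict


def k2_log_score(rows, child, parents=()):
    """Log K2 score of one discrete node given its parents (higher is better).

    `rows` is a list of dicts mapping column name -> discrete value.
    Hypothetical helper for illustration only; not part of the BAMT API.
    """
    r = len({row[child] for row in rows})  # number of observed child states

    # counts[parent_configuration][child_value] = N_ijk
    counts = defaultdict(Counter)
    for row in rows:
        config = tuple(row[p] for p in parents)
        counts[config][row[child]] += 1

    score = 0.0
    for child_counts in counts.values():
        n_ij = sum(child_counts.values())
        # log[(r - 1)! / (N_ij + r - 1)!]
        score += math.lgamma(r) - math.lgamma(n_ij + r)
        # log of the product of N_ijk! over the child states
        score += sum(math.lgamma(n + 1) for n in child_counts.values())
    return score


# Toy data in which "wet" is an exact copy of "rain" (40 rows in total).
rows = [{"rain": r, "wet": r} for r in (0, 1)] * 20

with_parent = k2_log_score(rows, "wet", ["rain"])
without_parent = k2_log_score(rows, "wet")
print(with_parent > without_parent)  # the edge rain -> wet should score higher
```

A DAG-level score would then be the sum of such per-node terms, which is what makes this kind of score convenient for structure-search optimizers that change one edge at a time.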

Development of BAMT 2.0.0 takes place in the 2.0.0 branch of the repository. If you would like to implement a module or submodule, please reply to this message, create a separate issue, and add it to the milestone and project.

Another goal of these changes is to provide a sklearn-like interface, so that a typical pipeline looks like this:

```python
import pandas as pd

# NOTE: the classes below are the proposed 2.0.0 interface, not an existing API

# read data
data = pd.read_csv("data.csv")

# define optimizers and score functions
dag_score_function = DAGScoreFunction(**parameters)
dag_optimizer = DAG_optimizer(**parameters)

# get a structure, maybe in networkx format?
G = dag_optimizer.optimize(data, **parameters)

# define parameters estimator and BN
parameters_estimator = ParametersEstimator(**parameters)
bn = ContinuousBayesianNetwork(**parameters)

# fit the bn, passing the estimator instance created above
bn.fit(data, parameters_estimator, **parameters)
bn.sample(1000)
bn.predict(data.drop(columns=["col1", "col2"]))
```
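
As a rough illustration of what a `fit` / `sample` / `predict` interface of this kind implies, here is a self-contained toy discrete network using only the standard library. Everything in it is an assumption made for the example: the class name `ToyDiscreteBN`, the list-of-dicts data format, and the `target` argument of `predict` are hypothetical and do not describe the BAMT 2.0.0 API. Parent configurations that never occurred in the training data are not handled and raise a `KeyError`.

```python
import random
from collections import Counter, defaultdict
from graphlib import TopologicalSorter


class ToyDiscreteBN:
    """Toy discrete Bayesian network with an sklearn-like interface.

    Illustration only; names and signatures are not the BAMT 2.0.0 API.
    """

    def __init__(self, parents):
        # parents: dict mapping node -> list of parent nodes (must form a DAG)
        self.parents = {node: list(ps) for node, ps in parents.items()}
        # static_order() yields parents before their children.
        self.order = list(TopologicalSorter(self.parents).static_order())
        for node in self.order:
            self.parents.setdefault(node, [])  # nodes listed only as parents
        self.cpts = {}

    def fit(self, rows):
        # Maximum-likelihood estimation: count child values
        # for every observed parent configuration.
        for node in self.order:
            table = defaultdict(Counter)
            for row in rows:
                config = tuple(row[p] for p in self.parents[node])
                table[config][row[node]] += 1
            self.cpts[node] = dict(table)
        return self  # returning self allows chaining, as in scikit-learn

    def sample(self, n, seed=None):
        # Ancestral sampling: draw every node after its parents.
        rng = random.Random(seed)
        samples = []
        for _ in range(n):
            row = {}
            for node in self.order:
                config = tuple(row[p] for p in self.parents[node])
                counts = self.cpts[node][config]
                values = list(counts)
                weights = [counts[v] for v in values]
                row[node] = rng.choices(values, weights=weights)[0]
            samples.append(row)
        return samples

    def predict(self, rows, target):
        # Most frequent training value of `target` given its parents' values.
        predictions = []
        for row in rows:
            config = tuple(row[p] for p in self.parents[target])
            predictions.append(self.cpts[target][config].most_common(1)[0][0])
        return predictions


# Toy data in which "wet" always equals "rain".
data = [{"rain": 1, "wet": 1}] * 30 + [{"rain": 0, "wet": 0}] * 70

bn = ToyDiscreteBN({"rain": [], "wet": ["rain"]}).fit(data)
synthetic = bn.sample(1000, seed=0)
print(bn.predict([{"rain": 1}, {"rain": 0}], target="wet"))
```

In this sketch the structure is passed to the constructor and the parameters are learned in `fit`, which keeps structure learning and parameter estimation separate. Whether 2.0.0 splits them the same way is an open design question.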

Apr 22 '24 15:04 jrzkaminski
