ProbabilisticCircuits.jl
Limits to missingness handled by `learn_circuit_miss`?
I need to train models on data with a lot of missing information: the training matrix is very sparse, with more than 90% of values missing in a sub-matrix of the full training matrix. There are many training instances overall, but for each one only a small subset of the covariates is active (non-missing).
Is there some known limit to the missingness that can be handled?
When I try training with this data I get an AssertionError:
c1 = learn_circuit_miss(df1; maxiter = 50)
Iteration 0/50. Marginal LogLikelihood = -11.778915; nodes = 1481; edges = 2050; params = 910
ERROR: LoadError: AssertionError: Parameters do not sum to one locally: 1.137407764369371; [-0.2097739565374725, -1.1188958040892987]
Stacktrace:
[1] (::ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64})(pn::StructSumNode)
@ ProbabilisticCircuits ~/.julia/packages/ProbabilisticCircuits/1cjGx/src/parameter_learn/parameters.jl:137
[2] (::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}})()
@ DirectedAcyclicGraphs ~/.julia/packages/DirectedAcyclicGraphs/teMfW/src/dags.jl:82
[3] get!(default::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}}, h::Dict{DirectedAcyclicGraphs.DAG, Nothing}, key::StructSumNode)
...
I can provide a fuller stack trace if it would be useful. It is 65 levels deep though. ;)
Thanks for the report. In the next version, v0.4, both `learn_circuit_miss` and `learn_circuit` will be gone.
Our v0.4 is in the master branch and is fairly stable now, with some example scripts here. We are planning to release it soon, after some more testing and documentation.
There are major API changes, so some code changes would be needed (for example, not using DataFrames anymore and just using `Matrix{Union{Missing, ....}}` for queries).
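As a rough illustration of that input format, here is a sketch in plain Julia (no ProbabilisticCircuits.jl calls) that builds a matrix with `missing` entries and measures how sparse it is. The `Bool` element type, the helper names `sparse_binary_data`, `missing_fraction` and `missing_fraction_by_column`, and the synthetic data are all assumptions for illustration; the element type v0.4 actually expects is not stated in this thread.

```julia
using Random

# Hypothetical helper: build a binary data matrix in which roughly
# `miss_frac` of the entries are replaced by `missing`.
# The Bool element type is an assumption, not taken from the library.
function sparse_binary_data(rng::AbstractRNG, n::Int, d::Int; miss_frac::Float64 = 0.9)
    data = Matrix{Union{Missing, Bool}}(undef, n, d)
    for j in 1:d, i in 1:n
        data[i, j] = rand(rng) < miss_frac ? missing : rand(rng, Bool)
    end
    return data
end

# Fraction of missing entries overall and per column.
missing_fraction(data) = count(ismissing, data) / length(data)
missing_fraction_by_column(data) = vec(sum(ismissing, data; dims = 1)) ./ size(data, 1)

data = sparse_binary_data(MersenneTwister(1), 1_000, 20)
println("overall missingness: ", missing_fraction(data))
println("max per-column missingness: ", maximum(missing_fraction_by_column(data)))
```

Checking the per-column fraction is a quick way to spot covariates that are almost never observed.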
Comment for v0.3.3
In case you want to stay with v0.3.3 for now:
What `learn_circuit_miss` does differently from `learn_circuit` is that it uses imputation to generate the initial structure (using the Chow-Liu algorithm). After that, both do greedy structure learning steps by doing splits and clones.
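To make the imputation idea concrete, here is a minimal sketch in plain Julia. It is not the library's implementation: the thread does not say which imputation scheme is used, so column-wise mode imputation, the `Bool` element type, and the helper name `impute_mode` are all assumptions chosen for illustration.

```julia
# Hypothetical helper (not the library's implementation): replace each
# missing entry with the most frequent observed value in its column.
function impute_mode(data::AbstractMatrix{Union{Missing, Bool}})
    out = Matrix{Bool}(undef, size(data))
    for j in axes(data, 2)
        observed = collect(skipmissing(view(data, :, j)))
        # Fall back to `false` if nothing was observed; ties go to `true`.
        fill_value = !isempty(observed) && 2 * count(observed) >= length(observed)
        for i in axes(data, 1)
            out[i, j] = ismissing(data[i, j]) ? fill_value : data[i, j]
        end
    end
    return out
end

# Tiny example: column 1 is mostly `true`, column 2 is mostly `false`.
data = Union{Missing, Bool}[true missing; true false; missing false]
imputed = impute_mode(data)
println(imputed)
```

A fully observed matrix like `imputed` is the kind of input a Chow-Liu style structure learner can work with, since pairwise statistics are then defined for every pair of columns.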
I think the bug is not from learning the initial structure; it seems to come from parameter learning or bad parameter initialization. I will try to reproduce it. It would be nice if you could provide minimal code that reproduces this.
https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/learner.jl#L50-L56
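For what it's worth, the two numbers in the assertion message appear to be the log-parameters of a single sum node: exponentiating and summing them gives the reported value of about 1.1374 rather than 1. The sketch below redoes that arithmetic in plain Julia. Reading the bracketed values as log-parameters is an assumption based on that match, and the renormalization step only illustrates what a locally normalized node would look like; it is not the library's code and not a fix for the underlying bug.

```julia
# Values copied from the AssertionError message in the report;
# interpreting them as log-parameters is an assumption.
log_params = [-0.2097739565374725, -1.1188958040892987]

# A locally normalized sum node needs sum(exp.(log_params)) ≈ 1.
total = sum(exp.(log_params))
println("sum of parameters: ", total)

# Hypothetical repair for illustration only: renormalize in log space
# by subtracting log(total) from every log-parameter.
normalized = log_params .- log(total)
println("after renormalizing: ", sum(exp.(normalized)))
```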
Alternative Structure (HCLT)
Both of these options give you deterministic and structured-decomposable circuits. If you don't need determinism, we suggest using HCLT structures instead (they are decomposable but not deterministic), as they usually perform better.
- Learn structure using: https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/hclt.jl#L385-L390
- Learn parameters using: https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/parameter_learn/parameters.jl#L456-L461
The code is much simpler in v0.4 (currently in master), so it might be worth the switch, although there are major API changes.
Also, I forgot to ask: were you using the CPU or GPU version?