ProbabilisticCircuits.jl

Limits to missingness handled by `learn_circuit_miss`?

Open robertfeldt opened this issue 3 years ago • 2 comments

I need to train models on data with a lot of missing information: the training matrix is very sparse, with more than 90% of the values missing in one sub-matrix of the full training matrix. There are many training instances overall, but for each one only a small subset of the covariates is active (non-missing).

Is there some known limit to the missingness that can be handled?

When I try training with this data I get an AssertionError:

c1 = learn_circuit_miss(df1; maxiter = 50)

Iteration 0/50. Marginal LogLikelihood = -11.778915; nodes = 1481; edges =  2050; params = 910
ERROR: LoadError: AssertionError: Parameters do not sum to one locally: 1.137407764369371; [-0.2097739565374725, -1.1188958040892987]
Stacktrace:
  [1] (::ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64})(pn::StructSumNode)
    @ ProbabilisticCircuits ~/.julia/packages/ProbabilisticCircuits/1cjGx/src/parameter_learn/parameters.jl:137
  [2] (::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}})()
    @ DirectedAcyclicGraphs ~/.julia/packages/DirectedAcyclicGraphs/teMfW/src/dags.jl:82
  [3] get!(default::DirectedAcyclicGraphs.var"#1#2"{ProbabilisticCircuits.var"#103#104"{LogicCircuits.BitCircuit{Vector{Int32}, Matrix{Int32}}, Vector{Float64}, Float64, Float64}, StructSumNode, Dict{DirectedAcyclicGraphs.DAG, Nothing}}, h::Dict{DirectedAcyclicGraphs.DAG, Nothing}, key::StructSumNode)
...

I can provide a fuller stack trace if it would be useful. It is 65 levels deep though. ;)

robertfeldt, Mar 02 '22 07:03

Thanks for the report. Note that in the next version, v0.4, both `learn_circuit_miss` and `learn_circuit` will be removed.

v0.4 is on the master branch and is fairly stable now; some example scripts here. We plan to release it soon, after some more testing and documentation.

There are major API changes, so some code changes would be needed (for example, v0.4 no longer uses DataFrames and just uses Matrix{Union{Missing, ....}} for queries).
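To make the matrix representation concrete, here is a minimal sketch in plain Julia (Base only, no ProbabilisticCircuits calls) that builds a small `Matrix{Union{Missing, Bool}}` and measures how much of it is missing. The toy data and the names `data`, `overall`, `per_column`, and `per_row` are made up for illustration, and `Bool` is only an assumed element type.

```julia
# Toy data: rows are training instances, columns are variables.
# The Bool element type is an assumption made for this illustration.
data = Union{Missing, Bool}[
    true     missing  missing  false;
    missing  missing  true     missing;
    false    missing  missing  missing
]

# Fraction of missing entries in the whole matrix.
overall = count(ismissing, data) / length(data)

# Fraction of missing entries per variable (column) and per instance (row).
per_column = vec(sum(ismissing, data; dims=1)) ./ size(data, 1)
per_row    = vec(sum(ismissing, data; dims=2)) ./ size(data, 2)

println("overall missingness: ", overall)
println("per column:          ", per_column)
println("per row:             ", per_row)
```

A column whose missing fraction is 1.0 (the second column here) is never observed, so the data carries no information about that variable; checking for such columns is a cheap first diagnostic on very sparse training data.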


Comment for v0.3.3

In case you want to stay with v0.3.3 for now: what `learn_circuit_miss` does differently from `learn_circuit` is that it uses imputation to generate the initial structure (using the Chow-Liu algorithm). After that, both perform greedy structure learning steps using splits and clones.

I think the bug does not come from learning the initial structure; it seems to come from parameter learning or bad parameter initialization. I will try to reproduce it. If you can provide minimal code that reproduces it, that would be helpful.
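For reference, the two bracketed values in the assertion message look like the log-parameters of a single sum node: exponentiating and summing them reproduces the reported 1.1374. Below is a minimal sketch of that check in plain Julia, together with a log-space renormalization. It makes no ProbabilisticCircuits calls; `logsumexp` is a local helper defined here for illustration, not the library's own function, and the variable names are hypothetical.

```julia
# Log-parameters copied from the AssertionError message above.
log_params = [-0.2097739565374725, -1.1188958040892987]

# Local helper (not the library's): numerically stable log(sum(exp.(x))).
function logsumexp(x::AbstractVector{<:Real})
    m = maximum(x)
    return m + log(sum(exp.(x .- m)))
end

# Sum of the parameters in probability space; a normalized node gives 1.0.
total = exp(logsumexp(log_params))
println("sum of parameters: ", total)

# Renormalize in log space so that the parameters sum to one again.
normalized = log_params .- logsumexp(log_params)
println("renormalized sum:  ", sum(exp.(normalized)))
```

This only illustrates what the assertion tests; it does not fix whatever in parameter learning or initialization produced the unnormalized values.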

https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/learner.jl#L50-L56

Alternative Structure (HCLT)

Both of these options give you deterministic and structured-decomposable circuits. If you don't need determinism, we suggest using HCLT structures instead (they are decomposable but not deterministic), as they usually perform better.

  1. Learn Structure using: https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/structurelearner/hclt.jl#L385-L390

  2. Learn Parameters using: https://github.com/Juice-jl/ProbabilisticCircuits.jl/blob/f22571801a2a001c374056aa3e030d22a961094a/src/parameter_learn/parameters.jl#L456-L461

The code is much simpler in v0.4 (currently in master), so it might be worth the switch, though there are major API changes.

khosravipasha, Mar 04 '22 00:03

Also forgot to ask: were you using the CPU or the GPU version?

khosravipasha, Mar 04 '22 00:03