BayesNets.jl
Compilation error: infer_number_of_instantiations assumes values in 1:N, value 0 found!
Hi,
I am trying to learn a discrete Bayesian network (BN) from a dataset. During structure learning, I encountered the error "infer_number_of_instantiations assumes values in 1:N, value 0 found!" (raised at run time, see the stack trace below) and I am not sure why it happened.
Code producing this error:
parameters = GreedyHillClimbing(ScoreComponentCache(df), max_n_parents=1, prior=UniformPrior())
bn = fit(DiscreteBayesNet, df, parameters)
My dataset (df) looks like this:
and by running the following:
eltype.(eachcol(df))
the corresponding output is:
12-element Vector{DataType}:
Int64
Int64
Int64
Int64
Int64
Int64
Int64
Int64
Int64
Int64
Int64
Int64
The entire error output:
infer_number_of_instantiations assumes values in 1:N, value 0 found!
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] infer_number_of_instantiations(arr::Vector{Int64})
@ BayesNets.CPDs ~/.julia/packages/BayesNets/yBu0u/src/CPDs/utils.jl:63
[3] (::BayesNets.var"#63#66"{DataFrame})(i::Int64)
@ BayesNets ~/.julia/packages/BayesNets/yBu0u/src/DiscreteBayesNet/greedy_hill_climbing.jl:63
[4] map!(f::BayesNets.var"#63#66"{DataFrame}, dest::Vector{Int64}, A::UnitRange{Int64})
@ Base ./abstractarray.jl:2860
[5] fit(::Type{DiscreteBayesNet}, data::DataFrame, params::GreedyHillClimbing)
@ BayesNets ~/.julia/packages/BayesNets/yBu0u/src/DiscreteBayesNet/greedy_hill_climbing.jl:66
[6] top-level scope
@ ~/Desktop/bayes-aqp/Julia/bayes-aqp.ipynb:1
[7] eval
@ ./boot.jl:373 [inlined]
[8] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base ./loading.jl:1196
[9] #invokelatest#2
@ ./essentials.jl:716 [inlined]
[10] invokelatest
@ ./essentials.jl:714 [inlined]
[11] (::VSCodeServer.var"#150#151"{VSCodeServer.NotebookRunCellArguments, String})()
@ VSCodeServer ~/.vscode/extensions/julialang.language-julia-1.5.11/scripts/packages/VSCodeServer/src/serve_notebook.jl:18
[12] withpath(f::VSCodeServer.var"#150#151"{VSCodeServer.NotebookRunCellArguments, String}, path::String)
@ VSCodeServer ~/.vscode/extensions/julialang.language-julia-1.5.11/scripts/packages/VSCodeServer/src/repl.jl:185
[13] notebook_runcell_request(conn::VSCodeServer.JSONRPC.JSONRPCEndpoint{Base.PipeEndpoint, Base.PipeEndpoint}, params::VSCodeServer.NotebookRunCellArguments)
@ VSCodeServer ~/.vscode/extensions/julialang.language-julia-1.5.11/scripts/packages/VSCodeServer/src/serve_notebook.jl:14
[14] dispatch_msg(x::VSCodeServer.JSONRPC.JSONRPCEndpoint{Base.PipeEndpoint, Base.PipeEndpoint}, dispatcher::VSCodeServer.JSONRPC.MsgDispatcher, msg::Dict{String, Any})
@ VSCodeServer.JSONRPC ~/.vscode/extensions/julialang.language-julia-1.5.11/scripts/packages/JSONRPC/src/typed.jl:67
[15] serve_notebook(pipename::String; crashreporting_pipename::String)
@ VSCodeServer ~/.vscode/extensions/julialang.language-julia-1.5.11/scripts/packages/VSCodeServer/src/serve_notebook.jl:94
[16] top-level scope
@ ~/.vscode/extensions/julialang.language-julia-1.5.11/scripts/notebook/notebook.jl:12
[17] include(mod::Module, _path::String)
@ Base ./Base.jl:418
[18] exec_options(opts::Base.JLOptions)
@ Base ./client.jl:292
[19] _start()
@ Base ./client.jl:495
Can anyone help me with this? Thank you so much!
Hi,
I just tested my code and it seems that this is caused by the 0s in my dataset. Just out of curiosity, why are 0s not accepted as data values?
Hello SEICS,
I took a look, and infer_number_of_instantiations has the following docstring:
"""
infer_number_of_instantiations{I<:Int}(arr::AbstractVector{I})
Infer the number of instantiations, N, for a data type, assuming that it takes on the values 1:N
"""
As such, it assumes values between 1 and N for some N. Values of 0 would be out of bounds.
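To find which columns break this assumption, a check along the following lines might help. It is only a sketch: the NamedTuple `data` is a hypothetical stand-in for the DataFrame, and nothing here is taken from the BayesNets.jl source.

```julia
# Hypothetical stand-in for the DataFrame: a NamedTuple of integer columns.
data = (a = [1, 2, 1], b = [0, 1, 2], c = [3, 1, 2])

# Collect the names of columns whose smallest value is below 1,
# i.e. columns that cannot be in 1:N form.
bad = [name for (name, col) in pairs(data) if minimum(col) < 1]

println("Columns with values below 1: ", bad)
```

With a real DataFrame the same idea should carry over by iterating over `names(df)` and `eachcol(df)` together.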
This assumption lets us use Julia's 1-based indices to index directly into count tables. The easiest way to convert a dataset to 1:N form is to use the categorical discretizer in Discretizers.jl.
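For illustration, the relabeling can also be sketched in plain Julia without any package. This is not the Discretizers.jl API; `to_one_based` is a hypothetical helper that maps the sorted distinct values of a column onto 1:N and returns the levels so the original values can be recovered.

```julia
# Hypothetical helper (not part of BayesNets.jl or Discretizers.jl):
# relabel the distinct values of a column as 1:N, in ascending order.
function to_one_based(col::AbstractVector)
    levels = sort(unique(col))                             # distinct values, ascending
    lookup = Dict(v => i for (i, v) in enumerate(levels))  # value => label in 1:N
    encoded = [lookup[v] for v in col]                     # relabeled column
    return encoded, levels
end

raw = [0, 2, 0, 1, 2]                # contains 0, which triggers the error above
encoded, levels = to_one_based(raw)

# levels[encoded] maps the labels back to the original values.
println(encoded)
println(levels[encoded] == raw)
```

Applied to a DataFrame, something like `df[!, col] = first(to_one_based(df[!, col]))` for each column would be one way to use it, assuming the original levels are stored elsewhere if they are needed later.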
The documentation right now does not emphasize this assumption particularly well. We do have the following for categorical CPDs:
and our discrete Bayesian networks are comprised of them.
I hope that helps!
Ah! Thank you for the explanation! I am also new to Julia, so I didn't know that Julia uses 1-based indices. Really helpful advice! I will give Discretizers.jl a try.