bat_sample error
Hi,
I followed the tutorial to define a custom log-likelihood and pass it to bat_sample(). My likelihood itself seems to work fine, but bat_sample() fails for me. After some time I get the error below:
```
AssertionError: length(chains) == nchains
Stacktrace:
 [1] mcmc_init!(rng::Random123.Philox4x{UInt64, 10}, algorithm::MetropolisHastings{BAT.MvTDistProposal, RepetitionWeighting{Int64}, AdaptiveMHTuning}, density::PosteriorDensity{BAT.TransformedDensity{BAT.GenericDensity{var"#36#37"{Vector{Float64}, typeof(pdf)}}, BAT.DistributionTransform{ValueShapes.StructVariate{NamedTuple{(:offset, :resolution, :k)}}, BAT.InfiniteSpace, BAT.MixedSpace, BAT.StandardMvNormal{Float64}, NamedTupleDist{(:offset, :resolution, :k), Tuple{Product{Continuous, Uniform{Float64}, Vector{Uniform{Float64}}}, Uniform{Float64}, Uniform{Float64}}, Tuple{ValueAccessor{ArrayShape{Real, 1}}, ValueAccessor{ScalarShape{Real}}, ValueAccessor{ScalarShape{Real}}}}}, BAT.TDNoCorr}, BAT.DistributionDensity{BAT.StandardMvNormal{Float64}, BAT.HyperRectBounds{Float32}}, BAT.HyperRectBounds{Float32}, ArrayShape{Real, 1}}, nchains::Int64, init_alg::MCMCChainPoolInit, tuning_alg::AdaptiveMHTuning, nonzero_weights::Bool, callback::Function)
   @ BAT ~/.julia/packages/BAT/XvOy6/src/samplers/mcmc/chain_pool_init.jl:160
```
Any idea what could be the issue?
Reproducer:
```julia
using Random, LinearAlgebra, Statistics, Distributions, StatsBase, Plots
using BAT, IntervalSets, ValueShapes, TypedTables
import SpecialFunctions
using Profile, ProfileView, QuadGK
## Error function.
function my_erf(x, coeff, base)
    return coeff / 2 * (1 .+ SpecialFunctions.erf.(x)) .+ base
end
## Step function.
function my_step(x, coeff, base)
    return coeff / 2 * (1 .+ sign.(x)) .+ base
end
## Model (sum of two errors or steps).
function count(p::NamedTuple{(:offset, :resolution, :k)}, x)  # note: shadows Base.count
    step1_coeff = 6 + p.k
    step2_coeff = 2 - p.k
    if p.resolution > 1e-7
        # Finite resolution: smeared steps (error functions).
        step1_x = (x .- p.offset[1]) / (sqrt(2) * p.resolution)
        step1 = my_erf(step1_x, step1_coeff, 4)
        step2_x = (x .- p.offset[2]) / (sqrt(2) * p.resolution)
        step2 = my_erf(step2_x, step2_coeff, 0.0)
    else
        # Zero resolution: sharp steps.
        step1_x = x .- p.offset[1]
        step1 = my_step(step1_x, step1_coeff, 4)
        step2_x = x .- p.offset[2]
        step2 = my_step(step2_x, step2_coeff, 0)
    end
    return step1 + step2
end
## Integral of model
function get_integral(p::NamedTuple{(:offset, :resolution, :k)}, low, high)
    total, error = quadgk(x -> count(p, x), low, high)
    return total
end
## PDF
function pdf(p::NamedTuple{(:offset, :resolution, :k)}, x, low, high)  # note: shadows the pdf exported by Distributions
    total = get_integral(p, low, high)
    value = count(p, x)
    return value / total
end
## Sampler
function sampler(p::NamedTuple{(:offset, :resolution, :k)}, n, low, high)
    # Rejection sampling; the model is non-decreasing, so its maximum on [low, high] is at `high`.
    max = count(p, high)
    arr = Array{Float64}(undef, n)
    i = 1
    while i <= n
        y = rand() * max
        x = rand() * (high - low) + low
        if y <= count(p, x)
            arr[i] = x
            i += 1
        end
    end
    return arr
end
true_par_values = (offset = [99, 150], resolution = 5, k = 0)
arr = sampler(true_par_values, 10000, 0, 500)
## Unbinned log-likelihood
likelihood = let data = arr, f = pdf
    params -> begin
        function event_log_likelihood(event)
            log(f(params, event, 0, 500))
        end
        return LogDVal(mapreduce(event_log_likelihood, +, data))
    end
end
## Flat priors
prior = NamedTupleDist(
    offset = [Uniform(50, 150), Uniform(80, 220)],
    resolution = Uniform(0, 20),
    k = Uniform(-6, 2)
)
## Posterior
posterior = PosteriorDensity(likelihood, prior)
## running bat_sample
samples = bat_sample(posterior, MCMCSampling(mcalg = MetropolisHastings(), nsteps = 10^3)).result
```
My notebook with some plots is also attached: [mcmc_BAT_Julia.pdf](https://github.com/bat/BAT.jl/files/11219143/mcmc_BAT_Julia.pdf)
Hi @atasattari, BAT.jl currently has trouble running with only one chain; could you run with at least two chains? Fixing this is on the to-do list, we just haven't gotten around to it yet (people typically run with 4 or 8 chains).
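For example, via the `nchains` keyword of `MCMCSampling`:

```julia
samples = bat_sample(
    posterior,
    MCMCSampling(mcalg = MetropolisHastings(), nchains = 4, nsteps = 10^3)
).result
```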
Also, I recommend using BAT.jl#main for now instead of the last 2.x release, which is quite old. We're preparing BAT.jl v3.0, but there are some changes to be made first.
I updated the package and increased the number of chains to 4. It took almost 45 minutes to get to `Begin tuning of 4 MCMC chain(s)`. Does that make any sense?
Do you notice any obvious issue in my cost function or Julia code? Is there a way to get a progress bar while running the sampler?
For comparison, I implemented a similar cost function in Python, optimized it with Numba, and used emcee for sampling. With 30 walkers (in emcee's terminology), the computation with the same number of steps takes less than a minute, which is confusing.
One important difference is that emcee can work in parallel (we should probably add an emcee-like sampler to BAT), and its many walkers share information about the posterior, so they "help each other" converge. Single-walker Metropolis or HMC chains can converge more slowly, since each walker has to find its own way to the typical set. We have some code for improved burn-in based on the "Pathfinder" algorithm and the Robust Adaptive Metropolis algorithm, but it's not integrated into BAT.jl yet. The plan is to add this and some other shiny new features by summer. We'll also add progress bars, so one can see how sampling progresses; BAT.jl can nest algorithms, so we need nested progress display, which makes it a bit more tricky than just using ProgressMeter.
In case it's a performance issue with your model, could you run `BenchmarkTools.@benchmark logdensityof($your_posterior, rand($prior))` or so, and maybe check your likelihood implementation with `@code_warntype`?
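With the names from your reproducer, that would be something like the following sketch (`BenchmarkTools` is a separate package you may need to install first):

```julia
using BenchmarkTools, DensityInterface

# Time a single posterior evaluation at a random point drawn from the prior:
@benchmark logdensityof($posterior, rand($prior))

# Look for type instabilities in the likelihood body:
@code_warntype likelihood(rand(prior))
```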
Thanks so much for the hints. I performed a basic comparison between the log-likelihood call times in Julia and Python; the Python one is ~20-30% faster. I'm not a Julia expert, but there seem to be some tweaks available to boost the performance. At least running in parallel seems to be much simpler in Julia: emcee has issues pickling some of the Numba functions, which prevents it from running in parallel. I'm not sure how many events were thrown away before convergence, but it took BAT almost 8 hours to produce a sample of 10^3 events for my model. Apart from the performance, the other features look really cool. I'll update you if I figure out something about the performance. Also, please let me know if you have a tutorial or a Jupyter notebook that shows some performance optimization or fancier tricks.
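(For reference, one way to parallelize the per-event sum over threads in plain Julia; `threaded_log_likelihood` is just a hypothetical helper, and this assumes Julia was started with multiple threads, e.g. `julia -t 4`:)

```julia
using Base.Threads

# Hypothetical helper: threaded version of the per-event log-likelihood sum.
function threaded_log_likelihood(params, data, low, high)
    partial = zeros(nthreads())  # one partial sum per thread
    # :static scheduling keeps threadid() stable within each iteration.
    @threads :static for i in eachindex(data)
        partial[threadid()] += log(pdf(params, data[i], low, high))
    end
    return sum(partial)
end
```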
If you can send me your complete notebook (with necessary data files) I can take a look at potential performance issues.
@atasattari is this still an active issue?
@oschulz The issue is resolved. It was mainly slowness and a lack of log output for debugging. The slowness turned out to come from repetitive integral calculations used to normalize the PDF of the custom model inside the likelihood; this situation should be common for unbinned likelihood studies. Would it be reasonable to add a similar example to the tutorial? I can provide my code for reference if needed; a minimal sketch of the fix is below.
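The `pdf` in the reproducer recomputes the `quadgk` normalization integral for every single event, so one likelihood evaluation performs thousands of numerical integrations. The integral depends only on the parameters, not on the event, so it can be hoisted out of the per-event loop. A minimal sketch, using the same `count`, `get_integral`, and `LogDVal` helpers as above:

```julia
likelihood = let data = arr, low = 0, high = 500
    params -> begin
        # Compute the normalization once per likelihood evaluation,
        # not once per event.
        total = get_integral(params, low, high)
        loglik = sum(event -> log(count(params, event) / total), data)
        return LogDVal(loglik)
    end
end
```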
Thank you, yes. We were thinking about adding more examples/tutorials anyway.