Baysor
Baysor copied to clipboard
Error estimating boundary polygons
I've gotten the same crash at the same point in processing twice which prevented the export of a GeoJSON file or cell shape plot. The run command and log output below:
Run parameters:
./Baysor/bin/baysor run -g gene -x global_x -y global_y -z global_z --plot --save-polygons GeoJSON -o ./baysor_output/[project name] ./data/[input file] :cell
Log output:
[15:15:00] Info: (2021-10-28) Run Baysor v0.5.0
[15:15:00] Info: Loading data...
[15:15:10] Info: Loaded 5347668 transcripts
[15:15:12] Info: Estimating noise level
[15:15:19] Warning: Only k-nn random field is supported for 3D data
└ Baysor /home/vpetukhov/.julia/dev/Baysor/src/data_processing/triangulation.jl:30
[15:16:57] Info: Done
[15:16:57] Warning: Only k-nn random field is supported for 3D data
└ Baysor /home/vpetukhov/.julia/dev/Baysor/src/data_processing/triangulation.jl:30
[15:18:09] Info: Clustering molecules...
[15:18:09] Warning: Only k-nn random field is supported for 3D data
└ Baysor /home/vpetukhov/.julia/dev/Baysor/src/data_processing/triangulation.jl:30
[15:25:13] Info: Algorithm stopped after 263 iterations. Error: 0.00955. Converged: true.
[15:25:15] Info: Done
[15:25:16] Info: Initializing algorithm. Scale: 3.2142932583089294, scale std: 1.0218244756990273, initial #components: 3565112, #molecules: 5347668.
[15:27:19] Info: Using 3D coordinates
[20:40:46] Info: Processing complete.
[20:42:40] Info: Estimating local colors
[20:48:34] Info: Saving results to ./baysor_output/[project name]
[20:49:12] Info: Plot diagnostics
[20:49:50] Info: Estimating boundary polygons
[20:55:50] Error: InexactError(:trunc, UInt16, 65540)
|
| throw_inexacterror(f::Symbol, #unused#::Type{UInt16}, val::Int64) at boot.jl:602
| checked_trunc_uint at boot.jl:632 [inlined]
| toUInt16 at boot.jl:705 [inlined]
| UInt16 at boot.jl:755 [inlined]
| convert at number.jl:7 [inlined]
| setindex! at array.jl:841 [inlined]
| find_grid_point_labels_kde(pos_data::Matrix{Float64}, cell_labels::Vector{Int64}, min_x::Vector{Float64}, max_x::Vector{Float64}; grid_step::Float64, bandwidth::Float64, dens_threshold::Float64, min_molecules_per_cell::Int64, verbose::Bool) at boundary_estimation.jl:208
| (::Baysor.var"#find_grid_point_labels_kde##kw")(::NamedTuple{(:grid_step, :bandwidth), Tuple{Float64, Float64}}, ::typeof(Baysor.find_grid_point_labels_kde), pos_data::Matrix{Float64}, cell_labels::Vector{Int64}, min_x::Vector{Float64}, max_x::Vector{Float64}) at boundary_estimation.jl:159
| boundary_polygons(pos_data::Matrix{Float64}, cell_labels::Vector{Int64}; min_x::Nothing, max_x::Nothing, grid_step::Float64, min_border_length::Int64, shape_method::Symbol, max_dev::Float64, bandwidth::Float64, exclude_labels::Vector{Int64}, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) at boundary_estimation.jl:263
| boundary_polygons at boundary_estimation.jl:254 [inlined]
| #boundary_polygons#217 at boundary_estimation.jl:248 [inlined]
| (::Baysor.var"#boundary_polygons##kw")(::NamedTuple{(:grid_step, :bandwidth), Tuple{Float64, Float64}}, ::typeof(Baysor.boundary_polygons), spatial_df::DataFrames.DataFrame, args::Vector{Int64}) at boundary_estimation.jl:248
| (::Baysor.var"#333#337"{DataFrames.DataFrame, Vector{Int64}, Dict{String, Any}, Float64})(mask::BitVector) at main.jl:215
| #47 at ProgressMeter.jl:991 [inlined]
| iterate at generator.jl:47 [inlined]
| _collect(c::Vector{BitVector}, itr::Base.Generator{Vector{BitVector}, ProgressMeter.var"#47#50"{Distributed.RemoteChannel{Channel{Bool}}, Baysor.var"#333#337"{DataFrames.DataFrame, Vector{Int64}, Dict{String, Any}, Float64}}}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1}) at array.jl:691
| collect_similar(cont::Vector{BitVector}, itr::Base.Generator{Vector{BitVector}, ProgressMeter.var"#47#50"{Distributed.RemoteChannel{Channel{Bool}}, Baysor.var"#333#337"{DataFrames.DataFrame, Vector{Int64}, Dict{String, Any}, Float64}}}) at array.jl:606
| map(f::Function, A::Vector{BitVector}) at abstractarray.jl:2294
| macro expansion at ProgressMeter.jl:990 [inlined]
| macro expansion at task.jl:382 [inlined]
| macro expansion at ProgressMeter.jl:989 [inlined]
| macro expansion at task.jl:382 [inlined]
| progress_map(::Function, ::Vararg{Any, N} where N; mapfun::Function, progress::ProgressMeter.Progress, channel_bufflen::Int64, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) at ProgressMeter.jl:982
| progress_map(::Function, ::Vararg{Any, N} where N) at ProgressMeter.jl:978
| plot_transcript_assignment_panel(df_res::DataFrames.DataFrame, assignment::Vector{Int64}, args::Dict{String, Any}; clusters::Vector{Int64}, prior_polygons::Vector{Matrix{Float64}}, gene_colors::Vector{ColorTypes.Lab{Float64}}) at main.jl:215
| (::Baysor.var"#plot_transcript_assignment_panel##kw")(::NamedTuple{(:clusters, :prior_polygons, :gene_colors), Tuple{Vector{Int64}, Vector{Matrix{Float64}}, Vector{ColorTypes.Lab{Float64}}}}, ::typeof(Baysor.plot_transcript_assignment_panel), df_res::DataFrames.DataFrame, assignment::Vector{Int64}, args::Dict{String, Any}) at main.jl:200
| save_segmentation_results(bm_data::Baysor.BmmData{3}, gene_names::Vector{String}, args::Dict{String, Any}; mol_clusts::NamedTuple{(:exprs, :assignment, :diffs, :assignment_probs, :change_fracs), Tuple{Matrix{Float64}, Vector{Int64}, Vector{Float64}, Matrix{Float64}, Vector{Float64}}}, comp_segs::Nothing, prior_polygons::Vector{Matrix{Float64}}) at main.jl:406
| (::Baysor.var"#save_segmentation_results##kw")(::NamedTuple{(:mol_clusts, :comp_segs, :prior_polygons), Tuple{NamedTuple{(:exprs, :assignment, :diffs, :assignment_probs, :change_fracs), Tuple{Matrix{Float64}, Vector{Int64}, Vector{Float64}, Matrix{Float64}, Vector{Float64}}}, Nothing, Vector{Matrix{Float64}}}}, ::typeof(Baysor.save_segmentation_results), bm_data::Baysor.BmmData{3}, gene_names::Vector{String}, args::Dict{String, Any}) at main.jl:389
| run_cli_main(args::Vector{String}) at main.jl:470
| run_cli(args::Vector{String}) at common.jl:118
| run_cli at common.jl:103 [inlined]
| julia_main at common.jl:139 [inlined]
| julia_main() at none:36
└ Baysor /home/vpetukhov/.julia/dev/Baysor/src/cli/common.jl:132
Environment is Ubuntu 20.04.2, Baysor was running in a screen 4.08.00 session.
I need to check it more, but the first guess is that you're using name :cell
for prior segmentation, but it is also used for the output of Baysor, and there can be some problems with type conversions. I should fix it anyway at some point, but could you please try to rename the prior segmentation column and rerun?
I'll try that and update when the run is complete.
Just finished running again with the column renamed prior_cell, unfortunately got the same error.
I am getting the same error if I am running with -m 50
but not if I run it with -m 3
. Is there any solution to that?
After doing some digging, I think I found the error. The function find_grid_point_labels_kde
(in src/data_processing/boundary_estimation.jl
) can allocate the wrong type for label_mat
.
In find_grid_point_labels_kde
a binary a mask deciding if the cell meets the min_molecules_per_cell
criterion is created and saved in filt_mask
. Further cell_labels
is defined as (0:maximum(cell_labels))
. The filt_mask
is used to subset both coords_per_label
and cell_labels
.
If you have more than 2^16
identified cells, the maximal value of cell_labels
has to be of type UInt32
. However, if a lot of the cells do not meet the min_molecules_per_cell
criterion, the length of both arrays (coords_per_label
and cell_labels
) can be lower than 2^16
after applying the filt_mask
. In Line 172-176 label_mat
is created with type UInt16
if the length of coords_per_label
is smaller than (2^16) - 1
. However, the final value that is stored in label_mat
derives from the values in cell_labels
and can still be of type UInt32
. If this is the case, the assignment in line 208 causes the error.
The fix is easy: Line 172 has to be changed from
if length(coords_per_label) < (2 ^ 16) - 1
to
if maximum(cell_labels) < (2 ^ 16) - 1
This assures that label_mat
is always of the correct type. After changing that line, the program ran as expected for me.