libSGM
Is there any plan to increase the disparity size to 256?
Hello,
I am interested in increasing the disparity search range to 256. It looks like scan_cost and winner_takes_all are the two functions that would need modification to support a larger disparity size. However, given how the 64 and 128 disparity searches are implemented, 256 does not seem to be a straightforward extension of the 128 version. Any suggestions or pointers on how to do it?
Thanks,
Any updates on this issue?
I was able to get 256 working by adding it to the template instantiations.
Hi. As @mohanen mentioned, libSGM can be built with a disparity size of 256 by adding explicit template instantiations for it. Here is an example modification.
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 8b76bfb..a0db0cf 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -6,7 +6,7 @@ set(CMAKE_CXX_EXTENSIONS OFF)
set(CUDA_ARCH "-arch=sm_50" CACHE STRING "Value of the NVCC -arch option.")
option(ENABLE_ZED_DEMO "Build a Demo using ZED Camera" OFF)
-option(ENABLE_SAMPLES "Build samples" OFF)
+option(ENABLE_SAMPLES "Build samples" ON)
option(ENABLE_TESTS "Test library" OFF)
option(LIBSGM_SHARED "Build a shared library" OFF)
diff --git a/sample/movie/stereosgm_movie.cpp b/sample/movie/stereosgm_movie.cpp
index e087a62..083d4b2 100644
--- a/sample/movie/stereosgm_movie.cpp
+++ b/sample/movie/stereosgm_movie.cpp
@@ -65,12 +65,12 @@ int main(int argc, char* argv[])
cv::Mat I1 = cv::imread(format_string(argv[1], first_frame), -1);
cv::Mat I2 = cv::imread(format_string(argv[2], first_frame), -1);
- const int disp_size = argc >= 4 ? std::stoi(argv[3]) : 128;
+ const int disp_size = argc >= 4 ? std::stoi(argv[3]) : 256;
ASSERT_MSG(!I1.empty() && !I2.empty(), "imread failed.");
ASSERT_MSG(I1.size() == I2.size() && I1.type() == I2.type(), "input images must be same size and type.");
ASSERT_MSG(I1.type() == CV_8U || I1.type() == CV_16U, "input image format must be CV_8U or CV_16U.");
- ASSERT_MSG(disp_size == 64 || disp_size == 128, "disparity size must be 64 or 128.");
+ //ASSERT_MSG(disp_size == 64 || disp_size == 128, "disparity size must be 64 or 128.");
const int width = I1.cols;
const int height = I1.rows;
diff --git a/src/horizontal_path_aggregation.cu b/src/horizontal_path_aggregation.cu
index e42bb47..58f9a1a 100644
--- a/src/horizontal_path_aggregation.cu
+++ b/src/horizontal_path_aggregation.cu
@@ -218,6 +218,16 @@ template void enqueue_aggregate_left2right_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_left2right_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
template void enqueue_aggregate_right2left_path<64u>(
cost_type *dest,
const feature_type *left,
@@ -238,5 +248,15 @@ template void enqueue_aggregate_right2left_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_right2left_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
}
}
diff --git a/src/oblique_path_aggregation.cu b/src/oblique_path_aggregation.cu
index 10146f1..ead5965 100644
--- a/src/oblique_path_aggregation.cu
+++ b/src/oblique_path_aggregation.cu
@@ -215,6 +215,16 @@ template void enqueue_aggregate_upleft2downright_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_upleft2downright_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
template void enqueue_aggregate_upright2downleft_path<64u>(
cost_type *dest,
const feature_type *left,
@@ -235,6 +245,16 @@ template void enqueue_aggregate_upright2downleft_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_upright2downleft_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
template void enqueue_aggregate_downright2upleft_path<64u>(
cost_type *dest,
const feature_type *left,
@@ -255,6 +275,16 @@ template void enqueue_aggregate_downright2upleft_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_downright2upleft_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
template void enqueue_aggregate_downleft2upright_path<64u>(
cost_type *dest,
const feature_type *left,
@@ -275,5 +305,15 @@ template void enqueue_aggregate_downleft2upright_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_downleft2upright_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
}
}
diff --git a/src/path_aggregation.cu b/src/path_aggregation.cu
index de713de..2b1fcf3 100644
--- a/src/path_aggregation.cu
+++ b/src/path_aggregation.cu
@@ -89,5 +89,6 @@ void PathAggregation<MAX_DISPARITY>::enqueue(
template class PathAggregation< 64>;
template class PathAggregation<128>;
+template class PathAggregation<256>;
}
diff --git a/src/sgm.cu b/src/sgm.cu
index de38618..6155e04 100644
--- a/src/sgm.cu
+++ b/src/sgm.cu
@@ -141,7 +141,9 @@ void SemiGlobalMatching<T, MAX_DISPARITY>::enqueue(
template class SemiGlobalMatching<uint8_t, 64>;
template class SemiGlobalMatching<uint8_t, 128>;
+template class SemiGlobalMatching<uint8_t, 256>;
template class SemiGlobalMatching<uint16_t, 64>;
template class SemiGlobalMatching<uint16_t, 128>;
+template class SemiGlobalMatching<uint16_t, 256>;
}
diff --git a/src/stereo_sgm.cpp b/src/stereo_sgm.cpp
index 25cfeb8..7b6b3ad 100644
--- a/src/stereo_sgm.cpp
+++ b/src/stereo_sgm.cpp
@@ -62,10 +62,14 @@ namespace sgm {
sgm_engine = new SemiGlobalMatchingImpl<uint8_t, 64>();
else if (input_depth_bits_ == 8 && disparity_size_ == 128)
sgm_engine = new SemiGlobalMatchingImpl<uint8_t, 128>();
+ else if (input_depth_bits_ == 8 && disparity_size_ == 256)
+ sgm_engine = new SemiGlobalMatchingImpl<uint8_t, 128>();
else if (input_depth_bits_ == 16 && disparity_size_ == 64)
sgm_engine = new SemiGlobalMatchingImpl<uint16_t, 64>();
else if (input_depth_bits_ == 16 && disparity_size_ == 128)
sgm_engine = new SemiGlobalMatchingImpl<uint16_t, 128>();
+ else if (input_depth_bits_ == 16 && disparity_size_ == 256)
+ sgm_engine = new SemiGlobalMatchingImpl<uint16_t, 256>();
else
throw std::logic_error("depth bits must be 8 or 16, and disparity size must be 64 or 128");
@@ -125,7 +129,7 @@ namespace sgm {
width_ = height_ = input_depth_bits_ = output_depth_bits_ = disparity_size_ = 0;
throw std::logic_error("depth bits must be 8 or 16");
}
- if (disparity_size_ != 64 && disparity_size_ != 128) {
+ if (disparity_size_ != 64 && disparity_size_ != 128 && disparity_size_ != 256) {
width_ = height_ = input_depth_bits_ = output_depth_bits_ = disparity_size_ = 0;
throw std::logic_error("disparity size must be 64 or 128");
}
diff --git a/src/vertical_path_aggregation.cu b/src/vertical_path_aggregation.cu
index 08fc5ec..faeb7bb 100644
--- a/src/vertical_path_aggregation.cu
+++ b/src/vertical_path_aggregation.cu
@@ -172,6 +172,16 @@ template void enqueue_aggregate_up2down_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_up2down_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
template void enqueue_aggregate_down2up_path<64u>(
cost_type *dest,
const feature_type *left,
@@ -192,5 +202,15 @@ template void enqueue_aggregate_down2up_path<128u>(
unsigned int p2,
cudaStream_t stream);
+template void enqueue_aggregate_down2up_path<256u>(
+ cost_type *dest,
+ const feature_type *left,
+ const feature_type *right,
+ int width,
+ int height,
+ unsigned int p1,
+ unsigned int p2,
+ cudaStream_t stream);
+
}
}
diff --git a/src/winner_takes_all.cu b/src/winner_takes_all.cu
index 42e2132..e47e450 100644
--- a/src/winner_takes_all.cu
+++ b/src/winner_takes_all.cu
@@ -352,5 +352,6 @@ void WinnerTakesAll<MAX_DISPARITY>::enqueue(
template class WinnerTakesAll< 64>;
template class WinnerTakesAll<128>;
+template class WinnerTakesAll<256>;
}
Hi @atakagi-fixstars, as I commented earlier, I did the same thing for 256. But while building the library I got the warnings below. Should I be worried?
/home/mohanen/fixstars/horizontal_path_aggregation.cu(60): warning: shift count is too large
detected during:
instantiation of "void sgm::path_aggregation::aggregate_horizontal_path_kernel<DIRECTION,MAX_DISPARITY>(uint8_t *, const sgm::feature_type *, const sgm::feature_type *, unsigned int, unsigned int, unsigned int, unsigned int) [with DIRECTION=1, MAX_DISPARITY=256U]"
(176): here
instantiation of "void sgm::path_aggregation::enqueue_aggregate_left2right_path<MAX_DISPARITY>(sgm::cost_type *, const sgm::feature_type *, const sgm::feature_type *, size_t, size_t, unsigned int, unsigned int, cudaStream_t) [with MAX_DISPARITY=256U]"
(221): here
/home/mohanen/fixstars/horizontal_path_aggregation.cu(60): warning: shift count is too large
detected during:
instantiation of "void sgm::path_aggregation::aggregate_horizontal_path_kernel<DIRECTION,MAX_DISPARITY>(uint8_t *, const sgm::feature_type *, const sgm::feature_type *, unsigned int, unsigned int, unsigned int, unsigned int) [with DIRECTION=-1, MAX_DISPARITY=256U]"
(197): here
instantiation of "void sgm::path_aggregation::enqueue_aggregate_right2left_path<MAX_DISPARITY>(sgm::cost_type *, const sgm::feature_type *, const sgm::feature_type *, size_t, size_t, unsigned int, unsigned int, cudaStream_t) [with MAX_DISPARITY=256U]"
(253): here
Hi @mohanen, thank you for the information.
I got the same warnings. Below is the corresponding line.
const unsigned int shfl_mask =
((1u << SUBGROUP_SIZE) - 1u) << (group_id * SUBGROUP_SIZE);
When MAX_DISPARITY is 256, SUBGROUP_SIZE becomes 32, so 1u << SUBGROUP_SIZE overflows.
This needs to be fixed if we want 256 disparity.
I came up with some workarounds:
- Use a 64-bit integer: use ((1ull << SUBGROUP_SIZE) - 1u), but this works when SUBGROUP_SIZE is less than 32.
- Reduce SUBGROUP_SIZE: SUBGROUP_SIZE is calculated as MAX_DISPARITY / DP_BLOCK_SIZE, so a larger DP_BLOCK_SIZE (e.g. 16u) reduces SUBGROUP_SIZE to 16.
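To make the second workaround concrete, here is a small sketch of how the subgroup size follows from the two constants. The function names are hypothetical, and the DP_BLOCK_SIZE value of 8 is only inferred from the statement above that MAX_DISPARITY = 256 gives SUBGROUP_SIZE = 32.

```cpp
// Hypothetical helpers that mirror the arithmetic in the discussion;
// they are not taken from the libSGM sources.

// SUBGROUP_SIZE = MAX_DISPARITY / DP_BLOCK_SIZE
constexpr unsigned int subgroup_size(unsigned int max_disparity,
                                     unsigned int dp_block_size) {
    return max_disparity / dp_block_size;
}

// A 32-bit value may only be shifted by 0..31 bits, so 1u << SUBGROUP_SIZE
// is only well defined while SUBGROUP_SIZE stays below 32.
constexpr bool shift_fits_in_32_bits(unsigned int subgroup) {
    return subgroup < 32u;
}
```

With a block size of 8 (assumed), a disparity size of 256 gives a subgroup of 32 and triggers the warning, while a block size of 16 gives a subgroup of 16 and stays within range.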
By the way, we calculate ((1u << SUBGROUP_SIZE) - 1u) in order to get the mask bits.
We expect the mask bits to be (1<<32)-1 = 0xffffffff.
When 1u<<32 overflows, it becomes 0u, and 0u-1u overflows again and finally becomes 0xffffffff.
So even if we ignore the overflow, we accidentally get the expected result.
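In standard C++, shifting a 32-bit value by 32 bits is undefined behavior, so it may be safer not to rely on the accidental result. The following is a minimal sketch of the 64-bit workaround, written as plain C++ so it can be checked on the host. The function name subgroup_mask is hypothetical, and the sketch assumes that group_id * subgroup_size stays below 32, as it would within one 32-lane warp.

```cpp
#include <cstdint>

// Hypothetical helper, not the libSGM implementation.
// Builds a mask with `subgroup_size` consecutive bits set, positioned at the
// lanes of subgroup number `group_id`. The intermediate value is 64 bits wide,
// so the shift by 32 that occurs for subgroup_size == 32 is well defined.
constexpr std::uint32_t subgroup_mask(unsigned int subgroup_size,
                                      unsigned int group_id) {
    return static_cast<std::uint32_t>(
        ((std::uint64_t{1} << subgroup_size) - 1u)
        << (group_id * subgroup_size));
}
```

For a subgroup of 32 lanes this yields 0xffffffff without any out-of-range shift, and for smaller subgroups it yields the same masks as the original 32-bit expression.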
Hi, @atakagi-fixstars @mohanen
According to the above instructions, I could build the libSGM successfully, but I could not get the right result using this new lib (disp_size=256). Later, I found a problem in the instructions. You could get this from the following picture.
After changing the parameter from 128 to 256, I ran the stereo_test program with disp_size set to 256 and it crashed. I got the following message in the command prompt.
I debug the program, I found the function median_filter() threw an exception.
Hi, @Micalson
> I found a problem in the instructions. You could get this from the following picture.
Thank you for pointing it out. It should be SemiGlobalMatchingImpl<uint8_t, 256>.
> I debug the program, I found the function median_filter() threw an exception.
Could you give me the pair of input images or image size?
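The copy-paste slip discussed above (an 8-bit, 256-disparity request mapped to a 128-disparity engine) is easy to make in a long if/else chain. As a hedged illustration only, the sketch below maps the runtime disparity size to the template argument in one place per depth. EngineBase, Engine, and make_engine are hypothetical stand-ins, not libSGM's real classes.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for the engine types; not libSGM's real classes.
struct EngineBase {
    virtual ~EngineBase() = default;
    virtual int max_disparity() const = 0;
    virtual int input_bits() const = 0;
};

template <typename T, int MAX_DISPARITY>
struct Engine : EngineBase {
    int max_disparity() const override { return MAX_DISPARITY; }
    int input_bits() const override { return static_cast<int>(sizeof(T) * 8); }
};

// Maps a runtime disparity size to the matching template argument.
template <typename T>
std::unique_ptr<EngineBase> make_engine_for_depth(int disparity_size) {
    switch (disparity_size) {
    case 64:  return std::make_unique<Engine<T, 64>>();
    case 128: return std::make_unique<Engine<T, 128>>();
    case 256: return std::make_unique<Engine<T, 256>>();
    default:
        throw std::logic_error("disparity size must be 64, 128 or 256");
    }
}

// Selects the pixel type first, then the disparity size.
std::unique_ptr<EngineBase> make_engine(int input_depth_bits, int disparity_size) {
    if (input_depth_bits == 8)
        return make_engine_for_depth<std::uint8_t>(disparity_size);
    if (input_depth_bits == 16)
        return make_engine_for_depth<std::uint16_t>(disparity_size);
    throw std::logic_error("depth bits must be 8 or 16");
}
```

Because each disparity size appears once per template rather than once per depth, an 8-bit request for 256 cannot silently end up with a different disparity size than a 16-bit one.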
@atakagi-fixstars You could download my test images from this. Thank you for your reply.
@atakagi-fixstars I also tested another image pair (Aloe), whose size is 1282x1110, with disp_size set to 256, and the program still crashed.
@Micalson Thank you for the test images. I'll check.
@atakagi-fixstars You are welcome. I am looking forward to your revision. Regards,
@atakagi-fixstars Any updates about this issue? I plan to use the library in my project, so could you tell how long it will take to fix this? I will be using the libSGM for a long time, just in time for the performance tests of the library. Regards,
Hi, @Micalson
I tested with a GTX 750Ti and got the same error.
The cause was insufficient device memory. libSGM uses [width x height x disparity_size x 8 (paths)] bytes of memory for cost aggregation. That is about 2.4 GB for "left(right).bmp" and 2.8 GB for "Aloe_left(right).png". These requests exceed the total memory of the GTX 750Ti (2 GB).
When I tested with a GTX 1060 (6 GB), the error did not occur. So the immediate solution is to use a GPU with sufficient device memory.
It may be better to add a memory check to libSGM.
Regards,
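As a rough illustration of the estimate above, the sketch below applies the quoted formula (width x height x disparity_size x number of paths, one byte each) and compares the result with a memory budget. The helper names are hypothetical and this is not an existing libSGM check. The Aloe image size is the one reported earlier in the thread.

```cpp
#include <cstdint>

// Hypothetical helper, not part of libSGM: bytes needed for cost aggregation,
// using the formula quoted in the thread (one byte per pixel, per disparity,
// per aggregation path).
constexpr std::uint64_t aggregation_bytes(std::uint64_t width,
                                          std::uint64_t height,
                                          std::uint64_t disparity_size,
                                          std::uint64_t num_paths = 8) {
    return width * height * disparity_size * num_paths;
}

// True if the estimated requirement fits into the given memory budget.
constexpr bool fits_in_memory(std::uint64_t required_bytes,
                              std::uint64_t available_bytes) {
    return required_bytes <= available_bytes;
}

// Memory sizes of the two cards mentioned in the thread, taken as 2 and 6 GiB.
constexpr std::uint64_t kTwoGiB = 2ull * 1024 * 1024 * 1024;
constexpr std::uint64_t kSixGiB = 6ull * 1024 * 1024 * 1024;
```

For the 1282x1110 Aloe pair with 256 disparities and 8 paths, the estimate exceeds the 2 GiB budget but fits in 6 GiB. In a real check the available amount would presumably come from the CUDA runtime (for example cudaMemGetInfo); that integration is not shown here.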
@atakagi-fixstars I got it. Could I change the disparity_size to 200? If disparity_size is set to 200, the device memory required is 1.9GB.
Regards,
@Micalson
> Could I change the disparity_size to 200?
Unfortunately no. In the current implementation, the disparity size must be a power of 2. Otherwise the build will fail.
Regards,
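The power-of-2 requirement can be expressed as a compile-time check. This is only a sketch with a hypothetical helper name; it does not show how libSGM itself enforces the constraint.

```cpp
// Hypothetical helper, not taken from libSGM.
// A positive integer is a power of two exactly when it has a single bit set,
// in which case clearing its lowest set bit (n & (n - 1)) leaves zero.
constexpr bool is_power_of_two(unsigned int n) {
    return n != 0u && (n & (n - 1u)) == 0u;
}
```

Used in a static_assert on the template parameter, a check like this would reject a value such as 200 with a readable message instead of an obscure build failure.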
@atakagi-fixstars I am using a Jetson TX2. Is there any other way I can use a disp_size of 256 on this board? Thanks.
Regards,
@Micalson If it is acceptable for your use case, reduce the number of paths to 4, which requires half the memory. The modification is described here: https://github.com/fixstars/libSGM/issues/35#issuecomment-483494807
Regards,
@atakagi-fixstars Following your instructions, I have solved the problem. I am going to keep using libSGM and will report any related issues. Thanks again.
Regards.