Landscape Segmentation Offset Misalignment for Nvidia GPUs

Open Singulariteehee opened this issue 1 year ago • 5 comments

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Windows 11

MediaPipe Tasks SDK version

0.10.14

Task name (e.g. Image classification, Gesture recognition etc.)

Image Segmenter

Programming Language and version (e.g. C++, Python, Java)

JavaScript

Describe the actual behavior

The segmentation output is misaligned on Nvidia GPUs.

Describe the expected behaviour

The CPU segmentation output should match the GPU output on all GPUs.

Standalone code/steps you may have used to try to get what you need

I have created a minimal example in CodePen that clearly shows the problem when executed on an Nvidia GPU: https://codepen.io/Singulariteehee/pen/OJYNoBy

When executed on my Intel integrated graphics or my old AMD R9 Fury X, the output matches between CPU and GPU.

The problem is more complex than a simple offset, as some lines do not suffer from the offset. If you simply reverse the offset, you still end up with poor results because every 8th line becomes jagged. It is also difficult to know when to reverse the offset, because browsers deliberately limit hardware detection for privacy reasons.
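For anyone reproducing this, a minimal sketch of a per-row comparison may help show which rows disagree between the CPU and GPU results (for example, whether every 8th row stands out). This is not the CodePen's code. It assumes both category masks have already been read back as row-major `Uint8Array` buffers of `width * height` entries, for instance via `MPMask.getAsUint8Array()`. The names `rowMismatchCounts` and `rowsAbove` are hypothetical.

```javascript
// Sketch only: count, for each row, how many pixels differ between two masks.
// Assumption: both masks are row-major Uint8Array buffers of width * height
// entries (e.g. category masks read back from the CPU and GPU segmenters).
function rowMismatchCounts(cpuMask, gpuMask, width, height) {
  if (cpuMask.length !== width * height || gpuMask.length !== width * height) {
    throw new RangeError("mask length must equal width * height");
  }
  const counts = new Uint32Array(height);
  for (let y = 0; y < height; y++) {
    const rowStart = y * width;
    for (let x = 0; x < width; x++) {
      if (cpuMask[rowStart + x] !== gpuMask[rowStart + x]) {
        counts[y]++;
      }
    }
  }
  return counts;
}

// Hypothetical helper: indices of rows whose mismatch count exceeds a threshold.
function rowsAbove(counts, threshold) {
  const rows = [];
  for (let y = 0; y < counts.length; y++) {
    if (counts[y] > threshold) rows.push(y);
  }
  return rows;
}
```

Printing `rowsAbove(counts, 0)` for a CPU/GPU pair lists the rows that differ at all, which makes a periodic pattern easy to spot without relying on a visual diff.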

My previous report simply got closed because I wasn't looking at it, but hopefully I have put enough effort into the report this time that the problem can be properly recognized.

Other info / Complete Logs

No response

Singulariteehee • May 21 '24 20:05

[Screenshot: A Pen by Singulariteehee, Google Chrome, 2024-05-21]

Here is the result of the CodePen when executed on my RTX 3090.

Singulariteehee • May 21 '24 20:05

I added an unshifted example diff to the CodePen: [image]

This result shows that shifting the returned mask down by one pixel makes the problem much less intense, but it is still full of artifacts on basically every line, with some lines being much worse than others.
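The shift-and-compare step described above can be sketched as follows. This is an illustration rather than the CodePen's actual code. It assumes the masks are row-major `Uint8Array` buffers, and the names `shiftMaskRows` and `countMismatches` are hypothetical.

```javascript
// Sketch only: shift a row-major mask vertically by dy rows (positive = down).
// Rows that would come from outside the mask are filled with `fill`.
// Assumption: `mask` is a Uint8Array with width * height entries.
function shiftMaskRows(mask, width, height, dy, fill = 0) {
  const out = new Uint8Array(width * height).fill(fill);
  for (let y = 0; y < height; y++) {
    const srcY = y - dy;
    if (srcY < 0 || srcY >= height) continue; // no source row, keep fill value
    out.set(mask.subarray(srcY * width, (srcY + 1) * width), y * width);
  }
  return out;
}

// Count the pixels at which two equally sized masks disagree.
function countMismatches(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("masks must have the same length");
  }
  let mismatches = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) mismatches++;
  }
  return mismatches;
}
```

Comparing `countMismatches(cpuMask, gpuMask)` against `countMismatches(cpuMask, shiftMaskRows(gpuMask, width, height, 1))` gives a rough measure of how much of the disagreement a one-row shift removes. Any mismatches that remain are not explained by the shift alone.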

Singulariteehee • May 22 '24 02:05

Hi @Singulariteehee,

This behavior might be occurring because Nvidia and AMD use different drivers—Nvidia uses CUDA while AMD uses OpenGL in the backend. As a result, this behavior is anticipated, and there is not much we can do to address this issue.

Thank you!!

kuaashish • May 30 '24 06:05

I can somewhat understand that there could be slight differences between implementations, but it doesn't make sense to me that an inference output offset by an entire texel could be an acceptable driver difference. If the inference output from an LLM layer were offset by one, it simply wouldn't work at all. In this case, because the output is visual, being off by one row still vaguely looks like functioning code. I have not seen this offset in pre-Tasks vision implementations of the landscape model.

I have created an additional CodePen for the regular version of the model (256x256): https://codepen.io/Singulariteehee/pen/qBGRMpq

In this version, the output is a pixel-perfect match between the CPU and GPU implementations, even on Nvidia cards. [image]

Singulariteehee • May 30 '24 15:05

Hi @tyrmullen,

Could you please add some pointers here? There is a pixel-matching issue between Intel and Nvidia graphics cards. Any additional information you can provide would be very helpful.

Thank you!!

kuaashish • Jun 28 '24 07:06