"Failed to link vertex and fragment shaders"
System information
- OS Platform and Distribution: Windows 10 Pro x64 21H2
- ~Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:~
- TensorFlow.js installed from: pnpm
- TensorFlow.js version: ^3.18.0
- ~CUDA/cuDNN version:~
Describe the problem
After following https://www.youtube.com/watch?v=7gOYpT732ow I'm getting the error:
Error: Failed to link vertex and fragment shaders.
FYI, I'm trying to load via TensorFlow.js: https://tfhub.dev/tensorflow/centernet/hourglass_512x512_kpts/1
Provide the exact sequence of commands / steps that you executed before running into the problem
Any other info / logs
https://github.com/avi12/lego-ai.js/blob/bb41aa57d238798dc80cbc8861676645cd0e3d19/src/App.svelte#L46-L54
centernet/hourglass_512x512_kpts is quite a large model, so unless you have a GPU with a lot of memory (you didn't specify your hardware), that is exactly what happens. yes, Failed to link vertex and fragment shaders is somewhat cryptic, but 99.9% of those errors are GPU out-of-memory scenarios.
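to sanity-check the out-of-memory theory, you can watch tfjs's own memory counters around the load - a minimal sketch, assuming the same tfhub model url as above:

```js
import * as tf from '@tensorflow/tfjs';

// Sketch: log tfjs memory counters before and after loading, to see whether
// the shader-link failure correlates with GPU memory pressure.
// `modelUrl` is assumed to be the tfhub centernet URL mentioned above.
async function loadWithMemoryLog(modelUrl) {
  await tf.setBackend('webgl');
  await tf.ready();
  console.log('before load:', tf.memory()); // on webgl this includes numBytesInGPU
  const model = await tf.loadGraphModel(modelUrl, { fromTFHub: true });
  console.log('after load:', tf.memory());
  return model;
}
```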
@vladmandic Interesting, thank you for the information. I have an Nvidia GTX 1080 Ti with 11 GB of VRAM.
@avi12 if you can create a codepen instance, it would help us to reproduce the problem.
btw, i got failures at 12gb and 16gb, success at 24gb. neither centernet nor the hourglass backbone is typically that bad, so i'd say there is something weird about this one. but then again, 512 is a pretty high resolution for this type of model, and memory requirements are near-exponential with resolution.
I see, then I'll try using a less memory-hungry model. Thanks for the info!
What is the closest model that uses a lower resolution which can be run on my GPU?
almost any, i'd say this is an exception
from object detection models published on tfhub and trained on coco dataset, i like efficientdet
and regarding efficientdet, imo there's no point going beyond the -d4 variation (d0 is the lightest, fastest, and least precise; d7 is the largest, slowest, and most precise)
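for illustration, a minimal loading sketch - the url is a placeholder (point it at whichever tfjs-converted efficientdet variant you actually use), and the 1024x1024 input size for d4 is an assumption, so check the variant's expected resolution:

```js
// Sketch only: MODEL_URL is hypothetical -- substitute a real tfjs-converted
// efficientdet-d4 graph model. The 1024x1024 input size is an assumption.
const MODEL_URL = 'https://example.com/efficientdet-d4/model.json';
const model = await tf.loadGraphModel(MODEL_URL);
const input = tf.zeros([1, 1024, 1024, 3], 'int32');
const result = await model.executeAsync(input);
```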
off-topic - personally, every single model i've seen so far has some blind spots, so i actually like combining results from efficientdet-d4 and centernet (based on the lighter resnet-50 backbone), since centernet has a very different architecture
and for near-realtime execution, i use centernet based on the mobilenet-v3 backbone (not on tfhub); i like it even more than the typically used yolo-v5 or the various nanodet models
I just tried running my own model that's based on CocoSSD, and I got:
Unsigned integer divide by zero
this is definitely not a memory issue, this is something else in the model itself
The model was originally designed to work on Google Colab; I simply used tensorflowjs_converter to attempt to run it in a PWA.
Is there a chance that I used tensorflowjs_converter incorrectly to convert the model?
The converter file: https://colab.research.google.com/drive/1L0_iDuJT21jjdlFZibabAC2VTaLn9Xtg?usp=sharing
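For reference, a typical invocation for converting a SavedModel export looks like this (paths are placeholders; this reflects the documented flags, so check them against your converter version):

```sh
tensorflowjs_converter \
  --input_format=tf_saved_model \
  --output_format=tfjs_graph_model \
  /path/to/saved_model \
  /path/to/web_model
```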
no issues executing your model using tfjs 3.20.0:
```js
tf.setBackend('webgl');
await tf.ready();
const model = await tf.loadGraphModel(modelUrl);
// random int32 input in the 0..255 range, matching the model's expected dtype
const input = tf.randomUniform([1, 256, 256, 3], 0, 255, 'int32');
const res = await model.executeAsync(input);
```
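as a follow-up to the snippet above - assuming res is the array of output tensors this model returns - reading the values and freeing GPU memory might look like:

```js
// Download the output values, then dispose everything to avoid leaking GPU memory.
const values = await Promise.all(res.map((t) => t.array()));
res.forEach((t) => t.dispose());
input.dispose();
```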
What would be the correct way to preprocess the image and then, after model.executeAsync(), extract the (x, y) coordinates?
As a reference image, look into lego-ai.js/public/sample.jpg
there is no one way, it depends on the model. your model has dynamic width/height, so any size would do. but even then there are probably some limitations - like a specific aspect ratio (square?) or power-of-two dimensions - who knows.
and then you need to know what the model expects as normalized input. since this model is int32 based, the expected pixel values are most likely in the 0..255 range. but if it were float32 based, it could be -1..1, 0..1, 0..255, etc. - you simply need to know.
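a sketch of those normalization variants side by side (which one is correct depends entirely on how the model was trained):

```js
const pixels = tf.browser.fromPixels(elImage); // int32 tensor, values 0..255

const asInt255 = pixels.expandDims(0);                                  // 0..255 int32
const zeroToOne = pixels.toFloat().div(255).expandDims(0);              // 0..1 float32
const minusOneToOne = pixels.toFloat().div(127.5).sub(1).expandDims(0); // -1..1 float32
```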
and post-processing is even worse. your model returns 8 tensors, some very simple and some huge. i have no clue what-is-what. but if it's object detection, most commonly you'd need to decode boxes and then run them through some non-maximum suppression function to discard most of them and keep only the valuable ones.
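for the non-maximum suppression step, tfjs ships a built-in - a sketch, assuming boxes [numBoxes, 4] and scores [numBoxes] have already been decoded from the raw outputs:

```js
// boxes: Tensor2D of [y1, x1, y2, x2] rows; scores: Tensor1D of confidences.
// Both are assumed to be decoded from the model outputs already.
const keepIndices = await tf.image.nonMaxSuppressionAsync(
  boxes,
  scores,
  20,   // max detections to keep
  0.5,  // IoU threshold
  0.3,  // score threshold
);
const keptBoxes = tf.gather(boxes, keepIndices); // surviving boxes
```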
it gets even worse if the model is strided (which is quite common), since then you need to build anchors for each stride and then reconstruct the boxes.
all-in-all, just executing a model with unknown inputs and unknown outputs is not very useful.
example of pre & post processing for object detection with a non-strided model: https://github.com/vladmandic/mb3-centernet
example of pre & post processing for object detection with a strided model: https://github.com/vladmandic/nanodet
FYI, this is the Python code that interacts with the model
@vladmandic After inspecting the non-strided model's code, I came up with this code:

```js
const buffer = tf.browser.fromPixels(elImage);
const [width, height] = buffer.shape;
const resize = tf.image.resizeBilinear(buffer, [width, height]);
const cast = resize.cast("float32");
const expand = cast.expandDims(0);
const tensor = expand;
const tensorImage = {
  tensor,
  inputShape: [buffer.shape[1], buffer.shape[0]],
  outputShape: tensor.shape,
  size: buffer.size,
};
```
Except, it throws the following:
Error: The dtype of dict['input_tensor'] provided in model.execute(dict) must be int32, but was float32
Maybe try:
resize.cast('int32')
instead of the current cast to "float32"?
I'm confused by your tensorImage object. I suppose your model may need that, but typically you just pass a Tensor4D stack of images, not a newly created object. IOW, you probably should pass expand directly to the model. The tensorImage object looks suspiciously like a partial Tensor anyway.
IIRC, tf.browser.fromPixels() returns an int32 tensor with values in [0, 255], and it sounds like your model wants int32, which would also be [0, 255]. Just as an aside, if you go to float32 (which doesn't sound like what you need here), you'd probably also convert the [0, 255] range to [0, 1] with a tf.div(255).
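Putting that advice together, a corrected preprocessing sketch might look like this (note that fromPixels yields [height, width, channels], so height comes first when destructuring the shape):

```js
const buffer = tf.browser.fromPixels(elImage);      // [height, width, 3], int32, 0..255
const [height, width] = buffer.shape;               // height first!
const resized = tf.image.resizeBilinear(buffer, [height, width]); // resize returns float32
const input = resized.cast('int32').expandDims(0);  // back to int32, add batch dim
const outputs = await model.executeAsync(input);
```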
@danwexler
> I'm confused by your tensorImage object
This is based on vladmandic/mb3-centernet, as I have no experience with TensorFlow.js
Ah, that's just an internal object Vlad is using; it has nothing to do with TFJS. Using a similar object in your code is up to you, but it shouldn't be passed to the prediction function. Regardless, you seem to be giving your model a float32 tensor when it wants an int32, so the cast('int32') recommendation I made may help. As Vlad says, you need to understand the models you are working with and what they expect as input in order to use them. Traditionally, most TFJS models expect a Tensor4D input containing a stack of images (or a single image) as float32 in the range [0, 1]. However, your model seems to want int32 according to your errors, which means all bets are off!
> you'd probably also convert the [0,255] range to [0,1] with a tf.div(255).

Interesting, by adding

```diff
- const buffer = tf.browser.fromPixels(elImage);
+ const buffer = tf.browser.fromPixels(elImage).div(255);
```

and having at the end

```js
const outputs = await model.executeAsync(tensorImage.tensor);
for (const prediction of outputs) {
  prediction.print();
}
```
I got the outputs printed. Now the next question is how I'm supposed to extract the (x, y) coordinates.
@danwexler What do I do next?
@vladmandic
> example of pre & post processing for object detection with a non-strided model: https://github.com/vladmandic/mb3-centernet
> example of pre & post processing for object detection with a strided model: https://github.com/vladmandic/nanodet
I'm not familiar with the concept of "stride". Do you have a good resource, preferably a video, that I can learn from?
you have a model that returns 8 tensors with unknown structures - unless you know what they are, there is no way to know how to process them. purely based on their sizes, i'd guess that at least some are heatmaps, and that is a story of its own - a lot of processing needs to happen to make any sense of the outputs. and going through a tfjs course here is far beyond the scope of the reported issue.
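if one of the outputs does turn out to be a [1, h, w, numClasses] heatmap (a guess - verify against the python code), peak extraction could be sketched with tf.topk; which output index actually holds the heatmap is unknown here:

```js
const heatmap = outputs[0];                    // hypothetical: pick the real heatmap output
const [, h, w, c] = heatmap.shape;
const flat = heatmap.reshape([h * w * c]);
const { values, indices } = tf.topk(flat, 10); // 10 strongest peaks
const idx = await indices.array();
const peaks = idx.map((i) => ({
  y: Math.floor(i / (w * c)),  // grid row
  x: Math.floor(i / c) % w,    // grid column
  classId: i % c,              // class channel
}));
```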
I think I can get the required info regarding the inputs, based on the Python model. How can I keep in touch with you so I can share Drive files?