Sam_S

Results 38 comments of Sam_S

Yes, I think so. I trained on the raw z location values (without standardizing them), and the prediction works fine. As far as I know, you don't need...

Standardize the bounding box width and height to zero mean and unit standard deviation.
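
A minimal sketch of that standardization, assuming boxes are stored as `(x, y, w, h)` tuples and that the statistics are computed over the whole training set. The helper names `standardize` and `standardize_box_sizes` are made up for illustration and do not come from any repository discussed here.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Shift and scale values to zero mean and unit (population) std."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # All values identical: nothing to scale, so return zeros.
        return [0.0 for _ in values], mean, std
    return [(v - mean) / std for v in values], mean, std


def standardize_box_sizes(boxes):
    """Standardize only width and height; x and y are left untouched.

    Assumes each box is (x, y, w, h); this layout is an assumption.
    """
    widths, _, _ = standardize([b[2] for b in boxes])
    heights, _, _ = standardize([b[3] for b in boxes])
    return [(b[0], b[1], w, h) for b, w, h in zip(boxes, widths, heights)]
```

Keep the mean and standard deviation returned by `standardize` if you need to map predictions back to pixel units later.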

@WeiquanWa, did you manage to get some semblance of bounding boxes or the cross-attention heatmap from the outputs? I cannot interpret the structure of the output attention maps from...

@gwkrsrch, could you give us the code that was used to generate the heatmap visualization in Figure 8 of the DONUT paper?

@developfeng, is the code for the paper "1000fps human segmentation with deep convolutional neural networks" available somewhere?

Looks like the `val/validation` dir name issue is fixed with the current main-branch version of the `datasets` repository. > @polinaeterna @lhoestq Perhaps one way to fix this would be to...

![Screen Shot 2022-09-09 at 10 55 53 AM](https://user-images.githubusercontent.com/13418507/189290053-4a87602b-8bf0-4d97-b08b-df0bad429977.png) So, I've found a way to generate the heatmaps from the cross attentions from the decoder. However, the attention maps correspond to...

It should work for docvqa as well. I've added an example in the notebook I shared above based on the code snippet below. The attention is focused on the answer...

Refer to the **Document VQA Example** section from this notebook. You have to use a resized shape of `[4, 16, 80, 60]` for docvqa task since the final cross-attention feature...
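
To illustrate what a shape like `[4, 16, 80, 60]` implies, here is a rough sketch that is not the notebook's code. It assumes the four dimensions are (decoder layers, attention heads, height, width), that the cross-attention weights for one decoded token arrive flattened to `80 * 60 = 4800` values per head in row-major order, and that averaging over layers and heads is an acceptable way to get a single heatmap. The function name `attention_to_heatmap` is hypothetical.

```python
def attention_to_heatmap(attn, height=80, width=60):
    """Average per-head attention and reshape it to a 2D grid.

    attn: nested list [layers][heads][height * width] for one token.
    The layout and the averaging step are assumptions for illustration.
    """
    size = height * width
    total = [0.0] * size
    n_maps = 0
    for layer in attn:
        for head in layer:
            if len(head) != size:
                raise ValueError(f"expected {size} weights, got {len(head)}")
            for i, weight in enumerate(head):
                total[i] += weight
            n_maps += 1
    mean = [v / n_maps for v in total]
    # Row-major reshape: each row of the grid holds `width` values.
    return [mean[r * width:(r + 1) * width] for r in range(height)]
```

The resulting 80 x 60 grid can then be upsampled to the input image size and overlaid on the document.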