Sam_S comments

Results: 38 comments of Sam_S

question about the distance information in KITTI dataset

Yes, I think so, I did the training on the raw z location values (without standardizing them) and the prediction works fine. As far as I know, you don't need...

question about the distance information in KITTI dataset

Standardize the bounding box width and height to zero mean and unit standard deviation.
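For anyone landing here, a minimal sketch of what that standardization could look like, using only the standard library. The `standardize` helper and the sample box sizes are hypothetical, made up for illustration, and not taken from the thread or any KITTI tooling.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical box sizes in pixels (illustrative values only).
widths = [40.0, 60.0, 80.0, 100.0]
heights = [30.0, 30.0, 50.0, 50.0]

norm_widths = standardize(widths)
norm_heights = standardize(heights)
```

In practice you would probably compute the mean and standard deviation on the training split only and reuse those same two numbers at inference time, but that is a general convention rather than something stated in this thread.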

How to get the bounding boxes of the extracted entities?

@WeiquanWa, did you manage to get some semblance of bounding boxes or the cross-attention heatmap from the outputs? I cannot interpret the structure of the output attention maps from...

How to get the bounding boxes of the extracted entities?

@gwkrsrch, could you give us the code that was used to generate the heatmap visualization in Figure 8 of the DONUT paper?

How to get the bounding boxes of the extracted entities?

I've found updates at #45

How do you get a foreground image without background

@developfeng, is the code for the paper "1000fps human segmentation with deep convolutional neural networks" available somewhere?

load_dataset method returns Unknown split "validation" even if this dir exists

Looks like the `val/validation` dir name issue is fixed in the current main-branch version of the `datasets` repository.

> @polinaeterna @lhoestq Perhaps one way to fix this would be to...

Add bounding boxes coordinates in predictions

![Screen Shot 2022-09-09 at 10 55 53 AM](https://user-images.githubusercontent.com/13418507/189290053-4a87602b-8bf0-4d97-b08b-df0bad429977.png)

So, I've found a way to generate the heatmaps from the cross attentions from the decoder. However, the attention maps correspond to...

Add bounding boxes coordinates in predictions

It should work for docvqa as well. I've added an example in the notebook I shared above based on the code snippet below. The attention is focused on the answer...

Add bounding boxes coordinates in predictions

Refer to the **Document VQA Example** section from this notebook. You have to use a resized shape of `[4, 16, 80, 60]` for docvqa task since the final cross-attention feature...
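To make the `[4, 16, 80, 60]` shape a bit more concrete, here is a rough sketch of turning one flattened attention map into a grid and reading off a coarse box. It assumes the last two dimensions are a row-major 80 x 60 (height x width) spatial grid; that layout, the helper names `reshape_to_grid` and `attention_bbox`, and the thresholding step are all illustrative assumptions on my part, not the notebook's actual code.

```python
def reshape_to_grid(flat, height=80, width=60):
    """Split a flat list of height*width weights into rows (row-major order assumed)."""
    if len(flat) != height * width:
        raise ValueError(f"expected {height * width} values, got {len(flat)}")
    return [flat[r * width:(r + 1) * width] for r in range(height)]


def attention_bbox(grid, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) of cells at or above rel_threshold * peak."""
    peak = max(max(row) for row in grid)
    cutoff = peak * rel_threshold
    rows = [r for r, row in enumerate(grid) if any(v >= cutoff for v in row)]
    cols = [c for row in grid for c, v in enumerate(row) if v >= cutoff]
    return (min(cols), min(rows), max(cols), max(rows))


# Synthetic attention map: zero everywhere except a small hot region.
flat = [0.0] * (80 * 60)
for r in (10, 11):
    for c in (5, 6, 7):
        flat[r * 60 + c] = 1.0

grid = reshape_to_grid(flat)
bbox = attention_bbox(grid)
```

The box is in grid cells, so it would still need to be scaled up to the input image resolution. With real model outputs you would also have to pick or average over the first two dimensions before this step, and how best to do that is not something I can confirm from this thread.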