Colab Notebook TSR: functional analysis and obtain final dataframe

Open emigomez opened this issue 2 years ago • 2 comments

Hi!

I have been working with the TD and TSR notebooks at https://github.com/NielsRogge/Transformers-Tutorials/tree/master/Table%20Transformer, and they work properly for me. However, the last step of the TSR pipeline, obtaining a data frame, is not implemented in these notebooks (I think this process is called functional analysis in this repo). The TSR postprocessing steps go from the recognized structure to grid cells.
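To make the "structure to grid cells" step concrete, here is a minimal sketch, not the repo's actual postprocessing. It assumes the TSR model has already produced bounding boxes for 'table row' and 'table column' objects in `(x0, y0, x1, y1)` form; the function name `build_grid` and the sample boxes are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def build_grid(rows: List[Box], cols: List[Box]) -> List[List[Box]]:
    """Intersect row and column boxes to get one box per grid cell."""
    # Sort rows top to bottom and columns left to right.
    rows = sorted(rows, key=lambda b: b[1])
    cols = sorted(cols, key=lambda b: b[0])
    # A cell takes its horizontal extent from the column
    # and its vertical extent from the row.
    return [[(c[0], r[1], c[2], r[3]) for c in cols] for r in rows]


# Hypothetical detections, deliberately given out of order.
rows = [(0, 20, 100, 40), (0, 0, 100, 20)]
cols = [(50, 0, 100, 40), (0, 0, 50, 40)]
grid = build_grid(rows, cols)
```

Each entry `grid[i][j]` is then the region to crop or OCR for row `i`, column `j`. Spanning cells and headers are not handled in this sketch.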

Has anyone managed to obtain a correct final data frame for TSR in Colab, taking into account spanning cells and titles?

Regards

emigomez avatar Dec 09 '22 12:12 emigomez

Check this space on Hugging Face, where you can find a clean implementation of the steps you were missing. The main part is in app.py.

JaMe76 avatar Dec 11 '22 19:12 JaMe76

Thank you for your response @JaMe76 !!

I have worked with this postprocessing script before, but I think some parts are missing. From the results I obtained using the notebooks and the functions in app.py, I believe the final data frame produced by app.py doesn't take into account TSR objects labeled 'spanning cell'. I show one example below.

TSR results: (five images)

app.py postprocessing result: (one image)

As you can see in this example, since the postprocessing does not take the 'spanning cells' into account, the result is wrong.
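For illustration, one possible way to account for a 'spanning cell' detection is sketched below. This is an assumption about how such a step could look, not what app.py or the table-transformer repo does. It assumes a grid of cell boxes in `(x0, y0, x1, y1)` form and a table of already extracted cell texts; the helper names, the 0.5 overlap threshold, and the sample data are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def overlap_fraction(cell: Box, span: Box) -> float:
    """Fraction of the cell's area that lies inside the spanning box."""
    w = min(cell[2], span[2]) - max(cell[0], span[0])
    h = min(cell[3], span[3]) - max(cell[1], span[1])
    if w <= 0 or h <= 0:
        return 0.0
    area = (cell[2] - cell[0]) * (cell[3] - cell[1])
    return (w * h) / area if area > 0 else 0.0


def covered_cells(grid: List[List[Box]], span: Box,
                  threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Grid positions mostly covered by the spanning box."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, cell in enumerate(row)
            if overlap_fraction(cell, span) >= threshold]


def merge_span(table: List[List[str]], grid: List[List[Box]],
               span: Box, threshold: float = 0.5) -> List[List[str]]:
    """Join the text of all covered cells and write it into each of them."""
    cells = covered_cells(grid, span, threshold)
    merged = " ".join(table[r][c] for r, c in cells if table[r][c])
    for r, c in cells:
        table[r][c] = merged
    return table


# Hypothetical 2x2 grid whose first row is one spanning header cell.
grid = [[(0, 0, 50, 20), (50, 0, 100, 20)],
        [(0, 20, 50, 40), (50, 20, 100, 40)]]
table = [["Results", ""], ["a", "b"]]
span = (2, 1, 98, 19)  # assumed 'spanning cell' detection
cells = covered_cells(grid, span)
table = merge_span(table, grid, span)
```

Repeating the merged text in every covered cell keeps the table rectangular, so it can still be passed to something like `pandas.DataFrame(table)`. Whether to repeat the text or leave the other covered cells empty is a design choice that depends on how the data frame is used later.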

Please let me know if I'm doing something wrong with this app.py postprocessing, or whether you see the same problem.

emigomez avatar Dec 11 '22 19:12 emigomez