CoVA-Web-Object-Detection
Parsing HTML
I have trained the model on my own and now would like to use it for inference.
I am wondering how you parsed the HTML to get all the relevant nodes of the DOM tree. Below is my implementation, based on what I assume you used, but the model's predictions are far off on new data.
import os

import pandas as pd
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.common.by import By

# `driver` is a Selenium WebDriver and `urls` is a list of page URLs, both created beforehand
for c, url in enumerate(urls):
    driver.get(url)
    driver.save_screenshot(os.path.join("test_data", "imgs", f"{c}.png"))
    locations = []
    # collect every element that has an id attribute
    ids = driver.find_elements(By.XPATH, '//*[@id]')
    for ii in ids:
        try:
            if ii.is_displayed():
                location_dic = {}
                location_dic.update(ii.location)  # adds the keys x, y
                location_dic.update(ii.size)  # adds the keys height, width
                # check if bounding box in screenshot
                if all(i < 1280 for i in location_dic.values()):
                    locations.append(location_dic)
        except StaleElementReferenceException:
            # skip elements that left the DOM after they were found
            continue
    # save bounding boxes in csv
    bbox_df = pd.DataFrame(locations).astype(float)
    print(len(bbox_df))
    bbox_df.to_csv(os.path.join("test_data", "bboxes", f"{c}.csv"), sep=",", index=False)
What does this CSV look like? Have you checked that the labels are correct for this dataset? You can plot your boxes on the new images first to check that the ground truth is accurate.
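One way to do the visual check suggested above is to draw each CSV row onto its screenshot. The following is a minimal sketch, not code from the CoVA repository: it assumes Pillow is installed and that the CSV has the columns `x`, `y`, `height`, and `width` written by the snippet in the question. The helper names `load_boxes` and `draw_boxes` are hypothetical.

```python
import csv

from PIL import Image, ImageDraw


def load_boxes(csv_path):
    """Read boxes from a CSV with the columns x, y, height, width (assumed layout)."""
    with open(csv_path, newline="") as f:
        return [{k: float(v) for k, v in row.items()} for row in csv.DictReader(f)]


def draw_boxes(image, boxes, color="red"):
    """Return a copy of `image` with every box outlined in `color`."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    for box in boxes:
        # skip boxes that cannot be drawn as a rectangle
        if box["width"] <= 0 or box["height"] <= 0:
            continue
        x0, y0 = box["x"], box["y"]
        x1, y1 = x0 + box["width"], y0 + box["height"]
        draw.rectangle([x0, y0, x1, y1], outline=color, width=1)
    return annotated
```

For instance, `draw_boxes(Image.open("test_data/imgs/0.png"), load_boxes("test_data/bboxes/0.csv")).save("0_boxes.png")` would write an annotated copy of the first screenshot. If the outlines do not line up with the visible elements, the extraction step is the likely culprit rather than the model.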
Sorry for the late reply. I trained the model with the data you published online; I have not annotated my own data. I would like to use the trained model on my own unseen data. With the code above, I am trying to get the image and bounding boxes for new data.
I am using this page as an example:
https://www.ebay.com/itm/175033829853?var=474192111960&_trkparms=%26rpp_cid%3D625734a9ee2e8709e30c81a0%26rpp_icid%3D625734a9ee2e8709e30c819f&_trkparms=pageci%3Aa6596fb6-770e-11ed-b4ba-ae5aa28bb675%7Cparentrq%3Af2653e261840ab9708d5eb4dfffee96c%7Ciid%3A1
The extracted bounding boxes look like this (values in pixels):
| x | y | height | width |
|---|---|---|---|
| 0.0 | 0.0 | 31.0 | 1259.0 |
| 0.0 | 0.0 | 0.0 | 1259.0 |
| 5.0 | 0.0 | 126.0 | 1249.0 |
| 5.0 | 0.0 | 97.0 | 1249.0 |
| 5.0 | 44.0 | 48.0 | 117.0 |
| 5.0 | -3.0 | 200.0 | 250.0 |
| 147.0 | 46.0 | 43.0 | 90.0 |
| 147.0 | 44.0 | 43.0 | 90.0 |
| 215.0 | 64.0 | 7.0 | 13.0 |
| 244.0 | 47.0 | 42.0 | 1007.0 |
| 244.0 | 47.0 | 42.0 | 599.0 |
| 247.0 | 49.0 | 38.0 | 590.0 |
| 843.0 | 47.0 | 42.0 | 160.0 |
| 841.0 | 47.0 | 42.0 | 162.0 |
| 841.0 | 49.0 | 38.0 | 160.0 |
| 1008.0 | 47.0 | 42.0 | 168.0 |
| 1176.0 | 47.0 | 42.0 | 75.0 |
| 1176.0 | 48.0 | 40.0 | 75.0 |
| 5.0 | 0.0 | 31.0 | 1249.0 |
| 5.0 | 2.0 | 25.0 | 1249.0 |
| 5.0 | 2.0 | 25.0 | 139.0 |
| -5.0 | 2.0 | 25.0 | 139.0 |
| 67.0 | 7.0 | 16.0 | 61.0 |
| 144.0 | 9.0 | 12.0 | 85.0 |
| 228.0 | 9.0 | 12.0 | 94.0 |
| 322.0 | 9.0 | 12.0 | 111.0 |
| 925.0 | 1.0 | 28.0 | 329.0 |
| 924.0 | 6.0 | 16.0 | 46.0 |
| 969.0 | 3.0 | 24.0 | 88.0 |
| 1062.0 | 3.0 | 24.0 | 82.0 |
| 1150.0 | 1.0 | 28.0 | 56.0 |
| 1168.0 | 5.0 | 22.0 | 18.0 |
| 1206.0 | 1.0 | 28.0 | 45.0 |
| 5.0 | 97.0 | 0.0 | 1249.0 |
| 5.0 | 102.0 | 24.0 | 1249.0 |
| 251.0 | 102.0 | 1.0 | 1.0 |
| 1086.0 | 102.0 | 16.0 | 38.0 |
| 5.0 | 146.0 | 0.0 | 1249.0 |
| 5.0 | 146.0 | 1194.0 | 1247.0 |
| 5.0 | 146.0 | 0.0 | 1247.0 |
| 5.0 | 146.0 | 269.0 | 1247.0 |
| 14.0 | 210.0 | 155.0 | 1237.0 |
| 30.0 | 210.0 | 155.0 | 1205.0 |
| 5.0 | 435.0 | 905.0 | 1247.0 |
| 5.0 | 435.0 | 508.0 | 481.0 |
| 5.0 | 435.0 | 422.0 | 61.0 |
| 66.0 | 435.0 | 402.0 | 402.0 |
| 205.0 | 788.0 | 31.0 | 124.0 |
| 952.0 | 435.0 | 456.0 | 300.0 |
| 968.0 | 451.0 | 18.0 | 159.0 |
| 964.0 | 689.0 | 20.0 | 139.0 |
| 469.0 | 435.0 | 905.0 | 483.0 |
| 486.0 | 492.0 | 848.0 | 451.0 |
| 594.0 | 537.0 | 21.0 | 290.0 |
| 0.0 | 0.0 | 0.0 | 0.0 |
| 594.0 | 571.0 | 21.0 | 290.0 |
| 594.0 | 605.0 | 21.0 | 290.0 |
| 594.0 | 639.0 | 33.0 | 44.0 |
| 647.0 | 638.0 | 17.0 | 163.0 |
| 596.0 | 708.0 | 19.0 | 92.0 |
| 596.0 | 791.0 | 40.0 | 260.0 |
| 596.0 | 887.0 | 40.0 | 260.0 |
| 486.0 | 943.0 | 47.0 | 451.0 |
| 486.0 | 1000.0 | 340.0 | 451.0 |
| 900.0 | 1013.0 | 20.0 | 20.0 |
| 869.0 | 1074.0 | 20.0 | 20.0 |
| 874.0 | 1120.0 | 20.0 | 20.0 |
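The table contains rows with zero height or width, rows with negative coordinates, and rows that extend past 1280 pixels once the size is added to the origin, because the `< 1280` check in the question tests each value on its own. A minimal sketch of a stricter filter follows. It is an assumption about what a sensible cleanup looks like, not the preprocessing used by the CoVA authors; the helper names and the viewport size passed in are hypothetical.

```python
def box_inside_viewport(box, viewport_width, viewport_height):
    """Return True if the whole box lies inside the viewport."""
    return (
        box["x"] >= 0
        and box["y"] >= 0
        and box["x"] + box["width"] <= viewport_width
        and box["y"] + box["height"] <= viewport_height
    )


def clean_boxes(boxes, viewport_width, viewport_height):
    """Drop empty, out-of-viewport, and duplicate boxes, keeping the input order."""
    seen = set()
    kept = []
    for box in boxes:
        key = (box["x"], box["y"], box["width"], box["height"])
        if box["width"] <= 0 or box["height"] <= 0:
            continue  # zero-area element, nothing to classify
        if not box_inside_viewport(box, viewport_width, viewport_height):
            continue  # partly or fully outside the captured area
        if key in seen:
            continue  # same rectangle reported by a nested element
        seen.add(key)
        kept.append(box)
    return kept
```

The viewport size should match the dimensions of the saved screenshot. Whether dropping partially visible boxes is the right choice depends on how the training data was built, which this thread does not state.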

The model seems to work well on the data it was trained on. Since eBay is also in the training set, I suspect the problem lies in how I extract the bounding boxes from new webpages. Here is an example of the model's output on data from the training set:

@kevalmorabia97 Could you post the code you used to generate the bounding boxes?