Screenshot-to-code

creating dataset

Open Kotresh17 opened this issue 6 years ago • 4 comments

Hi, many thanks for sharing the data and code. How can we take this forward and generate more data beyond the synthesised set? Can we create the same kind of dataset from real-world HTML pages? If so, how can we generate the .gui files for them? If you have any resources or thoughts, please share them with us.

Kotresh17 • May 20 '19 15:05

Hi Kotresh,

For the Bootstrap version, you could write a script that takes screenshots of existing Bootstrap website templates and build a DSL vocabulary based on them. It should be pretty straightforward, with the structure following the pix2code datasets and DSL.

So for example a website that looks like this: https://imgur.com/a/IF3NxTV

Would have a .gui that looks something like below:

header{
    navigation-top{
        logo,
        menu-right{
            menu-link-active,
            menu-link,
            menu-link,
            menu-link
        }
    }
}
main-heading,
row{
    col-3{
        link{
            image
        }
    }
    col-3{
        link{
            image
        }
    }
    col-3{
        link{
            image
        }
    }
}
footer{
    row-centered{
        text
    }
}
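To make the "build a DSL vocabulary" step concrete, here is a minimal Python sketch that tokenizes .gui text and assigns an integer id to each distinct token. The helper names (`tokenize_gui`, `build_vocabulary`, `braces_balanced`) are hypothetical, and the token rules (element names plus `{`, `}` and `,`) are an assumption based on the sample above, not the project's actual preprocessing.

```python
import re
from collections import Counter

# Assumption: a token is either one structural character ({ } ,) or a run of
# characters that contains no whitespace and none of those three characters.
TOKEN_PATTERN = re.compile(r"[{},]|[^\s{},]+")


def tokenize_gui(text):
    """Split .gui markup into element names and structural tokens."""
    return TOKEN_PATTERN.findall(text)


def build_vocabulary(gui_texts):
    """Map each distinct token to an integer id, most frequent token first."""
    counts = Counter(tok for text in gui_texts for tok in tokenize_gui(text))
    # Ties are broken alphabetically so the ids are deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: idx for idx, tok in enumerate(ordered)}


def braces_balanced(tokens):
    """Return True if every '{' has a matching '}' in the right order."""
    depth = 0
    for tok in tokens:
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


sample = """
header{
    navigation-top{
        logo,
        menu-right{
            menu-link-active,
            menu-link
        }
    }
}
footer{
    row-centered{
        text
    }
}
"""

tokens = tokenize_gui(sample)
vocab = build_vocabulary([sample])
```

Running `braces_balanced` over every hand-written .gui file before training is a cheap way to catch a missing closing brace.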

For the HTML version, quoting the issue from Emil:

https://github.com/emilwallner/Screenshot-to-code/issues/20

“As mentioned in the article, the HTML version does not generalize on new images. The Bootstrap version generalizes on new images but with a capped vocabulary. The evaluation images for the bootstrap version are under /data/eval/ . You can test it here: floydhub/Bootstrap/test_model_accuracy.ipynb

If you want to train it to generalize on a more advanced vocabulary, I'd recommend customizing it to work on the HTML set provided here: https://github.com/harvardnlp/im2markup (on floydhub: --data emilwallner/datasets/100k-html:data)

After that, I'd recommend creating a new dataset. Create a script that generates random websites, say starting with newsletters or blog layouts. Then you can add optical character recognition, fonts, colors and div sizes as you go.

If you build a version for the harvardnlp dataset or a script that generates websites, please make a pull request.”
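As a rough starting point for the "script that generates random websites" idea, the sketch below generates random page layouts as .gui-style text. It is only an illustration under several assumptions: the function names and the token list are hypothetical (borrowed from the sample earlier in this thread), the column counts assume Bootstrap's 12-unit grid, and rendering the markup to HTML and taking the screenshot are left out entirely.

```python
import random

# Hypothetical leaf tokens for illustration, taken from the sample .gui above.
LEAF_TOKENS = ["image", "text", "main-heading"]
# Column class -> number of columns that fill a 12-unit Bootstrap row.
COLUMN_LAYOUTS = {"col-3": 4, "col-4": 3, "col-6": 2}


def random_row(rng):
    """Build one row block whose columns each hold a random leaf token."""
    col_name, count = rng.choice(sorted(COLUMN_LAYOUTS.items()))
    cols = []
    for _ in range(count):
        leaf = rng.choice(LEAF_TOKENS)
        cols.append(f"    {col_name}{{\n        {leaf}\n    }}")
    return "row{\n" + "\n".join(cols) + "\n}"


def random_page(seed, max_rows=3):
    """Return one random page as .gui-style text (deterministic per seed)."""
    rng = random.Random(seed)
    links = ["menu-link-active"] + ["menu-link"] * rng.randint(1, 3)
    header = ["header{", "    navigation-top{", "        logo,", "        menu-right{"]
    header += [f"            {name}," for name in links[:-1]]
    header += [f"            {links[-1]}", "        }", "    }", "}"]
    rows = [random_row(rng) for _ in range(rng.randint(1, max_rows))]
    footer = "footer{\n    row-centered{\n        text\n    }\n}"
    return "\n".join(["\n".join(header), *rows, footer])


page = random_page(seed=7)
```

Seeding the generator per page keeps the dataset reproducible: the same seed always yields the same layout, so a screenshot can be regenerated later from its seed alone.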

PaulGwamanda • May 27 '19 08:05

Hi, thanks for sharing the data and code. Can you please explain how to create .npz files and the corresponding .gui files for our custom images? For example, I have attached an image of a basic form. How would I convert this image to .npz and .gui form to train the model, so that I can get the HTML code for similar images? Any thoughts would be really helpful for us to proceed. (attached image: screentocode)

yuvarajvc • Sep 05 '19 18:09

Hi, I'm pretty late to this, but I was just wondering: what is a .gui file, and how do you open it? Thank you.

salmanahmad10 • Nov 04 '20 20:11

@yuvarajvc: You can convert an image to a compressed .npz file using my script here: https://gist.github.com/PaulGwamanda/f91ce9fc9d392c4bcc99c085fd726a34
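The linked gist is not reproduced here. As a rough idea of what such a conversion involves, this Python sketch scales an image array to the 0 to 1 range and stores it in a compressed .npz archive with NumPy. The `features` key and the 256x256 size follow the pix2code convention as I understand it, so treat both as assumptions; the function names are hypothetical, and reading a real image file (for example with Pillow or OpenCV) is omitted, with a synthetic array standing in for it.

```python
import io

import numpy as np


def image_array_to_npz_bytes(pixels):
    """Scale 8-bit pixels to [0, 1] and store them under the key 'features'."""
    # Assumption: the training code expects float features under this key.
    features = np.asarray(pixels, dtype=np.float32) / 255.0
    buffer = io.BytesIO()
    np.savez_compressed(buffer, features=features)
    return buffer.getvalue()


def load_features(npz_bytes):
    """Read the 'features' array back from .npz bytes."""
    with np.load(io.BytesIO(npz_bytes)) as archive:
        return archive["features"]


# Synthetic all-white 256x256 RGB image standing in for a real screenshot.
fake_image = np.full((256, 256, 3), 255, dtype=np.uint8)
npz_bytes = image_array_to_npz_bytes(fake_image)
restored = load_features(npz_bytes)
```

To write to disk instead of memory, pass a file path as the first argument to `np.savez_compressed`.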

@salmanahmad10: Any code editor can view and edit a .gui file.

The .gui extension is a naming convention from the original paper (pix2code) and has no special significance. The project uses the .gui file to hold the token sequence for its paired image, which has the same name.

i.e. image1.png (or image1.npz when compressed) should have a corresponding file called image1.gui, which contains the tokens describing the image.
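The same-name pairing rule can be checked with a few lines of Python. This is a minimal sketch, not part of the project: `pair_samples` is a hypothetical helper that assumes the images and .gui files sit together in one flat directory.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".npz"}


def pair_samples(dataset_dir):
    """Return (paired, orphans): images with and without a same-named .gui file."""
    dataset_dir = Path(dataset_dir)
    # If both image1.npz and image1.png exist, the later name in sorted
    # order (the .png) is the one kept for that stem.
    images = {
        path.stem: path
        for path in sorted(dataset_dir.iterdir())
        if path.suffix in IMAGE_SUFFIXES
    }
    paired, orphans = [], []
    for stem, image_path in sorted(images.items()):
        gui_path = dataset_dir / f"{stem}.gui"
        if gui_path.is_file():
            paired.append((image_path, gui_path))
        else:
            orphans.append(image_path)
    return paired, orphans


# Demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["image1.png", "image1.gui", "image2.npz", "image2.gui", "image3.png"]:
        (root / name).touch()
    paired, orphans = pair_samples(root)
    paired_names = [(img.name, gui.name) for img, gui in paired]
    orphan_names = [path.name for path in orphans]
```

Any image that ends up in `orphans` has no token sequence to train against and would need a .gui file written for it first.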

PS I'm pushing my dev toolkit here which includes 100 *samples and will be happy to sell my whole dataset. Email me at [email protected]

PaulGwamanda • Jan 05 '21 17:01