Screen2Vec
Do you support using a screen image as the `screen` parameter?
Hello @tobyli, we're trying to use the pre-trained model to get the vectors, following your instructions under Quick Start. Regarding

> `-s`/`--screen`, the path to the screen to encode

can you explain more about what this `screen` parameter should be? It looks like it should be a JSON file that contains the UI layout. If so, do you support using screenshot images directly? Thank you very much!
Thanks for the question, Yixue! That option takes the JSON hierarchical representation of a screen in the format of screens in the RICO dataset (https://interactionmining.org/rico). We don't support screenshot images as our model doesn't really use (pixel-based) visual information from the screens.
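For anyone landing here who hasn't seen the RICO data: below is a minimal sketch of what such a screen JSON might look like. The field names (`activity_name`, `class`, `text`, `bounds`, `children`) are assumptions based on RICO's view-hierarchy files, so check an actual file from the dataset for the exact schema Screen2Vec expects.

```python
import json

# Minimal RICO-style screen hierarchy (field names assumed from the
# RICO view-hierarchy JSONs -- verify against a real dataset file).
screen = {
    "activity_name": "com.example.app/com.example.app.MainActivity",
    "activity": {
        "root": {
            "class": "android.widget.FrameLayout",
            "bounds": [0, 0, 1440, 2560],
            "children": [
                {
                    "class": "android.widget.TextView",
                    "text": "Sign in",
                    "bounds": [100, 200, 1340, 320],
                    "children": [],
                },
            ],
        }
    },
}

# Write it out so the file path can be passed via -s/--screen.
with open("screen.json", "w") as f:
    json.dump(screen, f, indent=2)
```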
Got it! Thanks for confirming this, @tobyli. :) BTW, have you ever tried using a UI hierarchy reverse-engineered from screenshots (e.g., using REMAUI or UIED)? Sometimes the UI hierarchy code isn't available (e.g., during the mock-up phase). Wondering if you have any insights on how well that might work (i.e., how "good" the output vectors would be if the UI hierarchy came from reverse-engineering tools). And just to clarify, I'm only talking about the testing phase, getting vectors with the pre-trained model, not the training phase (it does make a lot of sense to use RICO's data for training).
I haven't tried it, but it sounds like an intriguing idea! I think as long as the reverse engineering generates reasonable metadata for each view (e.g., text, className) as well as a reasonable hierarchical structure, it should work without a problem. Let me know if you decide to try it that way -- really curious about the result.
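If someone does try this route, one practical step is turning the flat boxes that detectors like UIED emit into a nested, RICO-style hierarchy. Here is a hedged sketch, not Screen2Vec code; the input format (`class`, `text`, `bounds` per element) is hypothetical, so adapt the field names to whatever your tool actually outputs:

```python
# Sketch: nest flat detected elements into a RICO-style tree by
# bounding-box containment. Input element format is hypothetical.

def contains(outer, inner):
    """True if box `outer` [left, top, right, bottom] encloses `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def build_hierarchy(elements):
    """Nest flat elements by containment, processing larger boxes first."""
    def area(e):
        l, t, r, b = e["bounds"]
        return (r - l) * (b - t)

    nodes = [
        {"class": e.get("class", "android.view.View"),
         "text": e.get("text", ""),
         "bounds": e["bounds"],
         "children": []}
        for e in sorted(elements, key=area, reverse=True)
    ]
    roots = []
    for i, node in enumerate(nodes):
        parent = None
        # Earlier nodes are larger; the last container found is the
        # smallest enclosing box, i.e. the most specific parent.
        for candidate in nodes[:i]:
            if contains(candidate["bounds"], node["bounds"]):
                parent = candidate
        (parent["children"] if parent else roots).append(node)
    return roots

# Hypothetical detector output: flat boxes with guessed classes and OCR'd text.
detected = [
    {"class": "android.widget.FrameLayout", "bounds": [0, 0, 1440, 2560]},
    {"class": "android.widget.Button", "text": "Sign in", "bounds": [100, 200, 600, 320]},
]
hierarchy = build_hierarchy(detected)
```

The open question would then be how much the vectors degrade when `class` is only a guess from pixels rather than the real Android class name.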
Sure, will keep you posted if we end up trying this route :)