webstruct
Pretrained Models
Webstruct looks like a really cool extension for any scraping enthusiast, so thank you for creating this! It would be really awesome if you could also release some pretrained models along with the library. Not every user has loads of annotated data, and what people are generally looking for are the most common entities (NAME, PLACE, ORGANISATION, etc.). A humble suggestion :smile:
What do you mean by models?
Instead of having to annotate data and train on it, could we simply load a configuration/parameter file (a model) and test new data against it? A prebuilt NER engine is what I meant by a pretrained model.
On Mon, 13 Feb 2017, 10:10 p.m., Manuel Garrido wrote:
what do you mean by models?
This is still a relevant question.
Is a generic pretrained model available? A model that has already been trained on sufficient annotated HTML data, and can be used quickly without requiring any training.
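For anyone landing here, the workflow being asked about (train once, save the parameters to a file, then load that file elsewhere and tag new text without retraining) might look roughly like the sketch below. This is only an illustration under stated assumptions: `train`, `save_params`, `load_params`, and `extract` are hypothetical helpers, the "model" is a toy token lookup rather than webstruct's actual API or model format, and the training pairs are made up. Nothing in this thread confirms that a published pretrained model exists.

```python
import json
import os
import re
import tempfile


def train(examples):
    """Toy 'training': remember the label seen for each token (hypothetical)."""
    return {token.lower(): label for token, label in examples}


def save_params(params, path):
    """Write the learned parameters to a JSON file that can be shared."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f)


def load_params(path):
    """Load previously saved parameters; no annotated data is needed here."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def extract(params, text):
    """Return (token, label) pairs for tokens the loaded parameters know."""
    tokens = re.findall(r"\w+", text)
    return [(t, params[t.lower()]) for t in tokens if t.lower() in params]


# Publisher side: train on (made-up) annotated pairs and save the result.
params = train([("Alice", "NAME"), ("Paris", "PLACE")])
model_path = os.path.join(tempfile.mkdtemp(), "toy_model.json")
save_params(params, model_path)

# User side: load the file and run it on new text, with no training step.
loaded = load_params(model_path)
result = extract(loaded, "Alice moved to Paris")
print(result)
```

A real pretrained model would also need to ship the exact feature extraction it was trained with, since a saved model is only usable with matching inputs.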