Pretrained Models

Open NightFury13 opened this issue 7 years ago • 4 comments

Webstruct looks like a really cool extension to have for any scraping enthusiast, so thank you for creating this! It would be really awesome if you guys could also release some pre-trained models along with this library. It's not feasible for every user to have loads of annotated data and what people generally are looking for are the most common entities (NAME, PLACE, ORGANISATION, etc). A humble suggestion :smile:

NightFury13 • Feb 13 '17 11:02

What do you mean by models?

manugarri • Feb 13 '17 16:02

Instead of having to annotate data and train on it, could we simply load a configuration/parameter file (a model) and run new data against it? A prebuilt NER engine is what I meant by a trained model.

NightFury13 • Feb 13 '17 17:02

This is still a relevant question.

rmotsar • May 31 '18 09:05

Is a generic pretrained model available? That is, a model that has already been trained on sufficient annotated HTML data and can be used right away without any further training.

HAMZA310 • Jun 08 '21 12:06