python-goose

Published_Date extraction

Open kmehl opened this issue 9 years ago • 0 comments

Hello,

I would like to extract the published date from a news website, but goose gives me Null. The object containing the date is:

I took a look at the source and found that it uses this list to find the publish date:

    KNOWN_PUBLISH_DATE_TAGS = [
        {'attribute': 'property', 'value': 'rnews:datePublished', 'content': 'content'},
        {'attribute': 'property', 'value': 'article:published_time', 'content': 'content'},
        {'attribute': 'name', 'value': 'OriginalPublicationDate', 'content': 'content'},
        {'attribute': 'itemprop', 'value': 'datePublished', 'content': 'datetime'},
    ]

Do you know how I can change it so that it works in my case?
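To make the question concrete, here is a minimal sketch of how a list like this could be matched against a page, and what adding one more entry would look like. It is not goose's actual code: it uses only the standard library, `find_publish_date` and `CUSTOM_TAG` are hypothetical names, and the `<meta name="pubdate">` tag is an invented stand-in because the real markup from the site is not shown here. I am assuming each entry means "find an element whose `attribute` equals `value`, then read the attribute named by `content`".

```python
from html.parser import HTMLParser

# Copy of the list quoted from the goose source.
KNOWN_PUBLISH_DATE_TAGS = [
    {'attribute': 'property', 'value': 'rnews:datePublished', 'content': 'content'},
    {'attribute': 'property', 'value': 'article:published_time', 'content': 'content'},
    {'attribute': 'name', 'value': 'OriginalPublicationDate', 'content': 'content'},
    {'attribute': 'itemprop', 'value': 'datePublished', 'content': 'datetime'},
]

# Hypothetical extra entry for a page using <meta name="pubdate" content="...">.
CUSTOM_TAG = {'attribute': 'name', 'value': 'pubdate', 'content': 'content'}


class _TagCollector(HTMLParser):
    """Collects the attributes of every start tag as a dict."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.elements.append(dict(attrs))


def find_publish_date(html, known_tags):
    """Return the first date string matched by known_tags, or None."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    # Entries are tried in list order, so earlier entries take priority.
    for known in known_tags:
        for attrs in collector.elements:
            if attrs.get(known['attribute']) == known['value']:
                value = attrs.get(known['content'])
                if value:
                    return value
    return None


if __name__ == '__main__':
    page = '<html><head><meta name="pubdate" content="2015-07-06"></head></html>'
    print(find_publish_date(page, KNOWN_PUBLISH_DATE_TAGS))
    print(find_publish_date(page, KNOWN_PUBLISH_DATE_TAGS + [CUSTOM_TAG]))
```

With the stock list the invented page yields no match, and with the extra entry the `content` value is returned. Whether goose itself would pick up an entry appended to its list at runtime depends on how its extractor reads that list, which I have not verified.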

Thanks for your help.

kmehl, Jul 06 '15 15:07