python-goose
Published_Date extraction
Hello,
I would like to extract the published date from a news website, but goose returns a null value. The object containing the date is: 03 juillet 2015 à 09h19
I took a look at the source and found that it uses this list to find the publish date:

KNOWN_PUBLISH_DATE_TAGS = [
    {'attribute': 'property', 'value': 'rnews:datePublished', 'content': 'content'},
    {'attribute': 'property', 'value': 'article:published_time', 'content': 'content'},
    {'attribute': 'name', 'value': 'OriginalPublicationDate', 'content': 'content'},
    {'attribute': 'itemprop', 'value': 'datePublished', 'content': 'datetime'},
]
Do you know how I can change it to work in my case?
Thanks for your help.
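
For reference, here is a minimal sketch of the kind of parsing the date string above would need once its text has been pulled out of the page. It is not part of goose: the names FRENCH_MONTHS, DATE_PATTERN and parse_french_date are hypothetical, it assumes Python 3 and only the standard library, and it assumes the date always follows the "03 juillet 2015 à 09h19" layout.

```python
import re
from datetime import datetime

# French month names mapped to month numbers (hypothetical helper table).
FRENCH_MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4, "mai": 5, "juin": 6,
    "juillet": 7, "août": 8, "septembre": 9, "octobre": 10,
    "novembre": 11, "décembre": 12,
}

# Assumed layout: day, month name, year, "à", then hour "h" minute,
# e.g. "03 juillet 2015 à 09h19".
DATE_PATTERN = re.compile(
    r"(\d{1,2})\s+(\w+)\s+(\d{4})\s+à\s+(\d{1,2})h(\d{2})"
)


def parse_french_date(text):
    """Return a datetime for the first matching date in text, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    day, month_name, year, hour, minute = match.groups()
    month = FRENCH_MONTHS.get(month_name.lower())
    if month is None:
        # The word in the month position is not a known French month.
        return None
    return datetime(int(year), month, int(day), int(hour), int(minute))


print(parse_french_date("03 juillet 2015 à 09h19"))
```

The function returns None rather than raising when the text does not match, which mirrors the null result described above and leaves the decision about fallbacks to the caller.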