Hal Roberts

Results: 22 issues by Hal Roberts

30 seconds for the biggest topics.

priority

youtube embed videos have no date in them, so they get assigned the current date or the article linking date. we should transform those urls into standard youtube/watch urls so...
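A minimal sketch of the url transformation described above, assuming embed urls look like `/embed/<id>` or `/v/<id>` on a youtube host with an 11-character video id. The function name `youtube_embed_to_watch_url` and the host list are hypothetical, not taken from the codebase:

```python
import re
from urllib.parse import urlsplit

# Assumption: embed paths are /embed/<id> or /v/<id>, with an 11-character id.
_EMBED_PATH = re.compile(r"^/(?:embed|v)/([A-Za-z0-9_-]{11})(?:/|$)")

# Assumption: these are the hosts that serve embeddable players.
_YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "youtube-nocookie.com",
    "www.youtube-nocookie.com",
}


def youtube_embed_to_watch_url(url: str) -> str:
    """Return the standard watch url for an embed url, or the url unchanged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in _YOUTUBE_HOSTS:
        return url
    match = _EMBED_PATH.match(parts.path)
    if match is None:
        return url
    # Player-only query parameters (rel=0, autoplay=1, ...) are dropped.
    return "https://www.youtube.com/watch?v=" + match.group(1)
```

Urls that do not match are returned untouched, so the function can be applied to every link without special-casing.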

enhancement

We are getting lots of download errors from malformed urls. Some are fixable with more aggressive url fixing of things like '>http://foo.bar'. Mostly the problem is feeds giving us relative...
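A rough sketch of the kind of url fixing mentioned above, assuming the truncated sentence refers to relative urls in feeds. It strips punctuation junk in front of the scheme (the `'>http://foo.bar'` case) and resolves anything relative against the feed url. `fix_url` and its parameters are hypothetical names:

```python
import re
from urllib.parse import urljoin

_SCHEME = re.compile(r"https?://", re.IGNORECASE)


def fix_url(raw_url: str, base_url: str) -> str:
    """Strip leading junk before the scheme, then resolve against base_url."""
    url = raw_url.strip()
    match = _SCHEME.search(url)
    if match is not None and match.start() > 0:
        prefix = url[:match.start()]
        # Only drop the prefix when it is pure punctuation such as '>';
        # a prefix with letters or digits may be a real relative path.
        if not any(ch.isalnum() for ch in prefix):
            url = url[match.start():]
    # urljoin leaves absolute urls alone and resolves relative ones.
    return urljoin(base_url, url)
```

For example, `fix_url("/story/1", "http://example.com/feed/rss.xml")` resolves to `http://example.com/story/1`.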

The run_remotely() call to extract_and_vector from topics-fetch-link was mysteriously hanging occasionally, eventually causing the fetch link queue to shrink. I changed the run_remotely() call to an add_to_queue() call followed by...

solr imports are very slow. here are the last ten imports and the size of the import queue: ``` mediacloud=# select * from solr_imports order by solr_imports_id desc limit 10;...

add option to create twitter topic using archive.org twitter search instead of crimson hexagon search. compare existing ch-based us 2018 topic with archive.org based topic.

The current story title deduping system in the topic spider creates a giant table of all the parts of all story titles in a given topic in a given media...

the python default 're' module does not recognize words correctly in hindi. we should just replace 're' with 'regex' everywhere for easy consistency so this doesn't bite us again. note...
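A small stdlib-only illustration of the problem, not the proposed `regex` swap itself. In `re`, `\w` does not match combining marks (Unicode categories starting with `M`), so Devanagari vowel signs and the virama end a match early. The helper `words_mark_aware` is a hypothetical stand-in that treats marks as word characters, to show what correct word boundaries look like:

```python
import re
import unicodedata


def words_with_re(text):
    """Tokenize with the stdlib 're' module; combining marks break words."""
    return re.findall(r"\w+", text)


def is_word_char(ch):
    # Treat combining marks (categories Mn, Mc, Me) as part of a word.
    return ch.isalnum() or ch == "_" or unicodedata.category(ch).startswith("M")


def words_mark_aware(text):
    """Tokenize by grouping runs of word characters, marks included."""
    words, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


if __name__ == "__main__":
    print(words_with_re("हिन्दी भाषा"))     # fragments of each word
    print(words_mark_aware("हिन्दी भाषा"))  # whole words
```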

set_media_tag is crashing with this: ``` 2018-08-27 02:21:18,451 MediaWords.DBI.Media.PrimaryLanguage: detect primary language for motisagron.wordpress.com [59652] ... 2018-08-27 02:21:18,460 MediaWords.DBI.Media.PrimaryLanguage: detect primary language for motisagron.wordpress.com [59652] update to he Traceback (most...

the snapshots_id is correct, but the timespans_id is wrong in the 'Version ##' links on the versions page. all of the timespans_id= values are for the timespans_id for the latest...

bug
topics app