Hal Roberts

Results 22 issues of Hal Roberts

This is the migration of the topics-mine and topics-mine-public workers from Perl to Python. This is almost entirely a line-by-line migration, with only a few small fixes...

The Solr index appears to contain duplicate documents. Those duplicates are filtered out when document lists are returned but are present when results are merely counted. For the...

All of our crawler_fetcher and fetch_link workers were blocking on create_missing_partitions(), which was itself blocked on an autovacuum of the stories table. I'm not sure how to fix this long term....

We should replace smart quotes and long dashes in Solr queries at the API level.
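As a rough illustration of the substitution described above, a minimal Python sketch could map common typographic quotes and dashes to their ASCII equivalents before a query is passed on to Solr. The function name `normalize_query` and the exact set of characters covered are assumptions made for this example, not the project's actual code.

```python
# Hypothetical helper: the name and the character set are assumptions.
# Maps typographic quotes and long dashes to plain ASCII equivalents.
_QUERY_CHAR_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # low double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})


def normalize_query(query: str) -> str:
    """Return the query with smart quotes and long dashes replaced by ASCII."""
    return query.translate(_QUERY_CHAR_MAP)
```

Calling `normalize_query` on a query pasted from a word processor would turn curly-quoted phrases into ordinary double-quoted phrases and a leading en dash into a plain hyphen, which is the form a query parser expects.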

Add a topics/cancel endpoint to end any currently running spider or snapshot jobs for a topic.

It looks like the tags_id partition is not getting created on new stories_tags_map partitions. I have started a manual script to create the missing indexes, but I think the underlying...

We just renewed access to the Associated Press feed, but they are no longer giving access to their old, RSS-based feed. Instead, we have to ingest their custom API....

priority

We are currently running feed scraping code that is a few years old. For us, feed scraping just means discovering the set of RSS feeds that cover all syndicated stories...

enhancement