Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online media.

Results 107 backend issues

On the Covid-sourdough topic (4138) we are using for the upcoming ICWSM tutorial, I note [that Instagram is the most linked-to source](https://topics-dev.mediacloud.org/#/topics/4138/media?focusId=&q=&snapshotId=5100&timespanId=975868). However, [that source (104129) is a weird...

data-quality

To make some technical decisions I think we need to more concretely design the primary use cases we have in mind so far. Here's my stab at a list, and...

We can use TensorFlow, Keras layers, and a pre-trained model (like ResNet or MobileNet) to classify story images. We can adapt pre-existing models or create our own - it depends...

I'm seeing that stories from the LA Times have the content of the story repeated multiple times in the same story object in our database. This is a data...

data-quality
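
One way to spot stories like this might be to count paragraphs that occur more than once in the stored text. The following is only a rough sketch: `repeated_paragraphs` and its `min_length` threshold are hypothetical names made up for illustration, and it assumes paragraphs are separated by blank lines, which may not match how story text is actually stored.

```python
from collections import Counter
from typing import Dict


def repeated_paragraphs(text: str, min_length: int = 40) -> Dict[str, int]:
    """Return paragraphs that appear more than once, with their counts.

    Hypothetical helper: assumes paragraphs are separated by blank lines.
    Paragraphs shorter than min_length are ignored so that bylines or
    short captions do not trigger false positives.
    """
    # Collapse internal whitespace so trivially different copies still match.
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    counts = Counter(p for p in paragraphs if len(p) >= min_length)
    return {p: n for p, n in counts.items() if n > 1}
```

A non-empty result for a story would flag it for manual review; it would not by itself say why the content was duplicated.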

A user found a story at one point, but when returning to their query later couldn't find the same one, so they emailed us. The story id in question is...

data-quality

All of our crawler_fetcher and fetch_link workers were blocking on create_missing_partitions(). create_missing_partitions() was blocked on an autovacuum of the stories table. I'm not sure how to fix this long term....

Like the other topic discovery plugins, we need to add a plugin for ingesting matching YouTube videos into a topic, extracting links from the description and/or comments, and saving it...

enhancement
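
The link extraction step could look roughly like the sketch below. It is an illustration only: `extract_links` is a hypothetical helper, not an existing plugin function, and it assumes the description or comment text is already available as a plain string. Fetching that text from YouTube is out of scope here.

```python
import re
from typing import List

# Simple pattern for http(s) URLs; real-world text may need a stricter parser.
_URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_links(text: str) -> List[str]:
    """Return unique URLs found in text, in order of first appearance."""
    seen = set()
    links = []
    for match in _URL_RE.findall(text):
        # Drop punctuation that commonly trails a URL at the end of a sentence.
        url = match.rstrip(".,;:!?)")
        if url and url not in seen:
            seen.add(url)
            links.append(url)
    return links
```

Deduplicating while preserving order keeps the first mention of each link, which seems like a reasonable default if the links are later saved against the topic.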

We've decided that we can retrieve a useful set of content from CrowdTangle, so we need to add a plugin that lets us discover and ingest content via their API...

enhancement

We've come up with a short-term idea for shifting the sitemap ingest process to researchers. The idea is that web users could request that a source's sitemaps be fetched (via [ultimate-sitemap-parser](https://github.com/berkmancenter/mediacloud-ultimate-sitemap-parser)),...

enhancement
api

We should replace smart quotes and long dashes in Solr queries at the API level.
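
A minimal sketch of what that replacement might look like, assuming the query arrives as a Python string before being passed to Solr. The function name `normalize_solr_query` and the exact set of characters covered are assumptions; the issue does not specify either.

```python
# Hypothetical mapping of typographic characters to ASCII equivalents.
_REPLACEMENTS = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
}
_TABLE = str.maketrans(_REPLACEMENTS)


def normalize_solr_query(query: str) -> str:
    """Replace smart quotes and long dashes with plain ASCII characters."""
    return query.translate(_TABLE)
```

For example, a phrase query pasted from a word processor with curly double quotes would come out with straight quotes, which Solr can parse as a phrase.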