
The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.

Results 133 4cat issues

**Describe the bug** Discovered in a particular dataset whose `map_item` returned a `{}` on the first item. This causes a number of issues in itself, but seemed to first strike...

Some languages, in particular East-Asian ones, don't (just) use spaces to separate words, so the standard NLTK tokeniser doesn't work for them. It is likely that there are many languages...

enhancement
processors
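To illustrate the tokenisation problem described above: splitting on whitespace leaves a Chinese sentence as a single token. Below is a minimal sketch of a crude fallback that emits each CJK ideograph as its own token. It is not how 4CAT or NLTK handle this; the function names are hypothetical, and the character range (the basic CJK Unified Ideographs block only) is an assumption made for illustration. A proper solution would use a language-specific segmenter.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block counts as "CJK" here.
# Each ideograph becomes its own token; any other run of non-space characters
# is kept together as one token.
TOKEN_PATTERN = re.compile(r"[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+")


def whitespace_tokenise(text):
    """Naive tokeniser: split on whitespace only."""
    return text.split()


def fallback_tokenise(text):
    """Hypothetical fallback: per-character tokens for CJK, whitespace otherwise."""
    return TOKEN_PATTERN.findall(text)
```

Per-character tokens are a blunt instrument (multi-character words are split apart, and full-width punctuation ends up as separate tokens), but they at least make frequency counts possible where whitespace splitting yields nothing useful.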

Would you consider adding a feature that would enable users to use custom stopword lists (i.e. in different languages) with tools that use them? Thanks

enhancement
processors
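One way the request above could work, as a minimal sketch: the user supplies a plain-text list with one stopword per line, and the processor filters tokens against it. The file format (one word per line, `#` for comments) and both function names are assumptions for illustration, not part of 4CAT.

```python
def parse_stopwords(lines):
    """Build a lowercase stopword set from an iterable of lines.

    Assumed format: one word per line; blank lines and lines starting
    with '#' are ignored.
    """
    stopwords = set()
    for line in lines:
        word = line.strip().lower()
        if word and not word.startswith("#"):
            stopwords.add(word)
    return stopwords


def remove_stopwords(tokens, stopwords):
    """Drop tokens whose lowercase form is in the stopword set."""
    return [token for token in tokens if token.lower() not in stopwords]
```

Because `parse_stopwords` accepts any iterable of lines, it can be fed an open file object directly, e.g. `parse_stopwords(open("dutch.txt", encoding="utf-8"))`, where the filename is a placeholder.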

**Describe the bug** I have seen this before but cannot recall how it was "fixed". It may have something to do with markdown versions. I see that we have both...

I wonder if it has previously been discussed to have human-readable dataset metadata in data export filenames, e.g. `/result/attribute-frequencies-hashtags-1000-overall-[ID].csv` rather than `/result/attribute-frequencies-[ID].csv`? I guess this might be non-trivial to implement...
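A rough sketch of what such a filename builder might look like, assuming the processor type, its parameter values, and the dataset ID are available as plain values. The function names, the parameter keys in the usage example, and the length cap are all hypothetical; how 4CAT actually stores this metadata is not covered here.

```python
import re


def slug(value):
    """Lowercase a value and collapse anything non-alphanumeric into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", str(value).lower()).strip("-")


def readable_filename(processor_type, parameters, dataset_id, extension="csv", max_length=100):
    """Join the processor type and parameter values into a descriptive filename.

    The descriptive part is truncated to max_length characters so that long
    parameter values cannot produce an unusable path; the ID is always kept.
    """
    parts = [slug(processor_type)] + [slug(value) for value in parameters.values()]
    stem = "-".join(part for part in parts if part)[:max_length].rstrip("-")
    return f"{stem}-{dataset_id}.{extension}"
```

For example, `readable_filename("attribute-frequencies", {"attribute": "hashtags", "top": 1000, "timeframe": "overall"}, "abc123")` gives `attribute-frequencies-hashtags-1000-overall-abc123.csv`.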

The 'About' page should probably refer to documentation and guides etc rather than the 'news' thing it's doing now, and the FAQ is still very 4chan-oriented.

enhancement
(mostly) front-end

A data source for [LIHKG](https://lihkg.com/). Uses the web interface's web API, which seems reasonably straightforward and stable. There is some rate limiting, which 4CAT tries to respect by pacing requests...

enhancement
data source
questionable
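The pacing idea mentioned above can be sketched generically as a helper that enforces a minimum interval between consecutive requests. This is not the data source's actual implementation: the class name and the interval value are assumptions, and LIHKG's real rate limits are not stated in the snippet.

```python
import time


class RequestPacer:
    """Enforce a minimum interval (in seconds) between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock    # injectable so the pacer can be tested without waiting
        self.sleep = sleep
        self.last_request = None

    def wait(self):
        """Block until the next request is allowed; return the delay applied."""
        now = self.clock()
        delay = 0.0
        if self.last_request is not None:
            delay = max(0.0, self.min_interval - (now - self.last_request))
        if delay > 0:
            self.sleep(delay)
        # Record the moment the request is actually released.
        self.last_request = now + delay
        return delay
```

A scraper would call `pacer.wait()` immediately before each HTTP request, e.g. with `pacer = RequestPacer(2.0)` for at most one request every two seconds (the value is a placeholder).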

This is potentially possible with the "From URL" feature.

enhancement
(mostly) front-end

A processor that can filter on multiple particular words or phrases within a dataset, and outputs the count values (overall, or over time) per item, outputting a .csv that can...

processors
data source
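A minimal sketch of the per-item counting described in the request above, assuming each item is a dict with an `id` and a `body` field (both field names are hypothetical) and that matching should be case-insensitive on word boundaries. It covers only the per-item counts, not the over-time aggregation.

```python
import csv
import io
import re


def count_phrases(items, phrases):
    """Count case-insensitive, whole-word occurrences of each phrase per item."""
    patterns = {
        phrase: re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for phrase in phrases
    }
    rows = []
    for item in items:
        row = {"id": item["id"]}
        for phrase, pattern in patterns.items():
            row[phrase] = len(pattern.findall(item["body"]))
        rows.append(row)
    return rows


def rows_to_csv(rows, phrases):
    """Serialise the count rows to CSV text, one column per phrase."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", *phrases], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Note that `\b` only behaves as expected for phrases that start and end with a word character, so hashtags or phrases with leading punctuation would need a different boundary rule.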