Tristan Lee
OS: Ubuntu 20.04, Python: 3.8.10, snscrape: 0.4.3.20220106. The command `snscrape vkontakte-user rus_imperia` fails after 7868 posts because the URL `'https://vk.com/wall-35251260_28090?reply=28125&thread=28091'` doesn't satisfy the `url.rsplit('_', 1)[1].strip('0123456789') in ('', '?reply=')` clause of...
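The failing check can be reproduced in isolation. The sketch below applies the quoted expression to the URL from the report and to two shorter URLs. The helper name `passes_check` and the two shorter URLs are illustrative assumptions, not part of snscrape.

```python
def passes_check(url: str) -> bool:
    # Hypothetical helper wrapping the expression quoted in the report.
    suffix = url.rsplit('_', 1)[1]
    return suffix.strip('0123456789') in ('', '?reply=')

# URL taken from the report.
failing = 'https://vk.com/wall-35251260_28090?reply=28125&thread=28091'
# Assumed simpler shapes, for comparison only.
plain = 'https://vk.com/wall-35251260_28090'
reply_only = 'https://vk.com/wall-35251260_28090?reply=28125'

# str.strip removes digits only from both ends of the suffix, so the
# reply ID in the middle and the "&thread=" parameter name survive.
remainder = failing.rsplit('_', 1)[1].strip('0123456789')
print(repr(remainder))

for url in (failing, plain, reply_only):
    print(passes_check(url), url)
```

Because the remainder still contains the reply ID and `&thread=`, it matches neither `''` nor `'?reply='`, which is consistent with the failure described.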
There's currently one analysis included in this tool: aggregating the hashtags that co-occur with a specified hashtag. There are likely other useful analyses that researchers and journalists are interested in, including -...
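As a rough illustration of what this kind of aggregation might look like (this is not the tool's actual implementation; the function name `cooccurring_hashtags` and the sample posts are made up), hashtags appearing alongside a target hashtag can be counted with `collections.Counter`:

```python
from collections import Counter
from typing import Iterable, List


def cooccurring_hashtags(posts: Iterable[List[str]], target: str) -> Counter:
    """Count hashtags that appear in the same post as `target`."""
    counts: Counter = Counter()
    target = target.lower()
    for tags in posts:
        # Normalize case and de-duplicate hashtags within a single post.
        normalized = {tag.lower() for tag in tags}
        if target in normalized:
            counts.update(normalized - {target})
    return counts


# Made-up sample data: each inner list holds the hashtags of one post.
sample_posts = [
    ["ukraine", "news"],
    ["Ukraine", "kyiv", "news"],
    ["cats"],
]

result = cooccurring_hashtags(sample_posts, "ukraine")
print(result.most_common())
```

Other analyses could reuse the same per-post iteration and swap out what is counted.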
It's unfortunate that the current implementation relies on a non-Python TikTok API wrapper (https://github.com/drawrowfly/tiktok-scraper, implemented in Node), which makes installation and dependency management more difficult. There may be native Python...
For hashtags containing emojis (such as #ukraine🇺🇦), the plot generated by `hashtag_frequencies.plot` does not display the emoji, as shown in the figure below. The problem seems to be that...
Make it easy for a user to load Telegram chat exports from the Telegram Desktop app, both in JSON and HTML format.
- multi-line links
- links with emoji text (offset by 1-2 characters due to surrogates)

Concern about directly merging, because re-running the transformer would (?) require dropping the media table?
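A drift of 1-2 characters is what one would expect if link offsets are counted in UTF-16 code units while Python indexes strings by code point: most emoji lie outside the Basic Multilingual Plane and take two UTF-16 code units (a surrogate pair). Assuming that is the cause here, which the notes above do not confirm, a minimal sketch of slicing by UTF-16 offsets follows. The helper `slice_utf16` and the sample text are hypothetical.

```python
def slice_utf16(text: str, offset: int, length: int) -> str:
    """Return the substring addressed by an offset/length in UTF-16 code units."""
    # Each UTF-16 code unit is 2 bytes in the little-endian encoding (no BOM).
    encoded = text.encode("utf-16-le")
    return encoded[offset * 2:(offset + length) * 2].decode("utf-16-le")


# Made-up sample: one emoji (2 UTF-16 units, 1 code point), a space, then a link.
text = "😀 example.com"
offset, length = 3, 11  # position of "example.com" measured in UTF-16 units

naive = text[offset:offset + length]        # indexes by code point: drifts by 1
correct = slice_utf16(text, offset, length)  # indexes by UTF-16 code unit

print(repr(naive))
print(repr(correct))
```

Each additional astral-plane character before the link adds one more position of drift to the naive slice.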
- Modified langdetect detect method to decrease run-time
- Fixed indentation error in `transform_info`
- Prototyped removal of offset in `transform_all_untransformed`
- This change needs modifications: it fails for the...
On tests, I'm getting the warning: `/home/work/.venv/cisticola/lib/python3.9/site-packages/spacy/pipeline/lemmatizer.py:211: UserWarning: [W108] The rule-based lemmatizer did not find POS annotation for one or more tokens. Check that your pipeline includes components that assign...