RFC (repost): Priority of algorithm usage across different X features

Open twister21 opened this issue 1 year ago • 0 comments

tl;dr: fewer algorithms for home feed generation, more algorithms for content & author discovery across the core features Topics, Trends, Lists and Search

Introduction

X platforms (including Twitter, a text-focused service) shouldn't serve a “For You” home feed by default that includes any embedding-space content (e.g., subscribable creators, tweets or videos “based on your likes”) or a large amount of social graph content, apart from likes and reposts unless these have been explicitly disabled. Such a feed isn't suited to users who follow a wide thematic variety of content, and generic algorithms can't outperform human beings in picking which kinds of posts and accounts best fit one's feeds. However, the current system lacks customization and discovery features (including a more descriptive and comprehensive classification scheme) and needs improvement.

Extending curation options and enabling platform interoperability (via shared following lists and histories as well as superfeeds and supersearch) is more reasonable than cloning complementary platforms, such as YouTube, Substack/Medium, LinkedIn or Reddit/Discord (https://github.com/orgs/community/discussions/60763).

Overview

Introduce new sorting and filter options for (now multiple) following-only home feeds (see Feed restructuring) and provide separate methods to discover follow-worthy accounts and notable posts through better use of a user's (by default not language-restricted, if auto-translation is also active)

  • topic feed, which should be fully viewable via /topics and individually via /t (human-readable identifiers are more useful than generic (such as snowflake) topic ids for direct URL access), also show topic-based|related followables on the sidebar
  • recommendations, which should additionally be filterable by (time-based) sources used for generation
  • seen and search history, which should be accessible via /history, and pausable, searchable, selectively deletable (individually, multi-selectable and/or by time period - also automatically) as well as clearable

as well as

  • general data record similarity (such as Twitter's SimClusters system for accounts), filters and sorting
  • account summaries
    • highlight major attributes (topics, formats) of posts (with percentage distribution), also time-sensitive analytics
    • allow following (account, post attribute) pairs
    • auto-generate lists (by account type) based on affiliates
    • link connected accounts of other platforms
  • search
    • autocomplete and trends (different from posts)
    • semantic, multi-language capabilities
    • quick access for advanced search
    • additional/separated fields (topic: name; account: biography; video: description, captions; Space: title, captions)
    • align related embedding space section contents (such as "people also viewed", "from related searches", "people also search for") horizontally and always show them
    • specifically YouTube: remove unrelated sections from the results (such as "previously watched", "explore more" or "popular today"), reduce number of "Shorts" sections (with inline navigation) to 1 in mixed feed, and improve keyword-based results quality (e.g., the query "youtube" yields lots of results not containing the keyword)
    • and Discord: global server/community post search
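
The history requirements from the overview (pausable, searchable, selectively deletable by time period, clearable) could be sketched roughly as below. This is only an illustration: the `History` class, its method names and the `(timestamp, text)` entry shape are assumptions, not an existing X API.

```python
class History:
    """Hypothetical in-memory seen/search history."""

    def __init__(self):
        self.entries = []    # list of (timestamp, text) tuples
        self.paused = False

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def record(self, timestamp, text):
        # Nothing is stored while the history is paused.
        if not self.paused:
            self.entries.append((timestamp, text))

    def search(self, term):
        # Case-insensitive substring search over the stored text.
        return [e for e in self.entries if term.lower() in e[1].lower()]

    def delete_range(self, start, end):
        # Selective deletion by time period (inclusive bounds).
        self.entries = [e for e in self.entries if not (start <= e[0] <= end)]

    def clear(self):
        self.entries = []
```

Automatic deletion could reuse `delete_range` on a schedule with a rolling cutoff.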

Data feeds

Feed lists (or multi-column grids in wide mode) and, in part, direct messages should have, in addition to custom filters, persistable including/excluding Standard sorting & filter criteria, which should also be applicable to a data record's parent (followable|post). They should also offer customizable auto-refresh for live feeds (if enabled: based on backlog size or time period) and automatically present machine translations of languages not declared as supported (which can be configured, specifically the primary/target language, or inferred from a user's own use), unless set to manual mode or a translation is already provided by the original author (multi-language post). For speech audio content (including live), auto-generated language-specific tracks should have a natural, similar-sounding voice and high-quality, accurate subtitles, as well as a separate transcript on the related text-based platform, if not disabled. For text, this includes inline and graphically embedded text (either via graphic-to-inline/subtitle or graphic regeneration).
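
As a rough sketch, the two decisions described here (when to machine-translate, when to auto-refresh a live feed) could look like this. The function names, parameters and threshold semantics are assumptions made for illustration.

```python
def needs_machine_translation(post_languages, supported_languages, manual_mode=False):
    """True if none of the post's language versions is supported by the reader.

    post_languages includes any translations supplied by the original author
    (multi-language post), so those suppress machine translation.
    """
    if manual_mode or not post_languages:
        return False
    return not set(post_languages) & set(supported_languages)


def should_auto_refresh(backlog_size, seconds_since_refresh,
                        max_backlog=None, max_interval=None):
    """Refresh once either configured threshold is reached; None disables it."""
    if max_backlog is not None and backlog_size >= max_backlog:
        return True
    if max_interval is not None and seconds_since_refresh >= max_interval:
        return True
    return False
```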

Followables (account|topic|community|list) feeds should have a multi-follow feature.

Specific feeds (such as replies, tag and individual account) should show post attribute distribution statistics, and be filterable by percentage.
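
A minimal sketch of such distribution statistics, assuming each post is a dict with single-valued attributes (the field names `topic` and `format` are placeholders; real posts may carry several values per attribute):

```python
from collections import Counter


def attribute_distribution(posts, attribute):
    """Return {value: percentage} for one attribute across a list of posts."""
    counts = Counter(post[attribute] for post in posts if attribute in post)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: 100.0 * n / total for value, n in counts.items()}


def accounts_with_min_share(feeds_by_account, attribute, value, min_percent):
    """Keep accounts whose posts carry `value` in at least `min_percent` of cases."""
    return [
        account
        for account, posts in feeds_by_account.items()
        if attribute_distribution(posts, attribute).get(value, 0.0) >= min_percent
    ]


posts = [
    {"topic": "astronomy", "format": "news report"},
    {"topic": "astronomy", "format": "tutorial"},
    {"topic": "chess", "format": "analysis"},
    {"topic": "astronomy", "format": "analysis"},
]
print(attribute_distribution(posts, "topic"))  # {'astronomy': 75.0, 'chess': 25.0}
```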

Basic post type can be, for example:

  • Twitter: tweet, thread, space
  • YouTube: (live|recorded-live|music) video, movie, short, (podcast|show) episode, audiobook, clip (not necessarily created via the so-called feature, so excerpts can be longer than 60s)
  • Medium: (fact-based) article
  • Substack: (subjective) story
  • LinkedIn: job
  • all: reply, note

Differentiation doesn't make sense if content gets crossposted, either by the same user or others, which should neither be necessary nor possible (improve/implement single- and multiplatform duplication prevention). Instead, in addition to each platform's type-specific feeds, there should be a filterable super(home|topics|explore|account|...) feed as well as a supersearch feature and supercreation menu (via x.com), redirecting to specific interfaces as necessary. Similar types across platforms should be treated as identical or be concentrated/moved, e.g., Twitter: tweet; Substack: note; YouTube: post.
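
One possible way to sketch a superfeed with duplication prevention is to fingerprint normalized text and keep only the earliest copy. This is an assumption-heavy illustration: exact-match hashing only catches trivial crossposts, and the post fields (`platform`, `text`, `created`) are hypothetical.

```python
import hashlib
import re


def content_fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_superfeed(*platform_feeds):
    """Merge feeds newest first, keeping the earliest-created copy of duplicates."""
    earliest = {}
    for feed in platform_feeds:
        for post in feed:
            key = content_fingerprint(post["text"])
            if key not in earliest or post["created"] < earliest[key]["created"]:
                earliest[key] = post
    return sorted(earliest.values(), key=lambda p: p["created"], reverse=True)


twitter = [{"platform": "twitter", "text": "Hello  World", "created": 10}]
substack = [
    {"platform": "substack", "text": "hello world", "created": 12},
    {"platform": "substack", "text": "A longer story", "created": 11},
]
feed = build_superfeed(twitter, substack)
```

Here the Substack copy of "hello world" is dropped in favor of the earlier Twitter post. A real system would more likely need near-duplicate detection than exact hashing.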

List can be, for example:

  • Post list: YouTube playlist, Medium list/publication, Twitter bookmark folder
  • Account list: Twitter list

Activity type:

  • seen: impression, view (see more, post click)
  • engagement: reply, quote, repost, save (into list) -- see later/bookmark, like
  • other interaction (link, hashtag, profile click)

Data record visibility type: public, protected (approved followers-only), paid (buyable|subscription), unlisted, private

Differentiate seen count/content between impressions (served in feed/scrolled-by) and (unique account/IP) views (for short-form text content, this doesn't necessarily require interaction, so either there should also be unique impressions or the term is used differently for different content). Other individual data record and platform metrics (number of total, newly created data records and user activity by type attributes and time period) should also be public.
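
The distinction between impressions and unique views can be sketched with a small counter. The class and its field names are hypothetical, and `viewer_id` stands in for whatever identity (account or IP) is used.

```python
from collections import defaultdict


class SeenCounter:
    """Tracks total impressions, unique impressions and unique views per post."""

    def __init__(self):
        self.impressions = defaultdict(int)         # every serve in a feed
        self.impression_viewers = defaultdict(set)  # who was served the post
        self.viewers = defaultdict(set)             # who opened the post

    def record(self, post_id, viewer_id, kind):
        if kind == "impression":
            self.impressions[post_id] += 1
            self.impression_viewers[post_id].add(viewer_id)
        elif kind == "view":
            self.viewers[post_id].add(viewer_id)
        else:
            raise ValueError(f"unknown seen kind: {kind}")

    def summary(self, post_id):
        return {
            "impressions": self.impressions[post_id],
            "unique_impressions": len(self.impression_viewers[post_id]),
            "unique_views": len(self.viewers[post_id]),
        }
```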

Hierarchical classification attributes are automatically assigned to data records, e.g., by topic-social-proof-service, whose semantic understanding is currently both inconsistent and in parts fundamentally faulty (leading to inclusion of off-topic content). Tags can be manually assigned to posts, but need to be verified, at the latest when being served under the tag.

Communities and lists can be edited and assigned human-readable identifiers by permitted accounts. Lists should be forkable, or data records be submittable.
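
Human-readable identifiers (for lists and communities here, or for topics under /t) could be derived from names along these lines. This is a sketch: the `/t/<slug>` path shape and the collision suffix scheme are assumptions, and dropping non-ASCII characters is lossy for many languages.

```python
import re
import unicodedata


def slugify(name):
    """Lowercase ASCII slug: 'Machine Learning & AI' -> 'machine-learning-ai'."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def unique_slug(name, taken):
    """Append -2, -3, ... until the slug is not in the set of taken slugs."""
    base = slugify(name) or "untitled"
    slug, n = base, 2
    while slug in taken:
        slug = f"{base}-{n}"
        n += 1
    return slug


print("/t/" + unique_slug("Machine Learning & AI", set()))  # /t/machine-learning-ai
```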

Feed restructuring

Home

  • "For you" should be replaceable by a "Most activity" tab, serving top posts from the account following list that respect the Standard sorting & filter criteria.
  • Accordingly, "Following" is renamed to "Latest", or introduced if it doesn't exist yet

Topics

  • Once all content of the home feed has been seen, the custom topics feed should appear
  • Posts are listed either grouped by topic or individually (with topic link) based on sorting preference

Explore (Twitter's Explore page currently only shows trends, tweets from 0-5? topics, either from the topic following list or some other source, also depending on selected display language, and unfilterable recommendations)

  • should be enhanced or introduced (see below, Explore feed)

This way, users can remove the recommendations from their own timeline without missing out on highly relevant content or losing the ability to discover novel content. In practice, this can be previewed on a home feed sidebar card titled "Explore more" or "More activity" (instead of "who to follow"), above/below the trends card, if they haven't been hidden.

Explore feed

Should consist of automatically curated content:

  • trends and the links of the topics they belong to
  • the entirety of data records, except for topics grid, groupable by the datetime ranges of the Standard filter options (expandable preview sections), which should also be filterable by geo-location popularity
  • recommended data records based on social graph layers (e.g., 1st: "followed or viewed by accounts you follow") or related/similar to following list as well as posting, engagement type, seen or search history

These feeds should be separately viewable (via /trends, /all or /popular, /recommended), and additionally filterable by data record's type.
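
The first social graph layer ("followed by accounts you follow") can be illustrated with a simple count over a follow map. This is a toy sketch with a hypothetical `follows` dict, not how Twitter's recommendation services work.

```python
from collections import Counter


def first_layer_recommendations(follows, user, limit=5):
    """Rank accounts followed by the accounts `user` follows.

    follows maps an account name to the set of accounts it follows.
    Returns (account, count) pairs, most frequently followed first.
    """
    own = follows.get(user, set())
    counts = Counter()
    for followee in own:
        for candidate in follows.get(followee, set()):
            if candidate != user and candidate not in own:
                counts[candidate] += 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


follows = {
    "ana": {"ben", "cho"},
    "ben": {"cho", "dee", "eli"},
    "cho": {"dee", "ana"},
}
print(first_layer_recommendations(follows, "ana"))  # [('dee', 2), ('eli', 1)]
```

Returning the count alongside each account makes the source of a recommendation visible, which fits the request that recommendations be filterable by the sources used to generate them.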

Implement new account-level and platform-wide graph and bubble visualization feeds.

Trends feed (the term is used differently, in this context: popular tags or external links based on created or seen posts in a specific time period (how Twitter uses it), alternatively: individual popular posts (how e.g., YouTube uses it), ambiguity should be removed by using the same name for the same feature)

  • improve grouping ("Trending with" currently only works in some cases)
  • quick switch for regional, personal ("For you") and (custom) topic-filtered trends, also selected timespan/historical and international trends (requires translation)
  • add option to preview top post(s) for each trend
  • map events to hashtags, show additional information if available
  • summary, some trends don't have a tag?
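
In the first sense of the term (popular tags within a time period), a trends feed reduces to counting tags inside a window. A minimal sketch, assuming posts are dicts with `text` and a numeric `created` timestamp in seconds:

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def trending_tags(posts, now, window_seconds, top_n=3):
    """Most used hashtags among posts created within the last window_seconds."""
    counts = Counter()
    for post in posts:
        if 0 <= now - post["created"] <= window_seconds:
            # A set counts each tag once per post, whatever its casing.
            counts.update({tag.lower() for tag in HASHTAG.findall(post["text"])})
    return counts.most_common(top_n)


posts = [
    {"text": "#Launch today", "created": 100},
    {"text": "#launch #Space", "created": 110},
    {"text": "more #launch", "created": 120},
    {"text": "#old news", "created": 10},
]
print(trending_tags(posts, now=130, window_seconds=60))  # [('launch', 3), ('space', 1)]
```

Changing `window_seconds` gives the selected-timespan and historical variants; regional or topic-filtered trends would filter `posts` before counting.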

Following Lists feed

  • contents should be viewable in one feed, so that filters can be applied

Replies feed

  • replies should additionally be filterable/groupable by
    • reactions-only (e.g., emojis, punctuation marks, interjections)
    • recurring tags and (auto-titled) context (replies that refer to the same thing)
    • contains timestamp
  • quote posts should be treated as replies
  • reply restrictions for accounts (mentioned, following, verified) should be reader-configurable filters instead, and the options to hide/delete or pause/disable replies become needless

This greatly improves the experience, and the original "Community Notes" are replies, curatable in a meaningful way.
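
Two of the reply filters above (reactions-only, contains timestamp) can be approximated with simple text heuristics. The interjection list and the timestamp pattern are illustrative assumptions and would need tuning per language.

```python
import re

# Hypothetical, deliberately tiny interjection list.
INTERJECTIONS = {"lol", "wow", "omg", "haha", "oof"}

# Matches m:ss, mm:ss and h:mm:ss style timestamps.
TIMESTAMP = re.compile(r"\b(?:\d{1,2}:)?\d{1,2}:\d{2}\b")


def is_reaction_only(text):
    """True if the reply has only emojis, punctuation or known interjections."""
    if not text.strip():
        return False
    words = re.findall(r"[^\W_]+", text.lower())  # letter/digit runs only
    return all(word in INTERJECTIONS for word in words)


def contains_timestamp(text):
    return TIMESTAMP.search(text) is not None
```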

Spam, (offtopic) and impermissibly automated account filters

  • prevent abuse by improving semantic filtering -> false positive (spam) and false negative (bots) rates are currently too high
  • automatically applied to individual (hash-)tag, particularly incorrectly labeled content, trends and replies feed as well as direct messages

Text editor

  • show corrective orthography (spelling, punctuation, capitalization), grammar and semantic (consistency, brevity) suggestions by default

Discover more

  • same author
  • closest similarity
  • other post attributes (topic, format, sentiment)

Standard sorting & filter criteria

both:

  • activity (latest post/reply, frequency)
  • post:
    • average view completion score, in/excluding skipping; also replies
    • unseen/view completion
    • content length (word count; video: duration; text: estimated reading time)
    • language: precision/accuracy (specificity), information density (meaningful information per unit of data) and granularity (level of detail), cohesion (linguistic unity) and coherence (logical/semantic flow) scores

sortable by:

  • datetime of creation (latest|oldest)
  • popularity (highest|lowest): number of
    • post: engagements, seens; engagement or interaction/seens ratio (highest engagement/interaction rate)
    • followable: followers, combined post metrics
      • account: paid followers, affiliates
      • post list: views
      • topic|community (= members): posts, authors
  • default (mixed setting): prioritizing recent (< 48h or custom value) popularity
  • author diversity
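
One possible reading of the default (mixed) sort is: posts younger than the recency cutoff come first, and popularity decides the order within each group. The sketch below uses the engagement/seens ratio as popularity; the field names and this interpretation are assumptions.

```python
def engagement_rate(post):
    """Engagements per seen; 0.0 if the post has not been seen yet."""
    return post["engagements"] / post["seens"] if post["seens"] else 0.0


def default_sort(posts, now, recent_hours=48):
    """Recent posts first, each group ordered by engagement rate (highest first).

    `now` and post["created"] are Unix timestamps in seconds.
    """
    def key(post):
        age_hours = (now - post["created"]) / 3600
        is_recent = age_hours < recent_hours
        return (not is_recent, -engagement_rate(post))

    return sorted(posts, key=key)


now = 1_000_000
posts = [
    {"id": "a", "created": now - 1 * 3600, "engagements": 10, "seens": 100},
    {"id": "b", "created": now - 2 * 3600, "engagements": 50, "seens": 100},
    {"id": "c", "created": now - 100 * 3600, "engagements": 90, "seens": 100},
]
print([p["id"] for p in default_sort(posts, now)])  # ['b', 'a', 'c']
```

A smoother alternative would be a time-decayed score instead of a hard cutoff.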

filterable/groupable by (optional):

  • datetime range (of creation or combined with other fields, like popularity) (last 1h|24h|72h|7d|30d|6m|1y|10y; current or selected year/month/week/day; all-time; custom timespan)
  • geotag, if available
  • attributes inherited by children posts (like topics and formats)
  • account: type (personal|organizational|automated); professional category; PCF (parody, commentary, fan), premium (verified, ID-verified), (inter-)governmental, affiliated (also filterable by specific org), paid-followable (e.g., Twitter: subscribable, YouTube: joinable) (true|false)
  • post (attributes need refinement):
    • basic type
    • mode: fiction, non-fiction, poetry, interactive/performative, mixed
    • (main) purpose: e.g., instructive, informative (explanatory), entertaining, expressive/reflective, critical, aesthetic
    • formats: e.g., argumentative|narrative|descriptive essay, interview, speech, summary, news report, review, analysis, note, meme, poem, narrative, reportage, expository, observational, tutorial, lecture
    • genres: e.g., comedy, satire, educational, documentary, journalistic
    • perspective type: objective, subjective, mixed
    • topic(s)
    • (author/intrinsic) sentiments (positive|neutral|negative|opposed|ambivalent)
    • styles (formal: e.g., academic/scientific, corporate, bureaucratic/legal, not exclusive: journalistic, political|informal: e.g., poetic, slang, dialect, casual, colloquial, simple|profane/vulgar: e.g., swearing/cursing, insulting, discriminatory|illicit: e.g., (uncorrected) defamation of legal entities; doxing; threatening, inciting/instructing/praising crimes, such as violence, or self-harm (invisible and non-filterable, except by affected/targeted or authorized users))
    • tones (objective|neutral|analytical|robotic|enthusiastic|friendly|sympathetic|diplomatic|passionate|aggressive|angry|annoyed|confident|humorous|ironic|polemic|persuasive|urgent|surprised|apologetic|sad|skeptical|condescending)
    • hasNoteAttached (true|false)
    • original or manually dubbed languages (listed individually or via a user's supported languages)
    • content types (language (inline text, graphical text, audio): factual/opinion statement, quote, question, social expression, poll, statistic, external link, rich text/markdown (e.g., bold, italic, codeblock), (La)TeX; graphic (image/video): sensitive/age-restricted: violence, physical injury, nudity, psychotropics, medical; synthetic)
    • mentioned accounts/people
    • only for specific feeds: engagement type
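
Persistable including/excluding filters over these attributes could be modeled as a small serializable object. This is a sketch: the `FeedFilter` name, the JSON storage format and the single-valued post attributes are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class FeedFilter:
    """include/exclude map an attribute name to a list of values."""

    include: dict = field(default_factory=dict)
    exclude: dict = field(default_factory=dict)

    def matches(self, post):
        # Every included attribute must have one of the allowed values.
        for attr, allowed in self.include.items():
            if post.get(attr) not in allowed:
                return False
        # No excluded attribute may have one of the blocked values.
        for attr, blocked in self.exclude.items():
            if post.get(attr) in blocked:
                return False
        return True

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw):
        return cls(**json.loads(raw))


saved = FeedFilter(
    include={"basic_type": ["tweet", "thread"]},
    exclude={"tone": ["aggressive"]},
).to_json()
restored = FeedFilter.from_json(saved)
```

Storing the filter as JSON is what makes it persistable per feed; multi-valued attributes such as topics would need set intersection instead of the membership test used here.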

twister21 · Jan 23 '25 21:01