Speech accents
Summary
The accent system modifies speech before it is sent to chat in order to simulate speech defects or status effects. Text replacement rules are defined using a special format.
Motivation
While it is possible to type any accent manually, it is handy to have an automatic system. Additionally, accents can act as limitations, similar to vision, hearing and other impairments.
A custom format should simplify accent creation by focusing on rules.
The result of this should at least have feature parity with Unitystation accents, otherwise it is not worth the effort.
Guide-level explanation
Accents modify player speech in chat. Multiple accents can be applied on top of each other, making the message much less comprehensible.
Accents can be acquired in multiple ways: accent(s) selected during character creation, wearing items (clown mask), status effects (alcohol consumption, low health) and maybe others.
Replacements are found in multiple passes. Each pass inside an accent
has a name and consists of multiple rules which are combined into a
single regex. A rule says what to replace with what tag. The simplest
example of a rule is: replace hello with Literal("bonjour").
Literal is one of the tags, it replaces the original with the given string.
Note that hello is actually a regex pattern, so more complex things can
be matched.
Some of the tags are:
- Original: does not replace (leaves original match as is)
- Literal: puts given string
- Any: selects random inner replacement with equal weights
- Upper: converts inner result to uppercase
- Lower: converts inner result to lowercase
- Concat: runs left and right inner tags and adds them together
Some tags take others as an argument. For example, Upper:
Upper(Literal("bonjour")) will result in hello being replaced with
BONJOUR.
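The nesting of tags described above can be sketched as a small tree that is evaluated recursively. This is a simplified, hypothetical model written for illustration, not the actual sayit API: the `Tag` enum, its `apply` method and the `pick` callback are all assumed names. Since the standard library has no random number generator, the choice made by `Any` is supplied by the caller.

```rust
/// Hypothetical, simplified model of the tags listed above (not the sayit API).
#[derive(Debug, Clone)]
enum Tag {
    /// Leaves the matched text as is.
    Original,
    /// Replaces the match with a fixed string.
    Literal(String),
    /// Picks one of the inner tags; the caller decides which one.
    Any(Vec<Tag>),
    /// Uppercases whatever the inner tag produces.
    Upper(Box<Tag>),
    /// Lowercases whatever the inner tag produces.
    Lower(Box<Tag>),
    /// Runs the left and right tags and joins their output.
    Concat(Box<Tag>, Box<Tag>),
}

impl Tag {
    /// Produces the replacement for `matched`.
    /// `pick(n)` must return an index in `0..n`; a real implementation
    /// would use an RNG here. `Any` with no options panics in this sketch.
    fn apply<F: FnMut(usize) -> usize>(&self, matched: &str, pick: &mut F) -> String {
        match self {
            Tag::Original => matched.to_string(),
            Tag::Literal(text) => text.clone(),
            Tag::Any(options) => {
                let index = pick(options.len());
                options[index].apply(matched, pick)
            }
            Tag::Upper(inner) => inner.apply(matched, pick).to_uppercase(),
            Tag::Lower(inner) => inner.apply(matched, pick).to_lowercase(),
            Tag::Concat(left, right) => {
                let mut out = left.apply(matched, pick);
                out.push_str(&right.apply(matched, pick));
                out
            }
        }
    }
}
```

With this model, `Tag::Upper(Box::new(Tag::Literal("bonjour".to_string())))` applied to the match hello produces BONJOUR, matching the example above.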
It is possible to define multiple intensity levels of an accent in the
same file. You can make an accent get progressively worse as intensity goes
higher. Intensity can be either randomly assigned or get worse as an effect
progresses (you get more drunk).
RON example:
// This accent adds honks at the end of your messages (regex anchor $)
// On intensity 1+ it adds more honks and UPPERCASES EVERYTHING YOU SAY
(
    accent: {
        // `ending` pass. all regexes inside pass are merged. make sure to avoid overlaps
        "ending": (
            rules: {
                // 1 or 2 honks on default intensity of 0
                "$": {"Any": [
                    {"Literal": " HONK!"},
                    {"Literal": " HONK HONK!"},
                ]},
            },
        ),
    },
    intensities: {
        1: Extend({
            // merges with `ending` pass from accent body (intensity 0 implicitly)
            "ending": (
                rules: {
                    // overwrite "$" to be 2 to 3 honks
                    "$": {"Any": [
                        {"Literal": " HONK HONK!"},
                        {"Literal": " HONK HONK HONK!"},
                    ]},
                },
            ),
            // gets placed at the end as new pass because `main` did not exist previously
            "main": (
                rules: {
                    // uppercase everything you say
                    ".+": {"Upper": {"Original": ()}},
                },
            ),
        }),
    },
)
Reference-level explanation
General structure
Accent consists of 2 parts:
- accent: intensity 0
- intensities: a map from level to an enum of Extend or Replace, containing the intensity definition inside, same as accent
Accent is executed from top to bottom sequentially.
Regex patterns
Every pattern is compiled into a regex, meaning it has to use valid Rust regex syntax. While some features are missing (for example look-around and backreferences), the regex crate guarantees linear-time matching.
By default every regex is compiled with the (?mi) flags (a flag can be opted out of by
writing, for example, (?-m)).
Regexes inside each pass are merged, which significantly improves performance (~54x improvement for scotsman with 600+ rules) but does not handle overlaps. If you have overlapping regexes, those must be placed into separate passes.
Case mimicking
Messages look much better if replacements copy the original letter case. If the user was SCREAMING, you want your replacement to scream as well. If the user Capitalized something, you ideally want to preserve that. Best-effort case mimicking is enabled for Literal. This currently includes:
- do nothing if input is full lowercase
- if input is all uppercase, convert output to full uppercase
- if input and output have same lengths, copy case for each letter
This is currently ASCII only!!
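The three rules above could be sketched roughly like this. It is an illustration under assumptions, not the sayit implementation: `mimic_case` is a hypothetical helper name, and the fallback for mixed-case input of a different length (return the replacement unchanged) is a guess, since the list does not cover that case.

```rust
/// Best-effort, ASCII-only case mimicking following the three rules above.
/// `original` is the matched text, `replacement` is the literal to insert.
fn mimic_case(original: &str, replacement: &str) -> String {
    let has_lower = original.chars().any(|c| c.is_ascii_lowercase());
    let has_upper = original.chars().any(|c| c.is_ascii_uppercase());

    // Rule 1: input without any uppercase letters leaves the replacement untouched.
    if !has_upper {
        return replacement.to_string();
    }
    // Rule 2: all-uppercase input makes the whole replacement uppercase.
    if !has_lower {
        return replacement.to_ascii_uppercase();
    }
    // Rule 3: mixed case with equal lengths copies the case letter by letter.
    if original.is_ascii() && replacement.is_ascii() && original.len() == replacement.len() {
        return original
            .chars()
            .zip(replacement.chars())
            .map(|(o, r)| {
                if o.is_ascii_uppercase() {
                    r.to_ascii_uppercase()
                } else if o.is_ascii_lowercase() {
                    r.to_ascii_lowercase()
                } else {
                    r
                }
            })
            .collect();
    }
    // Assumed fallback: no obvious mapping, keep the replacement as written.
    replacement.to_string()
}
```

Under these assumptions, HELLO replaced with bonjour becomes BONJOUR, while Hello replaced with salut becomes Salut because both strings have the same length.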
Regex templating
Regex provides a powerful templating feature for free. It allows
capturing parts of regex into named or numbered groups and reusing them
as parts of replacement.
For example, Original is Literal("$0") where $0 expands to entire
regex match.
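To show what the expansion of $0 and friends amounts to, here is a deliberately simplified, standard-library-only stand-in. The real work is done by the regex crate's replacement syntax; the `expand` function below is a hypothetical helper that only understands numbered groups and ignores named groups and escapes.

```rust
/// Expands `$N` references in `template` using `groups`,
/// where `groups[0]` is the entire match and `groups[1..]` are capture groups.
/// Simplified stand-in: numbered groups only, no named groups or `$$` escapes.
fn expand(template: &str, groups: &[&str]) -> String {
    let mut out = String::new();
    let mut chars = template.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        // Collect the digits that directly follow `$`.
        let mut digits = String::new();
        while let Some(d) = chars.peek().copied().filter(|d| d.is_ascii_digit()) {
            digits.push(d);
            chars.next();
        }
        match digits.parse::<usize>() {
            // Known group: insert the captured text. Unknown group: insert nothing.
            Ok(index) => out.push_str(groups.get(index).copied().unwrap_or("")),
            // `$` without a usable group number is kept literally.
            Err(_) => out.push('$'),
        }
    }
    out
}
```

In this model Original is simply `expand("$0", groups)`, which returns the whole match unchanged.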
Tag trait
There are multiple default tags, but when they are not enough, the Tag trait can be implemented. With the deserialize feature, a custom implementation can then be referenced by name in accent files. An implementation of Tag could look like this (not final):
use sayit::{
    Accent,
    Match,
    Tag,
};

// Deserialize is only required with `deserialize` crate feature
#[derive(Clone, Debug, serde::Deserialize)]
// transparent allows using `true` directly instead of `(true)`
#[serde(transparent)]
pub struct StringCase(bool);

// `typetag` is only required with `deserialize` crate feature
#[typetag::deserialize]
impl Tag for StringCase {
    // true uppercases the match, false lowercases it
    fn generate<'a>(&self, m: &Match<'a>) -> std::borrow::Cow<'a, str> {
        if self.0 {
            m.get_match().to_uppercase()
        } else {
            m.get_match().to_lowercase()
        }.into()
    }
}

fn main() {
    // construct accent that will uppercase all instances of "a" and lowercase all "b"
    let accent = ron::from_str::<Accent>(
        r#"
        (
            accent: {
                "main": (
                    rules: {
                        "a": {"StringCase": true},
                        "b": {"StringCase": false},
                    }
                ),
            }
        )
        "#,
    )
    .expect("accent did not parse");

    // patterns are case-insensitive by default, so "A" and "B" are matched too
    assert_eq!(accent.say_it("abab ABAB Hello", 0), "AbAb AbAb Hello");
}
Intensities
Default intensity is 0 and it is always present in an accent. Higher
intensities can be declared in the optional top-level intensities map,
keyed by intensity level. This map is sparse, meaning you can skip levels.
The highest defined level that does not exceed the requested intensity is selected.
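Level selection over a sparse map can be sketched with an ordered map. This assumes that "highest possible level" means the highest defined level that is not above the requested intensity; `select_level` is a hypothetical helper, not part of sayit.

```rust
use std::collections::BTreeMap;

/// Returns the definition of the highest defined level that does not exceed
/// `requested`. Level 0 is assumed to always be present, as stated above,
/// so `None` only happens for a map that violates that assumption.
fn select_level<T>(levels: &BTreeMap<u64, T>, requested: u64) -> Option<&T> {
    levels
        .range(..=requested) // all defined levels up to and including `requested`
        .next_back() // the highest of them
        .map(|(_, definition)| definition)
}
```

For an accent that defines levels 0, 1 and 5, a requested intensity of 3 would fall back to level 1 under this reading.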
There are 2 ways to define an intensity:
- Replace starts from scratch and only has its own set of rules.
- Extend recursively looks at lower intensities down to 0 and merges them together. If a pattern conflicts with an existing pattern on a lower level, it is replaced (its relative position remains the same). All new rules are added at the end of merged words and patterns arrays.
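The difference between Replace and Extend can be sketched for a single pass, with rules modelled as an ordered list of (pattern, replacement) string pairs. This is a simplification for illustration: the `Rule` alias, the `Level` enum and both functions are hypothetical, and real rules hold tags rather than plain strings.

```rust
/// A rule modelled as (pattern, replacement); order matters.
type Rule = (String, String);

/// Hypothetical intensity definition for one pass.
enum Level {
    /// Starts from scratch with only its own rules.
    Replace(Vec<Rule>),
    /// Merges its rules on top of everything below it.
    Extend(Vec<Rule>),
}

/// Merges `higher` on top of `lower`: a conflicting pattern is overwritten
/// in place (keeping its position), new patterns are appended at the end.
fn extend(lower: &[Rule], higher: &[Rule]) -> Vec<Rule> {
    let mut merged = lower.to_vec();
    for (pattern, replacement) in higher {
        if let Some(index) = merged.iter().position(|(existing, _)| existing == pattern) {
            merged[index].1 = replacement.clone();
        } else {
            merged.push((pattern.clone(), replacement.clone()));
        }
    }
    merged
}

/// Resolves the effective rules given intensity 0 (`base`) and the defined
/// higher levels in ascending order, ending with the selected level.
fn resolve(base: &[Rule], levels: &[Level]) -> Vec<Rule> {
    let mut current = base.to_vec();
    for level in levels {
        current = match level {
            Level::Replace(rules) => rules.clone(),
            Level::Extend(rules) => extend(&current, rules),
        };
    }
    current
}
```

Walking upwards from intensity 0 gives the same result as the recursive description: an Extend level sees everything accumulated below it, while a Replace level discards it.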
Drawbacks
Accent system as a whole
Some people might find accents annoying. Impacts server performance by ~0.0001%
Tag system performance
This is mostly mitigated by merging regexes.
~~List of regular expressions will never be as performant as static replacements. There are some potential optimizations like merging patterns without any regex escape codes or some smart way to run replacements in parallel, but list of static strings can be replaced efficiently.~~
~~Other aspect of tag system is layers which add some overhead unless compiled down but even then some tags might need nesting.~~
~~While these can be partially mitigated, it would increase code complexity significantly.~~
Memory footprint
Compiled regexes are pretty large. The Scotsman accent alone in the CLI tool on a release build shows up as ~130 MB, although I am not sure I measured it correctly.
Executable size / extra dependencies
Library was made as minimal as possible with 37 dependencies and ~1.1M .rlib size. Further size decrease is possible by disabling regex optimizations.
~~Due to complexity of deserializable trait and dependency on regex there are ~40 total dependencies in current WIP implementation and .rlib release file is ~1.2M (unsure if it's correct way to measure binary size).~~
Regex rule overlaps
This has been solved by regex passes.
~~It is harder (or maybe even impossible) to detect overlaps between regex patterns as opposed to static strings. Users must be careful to not overwrite other rules.~~
Patterns overwrite words
This has been solved by regex passes.
~~This problem is essentially the same as previous one. Rules are executed top to bottom, words first and then patterns. It makes it hard or in some cases even impossible to adequately combine words and single/double character replacements.~~
Extreme verbosity
Even the simplest tags like {"Literal": "..."} are extremely verbose. Ideally I would want to deserialize String -> Literal, Vec<Box<dyn Tag>> -> Any, Map<u64, Box<dyn Tag>> -> Weights, but I have not found a way to do this yet. Not sure if it is possible.
Additionally there is a lot of nesting. I tried my best to keep accents as flat as possible but there is simply too much going on.
Rationale and alternatives
Accent system as a whole
The alternative to having an accent system is typing everything by hand all the time and hoping players roleplay status effects.
Tag system
As for the tag system, it potentially allows expressing very complex patterns, including arbitrary code via custom Tag impls that could in theory even make an HTTP request or run an LLM (let's not do that).
While being powerful and extensible, tag syntax remains readable.
Regex patterns
While being slower than static strings, regex is a powerful tool that can simplify many accents.
Prior art
Other games
SS13
As far as I know, byond stations usually use json files with rules.
This works but has limitations.
Unitystation
Unitystation uses some proprietary Unity yaml asset format which they use to define lists of replacements - words and patterns. After all replacements custom code optionally runs.
Accent code: https://github.com/unitystation/unitystation/blob/be67b387b503f57c540b3311028ca4bf965dbfb0/UnityProject/Assets/Scripts/ScriptableObjects/SpeechModifier.cs
Folder with accents (see .asset files): https://github.com/unitystation/unitystation/tree/develop/UnityProject/Assets/ScriptableObjects/Speech
This is same system as byond and it has limitations.
SS14
Space Station 14 does not have any format. They define all accents in pure C#.
Spanish accent: https://github.com/space-wizards/space-station-14/blob/effcc5d8277cd28f9739359e50fc268ada8f4ea6/Content.Server/Speech/EntitySystems/SpanishAccentSystem.cs#L5
This is simplest to implement but results in repetitive code and is harder to read. This code is also hard to keep uniform across different accents.
There is a helper method that handles word replacements with localization and case mimicking: https://github.com/space-wizards/space-station-14/blob/a0d159bac69169434a38500b386476c7affccf3d/Content.Server/Speech/EntitySystems/ReplacementAccentSystem.cs
Similar behaviour might be possible with custom Tag implementation that looks up localized string at creation time and seeds internal Literal with it.
Unresolved questions
- ~~Tag trait!!!~~
- ~~How to integrate this with SSNT~~
- ~~Custom trait options/message passing/generic over settings - likely impossible~~
- Do the benefits of the tag system outweigh the complexity that comes with it
- ~~Minimal set of replacement tags~~
- ~~Maybe a way to completely redefine accent / extend it like default Unitystation behaviour where custom code runs after all rules~~ this is likely covered by passes/custom Tag implementations
- ~~How complex should be string case mimicking~~
- The optimal way to do repetitions
- Reusing data: you might want to add 2 items to array of 1000 words in next intensity level or use said array between multiple rules
- ~~Do tags need to have access to some state/context~~ not now
Future possibilities
Accent system could possibly be reused for speech jumbling system: turning speech into junk for non-speakers. One (bad) example might be robot communications visible as ones and zeros for humans.
I am currently working on proof of concept for this tag system at https://git.based.computer/fogapod/sayit