Word prediction and correction
Is there a possibility to have a predictive and corrective row on the keyboard or is there no chance for such an option?
The keyboard really lacks this feature but unfortunately, it's a huge amount of work and I don't plan to do it this year. Of course, contributions are welcome.
This keyboard is the best one I've tried so far for users who need precise control; this feature would make it perfect for everyday use. FlorisBoard is currently implementing this feature in their v4 branch, maybe we can get some inspiration? I'm surprised there is no library to modularize this common need of all mobile keyboards.
/EDIT: Also, OpenBoard already implements word correction and prediction, and is to my knowledge the only open-source keyboard with this feature working reasonably well for production use, although it is much less effective than AI-based prediction such as SwiftKey's.
Ok so I did a little literature review (I am a machine learning scientist and I worked on a few NLP models in the past). There are 3 interesting implementations, ranging from easiest to hardest to implement, but with inversely proportional performance:
- for GBoard/OpenBoard-like performance, I think a simple n-grams approach is sufficient. It's a very low-effort implementation because there is an awesome tutorial and open-source implementation for Android in Kotlin that dates from 2019: https://proandroiddev.com/android-predictive-keyboard-e6c9df01e527 and https://github.com/mccorby/SmartKeyboardNgram-Android
- an evolution of this classical model is called the Stupid Backoff n-grams model, first published in 2007 and apparently what SwiftKey and other keyboards with great predictive word capabilities used. Original T. Brants 2007 paper: https://aclanthology.org/D07-1090.pdf ; implementation in R which mentions it was done in partnership with SwiftKey: https://github.com/RenatoPdosSantos/word-predictor . The implementation is quite complex; it will likely be easier to reimplement from scratch, but the R implementation can be used to check tricky implementation details. Worth noting this is an early LLM (Large Language Model), a predecessor of GPT.
- last but not least, a technological leap would be to implement a local GPT model. There exist what are called distilled GPT models that are much smaller and run locally, in the browser in JS and on mobile devices such as Android phones, but to my knowledge they are all based on GPT-2, not GPT-3.5 nor GPT-4, which are MUCH better. Once there are some that work on most devices within a reasonable timeframe (i.e., computing predictions within milliseconds, not seconds), this will blow all other models out of the water. Currently no keyboard provides such capabilities, so this option is an open one for the future, but it's worth mentioning because, as the tech matures, GPT-style models will surely become the state of the art for word correction and prediction on mobile in a few years.
Thanks for the info! The feature I have in mind would be word correction from several large dictionaries (several languages, custom dictionaries, emoji names, etc.) with a deterministic output.
I would drop word prediction in exchange for any of determinism, fast queries, or compact dictionaries.
Ah, then a simple distance metric such as Levenshtein distance can do the trick? The issue is that different distance metrics correct different kinds of errors: some support deletions (i.e., a missing character), whereas others only support replacements (i.e., a character replaced by another, so the word keeps the same length). Also I'm not sure about the algorithmic bound; usually they are n² for learning and then n for inference (n being the size of the dictionary), but maybe there are newer algos I am not aware of. (/EDIT: ah, maybe with trees, this may make inference log(n) instead of n)
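For illustration, here is a minimal Levenshtein distance in Kotlin. It is the textbook dynamic-programming version (nothing project-specific; the function name is mine), and it handles insertions, deletions, and substitutions, so it can match a typo against dictionary words of a different length:

```kotlin
// Classic dynamic-programming Levenshtein distance, one row at a time.
// Supports insertions, deletions, and substitutions, so it can match a
// typo against dictionary words of different lengths.
fun editDistance(a: String, b: String): Int {
    val dp = IntArray(b.length + 1) { it }  // distances against the empty prefix of `a`
    for (i in 1..a.length) {
        var prevDiag = dp[0]  // holds dp[i-1][j-1]
        dp[0] = i
        for (j in 1..b.length) {
            val tmp = dp[j]
            dp[j] = minOf(
                dp[j] + 1,                                     // deletion
                dp[j - 1] + 1,                                 // insertion
                prevDiag + if (a[i - 1] == b[j - 1]) 0 else 1  // substitution
            )
            prevDiag = tmp
        }
    }
    return dp[b.length]
}

fun main() {
    println(editDistance("keybaord", "keyboard"))  // 2 (a transposition counts as two edits)
}
```

A naive correction pass compares the input against every dictionary word, so each query is O(n) in the dictionary size; BK-trees or tries over the dictionary prune most candidates, which is the log-like speedup mentioned in the edit.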
Also worth noting that n-gram-based approaches are deterministic as long as you use the same input dictionary, and you get very fast inference and word prediction for free, in addition to word correction.
I was wondering whether it is useful to develop a custom SpellCheckerService (the part that provides the actual suggestions) or to just "plug in" to the Android spell checker framework to begin with. As a first step, it could already be helpful for many users to use already-existing SpellCheckerServices (e.g. AOSP or OpenBoard) in the background by following https://developer.android.com/develop/ui/views/touch-and-input/spell-checker-framework#SpellCheckClient. If that is not sufficient, a custom spell checker could be developed afterwards. I don't know if I am missing something here, but in theory this seems like a good and easy first approach to implement spell checking without much overhead. ~~On the other side, it seems like a custom user-defined dictionary would have to be implemented by the client (here the keyboard) itself.~~ Of course, all this requires third-party spell checkers to be installed on the system, but if that is not already the case, maybe it's a sacrifice users would make in order to get spell checking working. The TextServicesManager also provides functions to check if spell checkers are available; maybe a dialog could be shown to inform users about the situation, although these functions require a fairly high API level. There are samples available from the AOSP project at https://github.com/Miserlou/Android-SDK-Samples/tree/master/SpellChecker. Maybe it's worth a shot.
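To make the "plug in to the framework" idea concrete, here is a rough Kotlin sketch of querying whatever spell checker the system provides. It only uses documented framework calls (`TextServicesManager`, `SpellCheckerSession`); the class name and the logging are illustrative, and it assumes an English spell checker is enabled on the device:

```kotlin
import android.content.Context
import android.view.textservice.*
import java.util.Locale

// Query whichever SpellCheckerService the system provides (AOSP,
// OpenBoard, ...) through the TextServicesManager, as described in the
// spell-checker-framework docs. Assumes a spell checker is enabled.
class SuggestionFetcher(context: Context) : SpellCheckerSession.SpellCheckerSessionListener {
    private val tsm =
        context.getSystemService(Context.TEXT_SERVICES_MANAGER_SERVICE) as TextServicesManager
    private val session: SpellCheckerSession? =
        tsm.newSpellCheckerSession(
            null, Locale.ENGLISH, this,
            /* referToSpellCheckerLanguageSettings = */ true
        )

    fun fetchSuggestions(text: String) {
        // Up to 5 suggestions per fragment; results arrive in the callback below.
        session?.getSentenceSuggestions(arrayOf(TextInfo(text)), 5)
    }

    override fun onGetSentenceSuggestions(results: Array<SentenceSuggestionsInfo>) {
        for (result in results)
            for (i in 0 until result.suggestionsCount) {
                val info = result.getSuggestionsInfoAt(i)
                for (j in 0 until info.suggestionsCount)
                    android.util.Log.d("SpellCheck", info.getSuggestionAt(j))
            }
    }

    override fun onGetSuggestions(results: Array<SuggestionsInfo>) {
        // Older word-level callback; getSentenceSuggestions() reports above instead.
    }
}
```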
@deftkHD Good idea! With that implemented, someone can focus on the dictionary later.
I have a LanguageTool server on my NAS. It would be nice if I could use language correction with that server.
Maybe you can work together on language support with https://anysoftkeyboard.github.io/ ?
After some research, I'd suggest as the best solution in this case a simple implementation of n-grams with smoothing, as it strikes a good balance between simplicity, performance, and accuracy while keeping the implementation manageable for contributors. A more involved approach would be a combination of deterministic and non-deterministic methods, as @lrq3000 said, taking inspiration from Gboard and equivalent solutions out there, possibly with an opt-in choice for the user to use an external service. That covers word prediction; when it comes to spell checking, I'd definitely go for Android's SpellChecker API (if it's available for the user) before going any further.
Here is a summary of one of the discussions for reference (context as RAG, `gpt-4o-2024-11-20`, `gemini-2.0-flash-exp` at the end):
Q: What's the best way to implement next word prediction using Android SDK?
Implementing next-word prediction using the Android SDK involves several considerations, including the choice of prediction model, dictionary resources, and integration with the input method framework. Below, I outline the best approaches, starting from simpler implementations to more advanced methods:
1. N-Grams Approach (Simplest and Deterministic)
The n-grams model is a probabilistic language model that predicts the next word based on the previous n-1 words. This is a lightweight, deterministic, and relatively easy-to-implement solution.
Steps:
1. Prepare an N-Gram Dataset:
   - Use an open-source corpus to generate n-grams (e.g., trigrams or bigrams).
   - Store the n-grams and their frequencies in a database or a dictionary-like structure.
2. Query the N-Gram Model:
   - When a user types a word, extract the last n-1 words.
   - Search the n-gram database for matching entries and rank predictions by frequency (a Kotlin sketch follows these steps).
3. Integrate with the Keyboard:
   - Display the top-ranked predictions in a suggestion row above the keyboard.
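To make steps 1 and 2 concrete, a minimal in-memory sketch in Kotlin; the hard-coded map is a stand-in for whatever n-gram database a real implementation would use, and all names are illustrative:

```kotlin
// Minimal in-memory trigram model: maps the two previous words to
// candidate next words with their corpus counts. In a real keyboard the
// counts would be precomputed offline and loaded from storage.
class TrigramModel(private val counts: Map<Pair<String, String>, Map<String, Int>>) {

    // Return the top-k next-word candidates for the last two typed words.
    fun predict(prev2: String, prev1: String, k: Int = 3): List<String> =
        counts[prev2 to prev1]
            ?.entries
            ?.sortedByDescending { it.value }
            ?.take(k)
            ?.map { it.key }
            ?: emptyList()
}

fun main() {
    val model = TrigramModel(
        mapOf(("I" to "am") to mapOf("happy" to 10, "going" to 7, "sad" to 2))
    )
    println(model.predict("I", "am"))  // [happy, going, sad]: top 3 by frequency
}
```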
Tools & Resources:
- Tutorial: Android Predictive Keyboard
- Example Code: SmartKeyboardNgram-Android
Pros:
- Lightweight and fast.
- Deterministic and easy to implement.
- No external dependencies.
Cons:
- Limited understanding of context (short-term memory only).
- Performance relies heavily on the quality of the training corpus.
2. Stupid Backoff N-Grams (Improved Prediction)
The "Stupid Backoff" model is an optimized version of n-grams, as described in Google's 2007 paper. It uses a scoring mechanism to back off to lower-order n-grams when higher-order ones are unavailable.
Steps:
1. Build the Model:
   - Extend the n-gram model with backoff probabilities.
   - For example, if a trigram is not found, fall back to a bigram with a discounted score.
2. Implementation:
   - Use the same querying mechanism as the n-gram model but add logic for backoff scoring (see the sketch at the end of this section).
3. Integration:
   - Similar to the n-gram approach, integrate results into the keyboard UI.
Tools & Resources:
- Paper: Stupid Backoff (2007)
- Example Implementation: R Implementation
Pros:
- Better predictions than basic n-grams.
- Still deterministic and lightweight.
- Suitable for mobile devices.
Cons:
- Slightly more complex to implement.
- Requires fine-tuning backoff parameters.
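For reference, the backoff scoring itself fits in a few lines. The Kotlin sketch below assumes trigram/bigram/unigram counts are already available as maps (an assumption about the storage layer, not a prescription) and uses the paper's discount factor of 0.4; note the paper writes the score as S rather than P because it is not a normalized probability:

```kotlin
// Stupid Backoff scoring (Brants et al., 2007): use the relative frequency
// of the highest-order n-gram that has been seen, discounting by alpha
// (0.4 in the paper) for each back-off step. Scores are not normalized
// probabilities, which is exactly what makes the method cheap.
class StupidBackoff(
    private val trigrams: Map<List<String>, Int>,  // ["i","am","happy"] -> count
    private val bigrams: Map<List<String>, Int>,   // ["i","am"] -> count
    private val unigrams: Map<String, Int>,        // "am" -> count
    private val totalWords: Int,
    private val alpha: Double = 0.4,
) {
    fun score(context: List<String>, word: String): Double {
        var discount = 1.0
        if (context.size >= 2) {
            val tri = trigrams[context.takeLast(2) + word] ?: 0
            if (tri > 0) return discount * tri / (bigrams[context.takeLast(2)] ?: 1)
            discount *= alpha  // trigram unseen: back off to the bigram
        }
        if (context.isNotEmpty()) {
            val bi = bigrams[listOf(context.last(), word)] ?: 0
            if (bi > 0) return discount * bi / (unigrams[context.last()] ?: 1)
            discount *= alpha  // bigram unseen: back off to the unigram
        }
        // Final fallback: discounted unigram relative frequency.
        return discount * (unigrams[word] ?: 0) / totalWords
    }
}
```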
3. Android SpellChecker Framework (Leverage Existing APIs)
If prediction is not the primary focus, the Android SpellChecker framework can be used for word suggestions and corrections.
Steps:
1. Use TextServicesManager:
   - Access the system's `SpellCheckerService` via the `TextServicesManager`.
   - Query for suggestions based on the input text.
2. Integration:
   - Display suggestions from the spell checker in a prediction row above the keyboard.
3. Custom Implementation:
   - If the built-in spell checker is insufficient, implement a custom `SpellCheckerService` for more control.
Tools & Resources:
- Documentation: Spell Checker Framework
- Example Code: AOSP Spell Checker Samples
Pros:
- Minimal effort for basic suggestions.
- Can leverage existing spell checkers like OpenBoard or AOSP.
Cons:
- Limited to installed spell checkers on the user's device.
- Not suitable for advanced prediction models.
4. Neural Network Models (Advanced Prediction)
Neural networks, such as LSTMs, Transformers, or GPT-based models, are state-of-the-art for language prediction. However, they require significantly more computational resources.
Steps:
1. Choose a Lightweight Model:
   - Use a pre-trained, lightweight model (e.g., DistilGPT or TinyBERT) that can run on mobile devices.
   - Alternatively, use a distilled version of GPT-2 for local predictions.
2. Inference on Android:
   - Use TensorFlow Lite (TFLite) or ONNX Runtime to run the model on the user's device (see the sketch at the end of this section).
   - Implement a pipeline to tokenize input text, generate predictions, and detokenize output.
3. Integration:
   - Display the top predictions in a suggestion row.
Tools & Resources:
- TensorFlow Lite: TFLite Guide
- Distilled GPT Models: DistilGPT
- Android ML Kit: Google ML Kit
Pros:
- High-quality predictions with contextual understanding.
- Capable of handling complex grammar and rare words.
Cons:
- Resource-intensive.
- Requires expertise in machine learning and NLP.
- Larger app size.
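As a sketch of what the inference step could look like with TensorFlow Lite: the `Interpreter` API below is TFLite's standard entry point, but the model file, the 128-token input shape, and the vocab-sized output are assumptions about a hypothetical distilled next-word model, not an existing artifact:

```kotlin
import org.tensorflow.lite.Interpreter
import java.io.File

// Sketch of on-device inference with TensorFlow Lite. The input/output
// shapes must match however the hypothetical model was exported.
class NeuralPredictor(modelFile: File, private val vocabSize: Int) {
    private val interpreter = Interpreter(modelFile)

    // `tokenIds` is the tokenized context, already padded/truncated to the
    // length the model was exported with (assumed here to be 128 tokens).
    fun nextWordLogits(tokenIds: IntArray): FloatArray {
        val input = arrayOf(tokenIds)                // shape [1, 128]
        val output = arrayOf(FloatArray(vocabSize))  // shape [1, vocabSize]
        interpreter.run(input, output)
        return output[0]  // unnormalized scores, one per vocabulary entry
    }
}
```

Tokenization and detokenization (step 2 of the pipeline) are deliberately left out; they depend entirely on how the model was trained.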
5. Hybrid Approach (Prediction + Correction)
Combine deterministic methods (n-grams or spell checker) with neural network models for a hybrid solution.
Steps:
1. Use N-Grams for First Prediction:
   - Generate quick predictions using an n-gram model.
2. Neural Model for Refinement:
   - Use a lightweight neural model to re-rank or refine predictions.
3. Fallback to Spell Checker:
   - If no prediction is available, query the spell checker for suggestions (the sketch after this section shows the fallback chain).
Pros:
- Balances performance and computational cost.
- Provides better predictions than n-grams alone.
- More responsive on low-end devices.
Cons:
- Increased implementation complexity.
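The fallback chain can be sketched independently of the concrete components. In the Kotlin sketch below, `NgramModel`, `Reranker`, and `SpellChecker` are placeholder interfaces for the pieces described in the sections above; only the orchestration logic is the point:

```kotlin
// Orchestration sketch for the hybrid approach. The three interfaces are
// placeholders for whatever concrete implementations are chosen; only the
// fallback logic matters here.
interface NgramModel   { fun predict(context: List<String>): List<String> }
interface Reranker     { fun rerank(context: List<String>, candidates: List<String>): List<String> }
interface SpellChecker { fun suggest(lastWord: String): List<String> }

class HybridPredictor(
    private val ngrams: NgramModel,
    private val reranker: Reranker?,  // optional neural refinement
    private val speller: SpellChecker,
) {
    fun suggestions(context: List<String>): List<String> {
        val fast = ngrams.predict(context)  // cheap deterministic first pass
        if (fast.isNotEmpty())
            return reranker?.rerank(context, fast) ?: fast
        // No n-gram hit: fall back to spell-checker suggestions for the last word.
        return context.lastOrNull()?.let(speller::suggest) ?: emptyList()
    }
}
```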
Recommendations:
- For Simplicity: Start with an n-grams model. It is straightforward, efficient, and deterministic, making it a good fit for the current goals of the Unexpected Keyboard project.
- For Advanced Features: Explore the hybrid approach, combining n-grams with neural networks for better predictions without sacrificing performance.
Additional Tips:
- Use efficient data structures like tries or prefix trees for storing dictionaries.
- Cache predictions for frequently used phrases to improve response times (a minimal sketch follows this list).
- Allow users to train or customize the dictionary for better personalization.
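The caching tip above can be as small as an access-ordered `LinkedHashMap` used as an LRU cache; a minimal sketch, where the 512-entry capacity is an arbitrary example:

```kotlin
// Tiny LRU cache for frequent contexts, using LinkedHashMap's
// access-ordered mode to evict the least recently used entry.
class PredictionCache(private val capacity: Int = 512) {
    private val map = object : LinkedHashMap<String, List<String>>(capacity, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<String, List<String>>) =
            size > capacity
    }

    // Return the cached predictions for `context`, computing and storing
    // them on a miss.
    fun getOrCompute(context: String, compute: (String) -> List<String>): List<String> =
        map.getOrPut(context) { compute(context) }
}
```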
By starting with simpler models (n-grams) and iteratively improving, you can balance user experience with development complexity.
Q: Is "N-Gram with Smoothing" best solution in this case?
"N-Gram with Smoothing" can be considered a good solution for basic next-word prediction in the context of the Unexpected Keyboard, but whether it is the best solution depends on several factors. Let's evaluate its suitability based on the requirements and constraints outlined in the repository and issue discussion.
Why N-Gram with Smoothing is a Good Fit
1. Deterministic Output:
   - As stated by the repository maintainer, Julow, the desired feature prioritizes deterministic output over probabilistic or AI-driven models. N-Gram with smoothing (e.g., Stupid Backoff, Laplace Smoothing) is still deterministic, as it ranks predictions based on pre-defined probabilities or frequencies from a training corpus.
2. Compact Dictionaries:
   - N-Grams are relatively lightweight compared to more advanced models like GPT-based prediction. They rely on compact dictionaries of word sequences and their frequencies, which aligns with the goal of having fast queries and small memory usage.
3. Low Complexity:
   - N-Grams with smoothing are computationally simple to implement and require significantly fewer resources than deep learning models. This is ideal for a lightweight keyboard like Unexpected Keyboard, which avoids heavy processing.
4. Performance:
   - With smoothing techniques (e.g., Stupid Backoff), N-Grams can handle unseen word combinations gracefully by falling back to lower-order N-Grams (e.g., from trigrams to bigrams or unigrams). This provides acceptable levels of accuracy for everyday use.
5. Customizability:
   - You can train the model with custom dictionaries (e.g., domain-specific terms, user dictionaries, emoji names) and adapt it to multiple languages. This aligns with the project's emphasis on supporting multiple large dictionaries.
Limitations of N-Gram with Smoothing
While N-Gram with smoothing is a good baseline, it has some limitations that might make it less ideal for advanced use cases:
1. Limited Context Understanding:
   - N-Grams only consider a fixed number of previous words (e.g., 2 for bigrams, 3 for trigrams). This limits their ability to understand long-term dependencies or complex sentence structures.
2. Accuracy:
   - Compared to modern neural models, N-Gram predictions are less accurate, especially for predicting rare or complex word combinations.
3. Scalability:
   - As the size of the dictionary grows (e.g., for multi-language support), the model's size and query time can increase. Optimized data structures (like tries or prefix trees) can mitigate this but add extra complexity.
4. No Semantic Understanding:
   - N-Grams operate purely on word frequencies and lack semantic understanding. For example, they cannot infer that "dog" and "cat" are related concepts.
5. Static Training Data:
   - Unless the model is periodically updated with user input or new corpora, its predictions may become outdated over time.
Alternatives to N-Gram with Smoothing
While N-Gram with smoothing is a solid starting point, you might consider alternatives or enhancements depending on the desired trade-offs between complexity, accuracy, and performance:
1. SpellChecker Framework Integration
- Use Case: If the focus is more on word correction than prediction.
- Advantages:
  - Leverages Android's built-in `SpellCheckerService` for suggestions.
  - Minimal development effort; mostly integrates existing APIs.
  - Deterministic and lightweight.
- Disadvantages:
  - Limited to installed system spell checkers.
  - Less accurate for next-word prediction.
2. Hybrid N-Gram + Trie
- Description: Use a trie (prefix tree) to efficiently store and query N-Grams. Combine this with smoothing techniques.
- Advantages:
- Reduces the memory footprint of N-Gram models.
- Speeds up queries by leveraging the trie’s structure for prefix matching.
- Still lightweight and deterministic.
- Disadvantages:
- Slightly more complex implementation.
- Limited to N-Gram-level accuracy.
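A bare-bones version of the trie half of this idea, for illustration (class and method names are mine): it stores words with frequencies and returns completions of a prefix, most frequent first:

```kotlin
// Bare-bones prefix trie: stores words with frequencies and lists
// completions of a prefix, most frequent first.
class Trie {
    private class Node {
        val children = HashMap<Char, Node>()
        var frequency = 0  // > 0 marks the end of a stored word
    }

    private val root = Node()

    fun insert(word: String, frequency: Int) {
        var node = root
        for (ch in word) node = node.children.getOrPut(ch) { Node() }
        node.frequency = frequency
    }

    fun complete(prefix: String, k: Int = 3): List<String> {
        // Walk down to the node matching the prefix, if any.
        var node = root
        for (ch in prefix) node = node.children[ch] ?: return emptyList()
        // Collect every word below that node, then keep the k most frequent.
        val results = ArrayList<Pair<String, Int>>()
        fun walk(n: Node, word: StringBuilder) {
            if (n.frequency > 0) results.add(word.toString() to n.frequency)
            for ((ch, child) in n.children) {
                word.append(ch); walk(child, word); word.deleteCharAt(word.length - 1)
            }
        }
        walk(node, StringBuilder(prefix))
        return results.sortedByDescending { it.second }.take(k).map { it.first }
    }
}
```

The frequencies at terminal nodes are where the N-Gram (or user-dictionary) counts would plug in.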
3. Neural Networks (e.g., LSTM or Transformer)
- Use Case: If higher accuracy and contextual understanding are required.
- Advantages:
- Significantly better at understanding long-term dependencies and semantics.
- Can adapt to complex sentence structures and rare word combinations.
- Disadvantages:
- Computationally expensive.
- Requires significant storage and memory for the model.
- Non-deterministic outputs and larger app size.
- Likely overkill for a lightweight keyboard.
4. User-Centric Hybrid Solution
- Combine N-Gram with Smoothing and User Dictionaries:
- Use N-Grams for general predictions.
- Allow users to add custom words or phrases to a lightweight trie-based user dictionary.
- Advantages:
- Improves personalization and prediction accuracy.
- Lightweight and deterministic.
- Disadvantages:
- Requires additional UI and logic for managing user dictionaries.
Is N-Gram with Smoothing the Best Solution?
Given the context of the Unexpected Keyboard project’s goals:
- Lightweight.
- Fast queries.
- Deterministic behavior.
- Support for large dictionaries.
N-Gram with Smoothing is one of the best solutions for this specific use case. It strikes a good balance between simplicity, performance, and accuracy while keeping the implementation manageable for contributors.
Implementation Tips for N-Gram with Smoothing
1. Data Structure:
   - Use a trie or hashmap to store N-Grams for efficient querying.
   - Example: `{("I", "am"): {"happy": 10, "sad": 2}}` for a trigram model.
2. Smoothing Technique:
   - Implement Stupid Backoff:
     - If a trigram is unavailable, fall back to a bigram (discounted by a factor, e.g., 0.4).
     - If a bigram is unavailable, fall back to a unigram.
   - Formula: `P(w|context) = count(w|context) / count(context)` if `count(context) > 0`, else `alpha * P(w|lower-order context)`.
3. Training Corpus:
   - Use open datasets like Common Crawl or Wikipedia Dumps for generating N-Grams.
   - Pre-process the text to remove noise and tokenize it for better results.
4. Storage Optimization:
   - Compress the N-Gram dictionary using tries or prefix trees.
   - Serialize the dictionary for fast loading (e.g., using JSON or binary formats; see the sketch after this list).
5. UI Integration:
   - Show the top 3 predictions in a suggestion row above the keyboard.
   - Allow users to tap on a prediction to auto-complete the word.
6. Language Support:
   - Train separate N-Gram models for each language.
   - Dynamically load the appropriate language model based on the current keyboard layout.
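For the serialization tip (item 4), a binary format can be as simple as a record count followed by flat records; the layout below is just one possible encoding, sketched with `DataOutputStream`/`DataInputStream`:

```kotlin
import java.io.DataInputStream
import java.io.DataOutputStream

// One possible compact encoding for a trigram table: a record count
// followed by flat (w1, w2, next, count) records. Loading it back is a
// single sequential read, much faster than parsing JSON at startup.
fun writeTrigrams(out: DataOutputStream, counts: Map<Pair<String, String>, Map<String, Int>>) {
    out.writeInt(counts.values.sumOf { it.size })
    for ((context, nexts) in counts)
        for ((next, count) in nexts) {
            out.writeUTF(context.first)
            out.writeUTF(context.second)
            out.writeUTF(next)
            out.writeInt(count)
        }
}

fun readTrigrams(input: DataInputStream): Map<Pair<String, String>, Map<String, Int>> {
    val result = HashMap<Pair<String, String>, HashMap<String, Int>>()
    repeat(input.readInt()) {
        val context = input.readUTF() to input.readUTF()
        val next = input.readUTF()
        result.getOrPut(context) { HashMap() }[next] = input.readInt()
    }
    return result
}
```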
Conclusion
If you prioritize simplicity, speed, and deterministic behavior, N-Gram with Smoothing is the best solution for next-word prediction in the Unexpected Keyboard. For more advanced features or personalization, you can extend this with user dictionaries or consider hybrid approaches.
Q: How can google's keyboard "Gboard" implement "next word prediction" in offline mode effectively?
[gpt-4o-2024-11-20] ----------------------------------------------------------------------------
Google's Gboard is a highly advanced keyboard that provides next word prediction in both online and offline modes. While the exact implementation details are proprietary, we can infer its offline mode capabilities based on known technologies and techniques in Natural Language Processing (NLP) and mobile optimization. Below is an analysis of how Gboard likely implements next-word prediction effectively in offline mode:
Key Components of Gboard's Offline Next-Word Prediction
1. Compact Language Models
Gboard likely uses compact models that balance prediction accuracy and resource efficiency:
- N-Gram Models with Smoothing:
- Gboard may store precomputed N-Grams (e.g., bigrams or trigrams) with frequencies or probabilities for common phrases.
- Smoothing techniques like Stupid Backoff or Kneser-Ney Smoothing handle rare or unseen combinations.
- Neural Language Models (Distilled):
- Advanced offline predictions use distilled neural models, such as LSTM or Transformer-based models (e.g., TinyBERT, DistilBERT, or MobileBERT).
- These models are trained on large corpora and then distilled to smaller versions that can run efficiently on mobile devices using frameworks like TensorFlow Lite.
2. Efficient Data Structures
To ensure fast lookups and minimal memory usage, Gboard likely leverages optimized data structures:
- Trie or Prefix Tree:
- Stores N-Gram probabilities compactly.
- Enables efficient prediction by matching prefixes of the previous typed words.
- Finite State Transducers (FSTs):
- Used for both word prediction and correction.
- FSTs allow efficient storage and traversal of large dictionaries, making it possible to store millions of word sequences in a small memory footprint.
3. Personalization
Gboard dynamically learns user behavior to improve offline predictions:
- On-Device Training:
- Tracks the user's typing habits locally (e.g., frequently used phrases, slang, or names).
- Updates a user-specific N-Gram or neural model stored on the device while respecting privacy.
- User Dictionaries:
- Custom words and phrases (e.g., names, locations) are added to a local dictionary that integrates with the prediction system.
4. Context Awareness
Gboard uses context to improve next-word predictions:
- POS Tagging and Language Modeling:
- Combines word predictions with Part-of-Speech (POS) tagging to ensure grammatically correct suggestions.
- Example: After "I am", the next word is likely a verb ("going," "happy") rather than a noun.
- Semantic Context:
- Neural models (e.g., LSTMs or Transformers) capture context across longer word sequences for better predictions.
- Example: After "I have a", it predicts "dream" or "dog" based on semantic understanding.
5. Multilingual Support
Gboard supports multiple languages seamlessly:
- Language-Specific Models:
- Pre-trained N-Gram or neural models for each language are stored on the device.
- Dynamic Language Switching:
- Automatically detects the language being typed and switches to the appropriate model without user intervention.
6. Localized Dictionaries
Gboard uses large, pre-compiled dictionaries tailored to specific languages and regions:
- Dictionaries Compressed with FSTs:
- Offline dictionaries include common words, phrases, and idiomatic expressions.
- For example, "How are you" in English or "¿Cómo estás?" in Spanish.
7. Hardware Optimization
Offline predictions are optimized for mobile hardware to ensure smooth performance:
- Quantized Neural Models:
- Gboard uses quantized versions of neural models (e.g., 8-bit weights) to reduce computational overhead while maintaining accuracy.
- Inference Frameworks:
- Frameworks like TensorFlow Lite or ONNX Runtime efficiently run predictions on low-power CPUs or NPUs (Neural Processing Units) available on modern smartphones.
8. Caching and Prediction Pruning
To improve efficiency, Gboard employs caching and pruning strategies:
- Prediction Caching:
- Frequently used predictions are cached for instant retrieval.
- Example: After typing "Good", "morning" is cached as the top suggestion.
- Pruning Unlikely Predictions:
- Filters out low-probability predictions to reduce computational effort and avoid cluttering the suggestion bar.
9. Handling Rare Words
Gboard likely uses fallback mechanisms for rare or unknown words:
- Subword Tokenization (e.g., Byte Pair Encoding - BPE; see the sketch after this list):
- Breaks down rare or unknown words into smaller units (e.g., "unbelievable" → "un", "believe", "able").
- Allows the model to predict parts of words instead of failing entirely.
- Edit Distance Matching:
- Matches user input to similar words in the dictionary using algorithms like Levenshtein Distance.
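The subword idea can be illustrated with greedy longest-match splitting against a subword vocabulary (a WordPiece-style lookup rather than true BPE merge rules; the toy vocabulary and function name are made up for the example):

```kotlin
// Greedy longest-match subword splitting against a toy vocabulary. Real
// keyboards would ship a learned BPE/WordPiece vocabulary; this only
// illustrates how unknown words decompose into known pieces.
fun subwordSplit(word: String, vocab: Set<String>): List<String> {
    val pieces = ArrayList<String>()
    var start = 0
    while (start < word.length) {
        // Find the longest vocabulary piece starting at `start`.
        var end = word.length
        while (end > start && word.substring(start, end) !in vocab) end--
        if (end == start) return listOf(word)  // no piece matches: keep the word whole
        pieces.add(word.substring(start, end))
        start = end
    }
    return pieces
}

fun main() {
    val vocab = setOf("un", "believ", "able")
    println(subwordSplit("unbelievable", vocab))  // [un, believ, able]
}
```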
How it Works in Practice
- User Input:
- User types a sequence of words, e.g., "I am going to".
- Context Analysis:
- The model analyzes the context of the preceding words.
- If using an N-Gram model, it looks for matches like "going to [X]".
- If using a neural model, it encodes the input context into a vector representation.
- Prediction Generation:
- The model queries the dictionary or neural network to generate the top predictions.
- Example: "school," "work," "the store."
- Ranking Predictions:
- Predictions are ranked based on:
  - Probability (from N-Grams or neural models).
  - Relevance to user behavior (personalization).
  - Grammar and POS tagging.
- Display to User:
- The top 3-5 predictions are displayed in the suggestion bar.
Benefits of Gboard's Approach
- Accuracy:
- Combines deterministic methods (N-Grams, FSTs) with contextual understanding (neural models) for high-quality predictions.
- Performance:
- Optimization techniques (quantization, caching) ensure predictions are fast and responsive.
- Personalization:
- Learns user-specific patterns for better predictions over time.
- Privacy:
- Offline predictions respect user privacy, with no data sent to external servers.
Challenges and Solutions
1. Balancing Accuracy and Resource Usage
- Challenge: Limited computational power and storage on mobile devices.
- Solution: Use compressed data structures (e.g., tries, FSTs) and quantized neural models.
2. Handling Multilingual Input
- Challenge: Supporting multiple languages without bloating storage.
- Solution: Store language-specific models and dynamically load them as needed.
3. Personalization Without Privacy Concerns
- Challenge: Collecting user data without compromising privacy.
- Solution: Perform all personalization and model updates locally.
How to Implement Similar Offline Prediction
If you're building a lightweight keyboard like Gboard with offline next-word prediction, here’s a roadmap:
1. Start with an N-Gram Model:
   - Use bigrams or trigrams with smoothing (e.g., Stupid Backoff).
   - Store the N-Grams in a trie or FST for fast lookups.
2. Add Personalization:
   - Track frequently used words/phrases and add them to a local dictionary.
3. Optimize for Mobile:
   - Compress dictionaries and use quantized models.
   - Leverage frameworks like TensorFlow Lite for efficient inference.
4. Implement Context Awareness:
   - Enhance predictions with POS tagging and semantic analysis.
5. Iterate and Improve:
   - Add neural models (e.g., LSTMs, Transformers) for more advanced predictions.
   - Continuously optimize for performance and storage.
Conclusion
In offline mode, Gboard effectively combines compact language models, optimized data structures, and on-device personalization to deliver accurate and responsive next-word predictions. While advanced neural models provide contextual understanding, classic N-Gram techniques with smoothing are still at the core of its efficiency. For a lightweight keyboard like Unexpected Keyboard, starting with an optimized N-Gram approach and gradually incorporating neural models is a practical path forward.
[gemini-2.0-flash-exp] ----------------------------------------------------------------------------
That's a great question! Gboard's offline next-word prediction is a complex feature, but it relies on a combination of techniques to achieve effectiveness within the constraints of a mobile device. Here's a breakdown of how it likely works:
Key Techniques Employed by Gboard:
1. Compressed N-gram Models with Smoothing:
   - Core Component: Like the previously discussed approach, Gboard most likely uses n-gram models as its foundation. These models store probabilities of word sequences.
   - Compression: The dictionaries need to be highly compressed to fit in the limited space of a mobile device. They likely use techniques like:
     - Quantization: Reducing the precision of stored probabilities.
     - Pruning: Removing less frequent n-grams.
     - Data Structure Optimization: Efficient data structures for fast lookups.
   - Smoothing: As discussed before, smoothing (likely Kneser-Ney smoothing or similar) is absolutely crucial for handling unseen word combinations.
2. Local Recurrent Neural Networks (RNNs) or Transformers:
   - Supplemental Layer: While n-grams form the core, Gboard likely uses smaller, highly optimized neural networks (RNNs or simplified Transformers) to provide additional contextual understanding. These models are also compressed to minimize their size.
   - Contextual Refinement: These models are trained to refine the top n-gram predictions, especially in complex sentences or phrases.
   - Personalization: RNNs and Transformers can be trained to adapt to the user's typing style and vocabulary (but in an offline way, without sending any data to Google servers).
   - Handling Long-Range Dependencies: While n-grams are good for local context, neural models can better handle longer-range dependencies in sentences.
3. Mobile-Optimized Model Architectures:
   - Lightweight Design: Gboard uses highly optimized and lightweight model architectures designed for mobile CPU/GPU execution, not full-fledged server-grade models.
   - Quantized Models: Neural network weights and activations are quantized to reduce model size and computational cost.
   - Efficient Implementations: The implementations use optimized code libraries for mobile platforms (e.g., TensorFlow Lite).
4. On-Device Training:
   - Local Adaptation: Gboard uses on-device learning to adapt to the user's vocabulary and frequently used phrases. This improves prediction accuracy over time.
   - No Central Data Upload: This training is done offline on the user's device, with no data being sent to Google servers.
5. Caching:
   - Frequently Used Predictions: Predictions for frequently used words and phrases are cached for even faster access.
   - Adaptive Cache: The cache is updated based on usage patterns.
6. Language-Specific Models:
   - Different Models per Language: Gboard has separate models for each language, as language statistics and grammar vary a lot. This avoids trying to apply English grammar rules to other languages.
   - Specialized Models: Some languages might also have more specialized models to handle specific features like compound words.
7. User Input Context:
   - Current Input: The model considers the current word being typed and the previous words to predict the next one.
   - Other Input Context: The model might look at other data provided in the input field when possible (such as the type of input field) to better provide context for predictions.
How These Techniques Enable Offline Effectiveness:
- Small Model Size: Compression techniques keep the models small enough to fit on a mobile device without excessive storage demands.
- Fast Execution: Optimized model architectures and efficient implementations ensure that predictions are generated quickly without causing lag.
- Personalization: On-device training and caching make the model more accurate over time.
- Contextual Awareness: Using neural models in addition to n-grams provides the necessary contextual awareness to make better suggestions than a simple n-gram based approach.
- Language Support: Using language specific models ensures high quality predictions for a very wide range of languages.
Key Differences from Server-Based Prediction:
- Model Complexity: Offline models are smaller and less complex than server-based models due to resource constraints.
- Training Data: Offline models are trained using publicly available data (for each supported language) and are further refined on the device itself by the user's typing habits, whereas server-based prediction can use a much larger database that also includes all the data typed by users that agreed to share their data with the company.
- Hardware Limitations: Mobile devices have far less computational power than Google's servers, and require much more optimized approaches.
In Summary
Gboard's offline next-word prediction is effective because it combines:
- Compressed and optimized N-gram models with advanced smoothing techniques as the core of the system
- Smaller, mobile-optimized neural networks (RNNs/Transformers) for context and personalization
- On-device learning for personalization
- Efficient storage and caching techniques
These techniques allow for a smooth and personalized typing experience even when there is no internet connection.
Keep in mind that Google is constantly working on improving the models. So, new techniques may be added in the future. Also, these details are based on public information and what can be inferred from how these types of models typically operate.
If I can make a suggestion... One way to break this into smaller tasks would be to start with the simplest implementation that delivers usable results and then add more features as independent updates. For example, predictive/corrective suggestions based only on the words in the current document would already save piles of typing time, without having to build the database infrastructure, multiple language support, personal dictionaries, etc.
Hi, I am a rookie scripter just starting out, and I'm here to say I really like using Unexpected Keyboard as my daily driver. Unfortunately, the lack of spell check is forcing me to use another keyboard. I tire of having to edit my messages due to spelling errors.
I personally do not like autocorrect (suggestions are fine with me though), but at the very least, I need to see when there are errors in spelling. Grammar and punctuation suggestions would be cool. For emphasis: the bare minimum needed for me, and I'd assume for most users, is spell check.
To my understanding, Unexpected Keyboard was developed for coding, so it makes sense to me that spell check may be undesirable in development apps. Perhaps a key could be added to the keyboard, or a button in the toolbar, to toggle spell check to ensure this does not become a problem. On the other hand, there is an opportunity here to make linting and syntax checking available via the keyboard. That could be so powerful in contexts where our text editor or webapp does not offer such features. Probably a big enough project to warrant its own repo though, and it sounds like an MVP is more realistic for the primary dev on this repo.