[Docs]: General LLM Guide
Current State of Documentation
As mentioned in a previous journal, I believe it would be beneficial to have a guide for those wishing to start LLM development! Below is a general list of journals I could create. I'm willing to write all of them from scratch, and most should be done within a month and a half at the very most (not counting the Advanced Usages, which might take a bit longer because of the data needed for two of them)!
These also use a few concepts and prototypes from a program I made called Project Replicant (such as Engels, or understanding 3D space using a CAD-like database), so I hope that's alright!
I'd also like to know if there's a specific API you'd like me to use. I'd prefer something like Hugging Face (which offers a free tier)! I suggest sticking to one API, as bouncing between different APIs early on might make it harder to understand exactly what's being done and why!
Suggested Improvement
My idea for a guide goes as follows:
LLM Fundamentals and Advanced AgentOps Implementation
Basic Usages
- Text Generation
- Finishing a sentence or creating a paragraph based on a prompt
- Generating a short story based on a Nier Style Sentence
- Finishing the second half of a sentence based on the emotion a user wants to convey
- Classifying Data Using an LLM
- Based on preset categories
- Classifying sentences as positive, neutral, or negative in sentiment
- Inferring which category an item may belong to based on its details
- Summarizing Information
- Basic summarization for now (Engels at advanced levels)
- Summarizing general articles about multiple topics
- Summarizing conversation and keeping the most important details (People, places, and things + names and dates)
- Adding Context to History
- Adding context to our history based on a prompt (custom history at advanced levels)
- Having an AI finish a task and adding to history before asking a question that takes the previous context into account
- Using a Local Search Engine System
- Giving context and taking input (Challenge for basic level)
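To give a rough idea of what the "Adding Context to History" page could walk through, here is a minimal sketch. It assumes a simple role/content message shape and a plain-text prompt layout; the names `add_to_history` and `render_prompt` are hypothetical and not tied to any specific API, and the actual model call is left out on purpose.

```python
from typing import Dict, List

# Assumed message shape for illustration: {"role": ..., "content": ...}
Message = Dict[str, str]


def add_to_history(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": content}]


def render_prompt(history: List[Message], question: str) -> str:
    """Flatten earlier turns plus the new question into one prompt string."""
    lines = [f"{m['role']}: {m['content']}" for m in history]
    lines.append(f"user: {question}")
    lines.append("assistant:")  # leave the final turn open for the model
    return "\n".join(lines)


# The model finishes a task first; its answer is stored in history.
history: List[Message] = []
history = add_to_history(history, "user", "List three fruits.")
history = add_to_history(history, "assistant", "Apple, banana, cherry.")

# The follow-up question only makes sense with the earlier context included.
prompt = render_prompt(history, "Which of those is red?")
print(prompt)
```

The resulting `prompt` string is what would be sent to whichever text-generation API the guide ends up using.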
Intermediate Usages
- Developing a Chatbot
- First with single user, then with multi-user, then with multi-chatbot and multi-user
- Taking one user input (standard)
- Formatting the inputs to give context to who a user is (with an introduction prompt)
- Simulating multiple chatbots conversing at the same time to different users before going back to talk to one
- Fine-Tuning Chatbots
- Fine-tuning chatbot for better answers using a simple CSV sheet
- Changing the tone an LLM responds in with CSV data
- Giving a chatbot more context through a CSV sheet with Q/A
- Dataset Creation
- Yes/No-based, then text-to-text-based, then complete generation from scratch
- "Is this a _?"
- Turning a description into a list of questions and answers
- Generating complete text from scratch (A few ideas here)
- Grouping Outputs
- Grouping outputs into premade categories (generating context then packaging it)
- Taking the output from generated text and using tools to help sort it
- Sorting information from a conversation into a specially made database (Challenge)
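As an illustration of the "simple CSV sheet with Q/A" idea in the Fine-Tuning Chatbots item, the sketch below turns a Q/A sheet into prompt/completion records. The column names (`question`, `answer`) and the output keys (`prompt`, `completion`) are assumptions chosen for this example; a real fine-tuning service may expect a different record format.

```python
import csv
import io
import json
from typing import Dict, List


def csv_to_records(csv_text: str) -> List[Dict[str, str]]:
    """Parse Q/A rows and keep only rows where both fields are filled in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        question = (row.get("question") or "").strip()
        answer = (row.get("answer") or "").strip()
        if question and answer:
            records.append({"prompt": question, "completion": answer})
    return records


# Small in-memory sheet; the second data row has no answer and is skipped.
sample = "question,answer\nWhat color is the sky?,Blue\nEmpty row,\n"
records = csv_to_records(sample)

# One JSON object per line (JSONL), a common layout for training data.
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

Skipping incomplete rows up front keeps half-filled lines in the sheet from ending up in the training data.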
Advanced Usages
- RAG-Based Information Searching
- API-based, Google Search-based, Multi-database
- Reflecting on the date/time with a free API
- Using the Google search snippets to get information
- Using multiple CSV files as context for an LLM
- Email-Based Assistant
- Using LLM to create emails
- Finding a certain type of data (CTO) and generating custom messages for each
- Determining safety risks based on LLM + API search
- Stylized Text Generation
- Documentation, DnD campaign, etc.
- Formatting Conversations
- Formatting conversations into a specific JSON format for recalling later
- Taking notes and formatting them into a more professional state
- Research Studies
- Converting chat history to shortened text and using as context for longer chatbot context with less worry about tokens
- Engels, an AI summary language I developed (Showing how to create a dataset and implement it)
- LLM-Run Town
- Creating an LLM-run town, visualizing it in Unity, and using it to train around different goals
- Goals such as trying to get LLMs to speak to each other as often as possible, remembering context from long ago, or keeping conversation minimal for a DB
- Interacting with 3D Space
- Having an LLM interact with 3D space based on semantic + CAD-like data and Unity AR/VR
- (This data is already being created by a friend and me using a 3D rooms generator before being moved to 3D)
- Teaching LLM Rulesets
- Rulesets for games and long-term rulings (such as Chess and Checkers, also stopping the AI from sharing its context through anti-examples)
- Custom AgentOps Implementations
- Creating custom implementations for AgentOps (I have been testing this out in relation to Gemini; I believe my mistake wasn't in the code itself but rather in mixing up the output delta block with another term. Still, to be safe, I plan on restarting)
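For the "Formatting Conversations" item above, a minimal sketch of storing a conversation as JSON and recalling it later could look like the following. The schema (`session_id`, `turns`, `index`, `speaker`, `text`) and the function names are assumptions made up for this example, not a fixed format.

```python
import json
from typing import List, Tuple


def conversation_to_json(turns: List[Tuple[str, str]], session_id: str) -> str:
    """Serialize (speaker, text) pairs into an assumed JSON layout."""
    payload = {
        "session_id": session_id,
        "turns": [
            {"index": i, "speaker": speaker, "text": text}
            for i, (speaker, text) in enumerate(turns)
        ],
    }
    return json.dumps(payload, indent=2)


def recall(json_text: str, speaker: str) -> List[str]:
    """Return every line a given speaker said, in original order."""
    data = json.loads(json_text)
    return [t["text"] for t in data["turns"] if t["speaker"] == speaker]


turns = [
    ("Alice", "Meet at noon on Friday."),
    ("Bot", "Noted."),
    ("Alice", "Bring the report."),
]
saved = conversation_to_json(turns, "demo-1")
print(recall(saved, "Alice"))
```

Keeping an explicit `index` on each turn means the original order survives even if the stored turns are later filtered or merged.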
Affected Documentation Pages
No response
Additional Context
No response
Contribution
- [x] Yes, I'd be happy to submit a pull request with these changes.
- [ ] I need some guidance on how to contribute.
- [ ] I'd prefer the AgentOps team to handle this update.
I'm adding a general section at the start that talks more about LLM logic and AgentOps; I was talking to a few people about the tool and gauged a few problems they had while trying to learn how to use it.
I should be sharing a draft Markdown file for the first and second pages within 2 days at most!