David Thrower

Results: 107 issues by David Thrower

Add the Jupyter notebook of the workflow after merging #277, and test that it works from the main branch. [train_an_llm_with_cerebros(3).ipynb](https://github.com/user-attachments/files/23657223/train_an_llm_with_cerebros.3.ipynb)

kind/documentation
triage/unresolved-predecessors
audience/technical
triage/intermediate-priority
kind/demo

## TLDR

Branch 275 is functional and would be ready to merge; however, Stage I-b is getting a less desirable perplexity score.

# Tasks

## From #275 try these...

kind/enhancement
status/ready-pending-tests
triage/high-priority
kind/validation
audience/technical
kind/text-generative-ai

# TLDR

The root directory and other directories in this repo need to be de-cluttered. This may be confusing for users and could be down-rating the repo in the...

# Branch for demonstration of CICD workflows only, **do not merge**

- A client desires a demo of the text generation workflow
- This will trigger the workflow for demo...

no-merge/demo-of-workflows

# From branch #270

## Complete the information on the generated text samples being logged by the LLM Phase-I-a and Phase-I-b training scripts:

- [ ] Add to logged text...

triage/good first issue
triage/high-priority
kind/validation
kind/text-generative-ai

# From #269

## Modification of #269 to run this on the server with minimal scale-up (maybe 100 Phase I-a samples and 300 Phase I-b samples ... )

# Take #267 and run an HPO study otherwise identical to #268, however with the Phase I-a model recompiled to reset the optimizer before running Phase I-b training.

##...
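For context, the effect of recompiling to reset the optimizer can be illustrated with a toy stand-in (a minimal sketch only; Cerebros uses Keras optimizers, and `ToySGD`/`step_count` are invented names standing in for momentum buffers and Adam moments):

```python
class ToySGD:
    """Stand-in for a real optimizer that accumulates internal state."""
    def __init__(self, lr=0.01):
        self.lr = lr
        self.step_count = 0  # proxy for momentum buffers / Adam moment estimates

    def step(self):
        self.step_count += 1

# Phase I-a training accumulates optimizer state:
optimizer = ToySGD()
for _ in range(100):
    optimizer.step()
print(optimizer.step_count)  # 100

# Recompiling the model swaps in a fresh optimizer instance, discarding that
# accumulated state before Phase I-b training begins:
optimizer = ToySGD()
print(optimizer.step_count)  # 0
```

The point of the experiment is whether carrying Phase I-a optimizer state into Phase I-b helps or hurts the perplexity score.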

# Settings to test #267 on the server

From #266, merge target: #266

```python
class SampleExpansionGenerator:
    def __init__(self, raw_text_samples, tokenizer,
                 sample_expansion_batch_size=50, model_batch_size=10,
                 prompt_length_0=PROMPT_LENGTH, max_seq_length=MAX_SEQ_LENGTH,
                 vocabulary_size=VOCABULARY_SIZE):
        self.raw_text_samples = raw_text_samples
        self.tokenizer = tokenizer
        self.sample_expansion_batch_size = sample_expansion_batch_size
        self.model_batch_size = model_batch_size
        ...
```
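The sample-expansion batching such a generator performs can be sketched independently (an illustrative sketch only; `iter_batches` is a hypothetical helper, not code from the branch):

```python
def iter_batches(samples, batch_size=50):
    """Yield successive fixed-size batches from a list of raw text samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# 120 samples with batch_size=50 yields batches of 50, 50, and 20:
raw_text_samples = [f"sample {i}" for i in range(120)]
batches = list(iter_batches(raw_text_samples, batch_size=50))
print([len(b) for b in batches])  # [50, 50, 20]
```

The real generator presumably does this per `sample_expansion_batch_size` chunk before tokenizing, while `model_batch_size` governs the batches actually fed to the model.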

status/ready-pending-tests

Move LLM components to a package to make model serialization more practical.

# From #260
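The motivation can be demonstrated with plain `pickle` (a sketch under assumptions; `TrainedComponent` is an invented name): Python serializers record a class by its qualified module path, so classes defined only in a notebook or script (`__main__`) cannot be reloaded in another process, while classes in an importable package can.

```python
import pickle

class TrainedComponent:
    """Toy stand-in for an LLM component class."""
    def __init__(self, value):
        self.value = value

blob = pickle.dumps(TrainedComponent(42))
# pickle stores the qualified name (here "__main__.TrainedComponent"); a
# separate process can only resolve that name if the class lives in an
# importable package, hence the proposed refactor.
restored = pickle.loads(blob)
print(restored.value)  # 42
```

Keras model saving has the same constraint: custom layers and objects must be importable by name at load time.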