David Thrower
After merging #277, add the Jupyter notebook of the workflow and test that it works from the main branch. [train_an_llm_with_cerebros(3).ipynb](https://github.com/user-attachments/files/23657223/train_an_llm_with_cerebros.3.ipynb)
## TLDR Branch 275 is functional and would be ready to merge; however, Stage I-b is getting a less desirable perplexity score. # Tasks ## From #275 try these...
# TLDR The root directory and other directories in this repo need to be de-cluttered. This may be confusing for users and could be down-rating the repo in the...
# Branch for demonstration of CI/CD workflows only, **do not merge** - A client desires a demo of the text generation workflow - This will trigger the workflow for demo...
# From branch #270 ## Complete the information on the generated text samples being logged on the LLM Phase-I-a and Phase-I-b training script: - [ ] Add to logged text...
# From #269 ## Modification of #269 to run this on the server with minimal scale up (maybe 100 phase I-a samples and 300 phase I-b samples ... )
# Take #267 and run an HPO study otherwise identical to #268, but with the Phase I-a model recompiled to reset the optimizer before running Phase I-b training. ##...
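The motivation for recompiling between phases can be illustrated with a toy momentum optimizer (a pure-Python sketch, not the actual Keras recompile call; the class, names, and values below are hypothetical). The point is that a stateful optimizer carries accumulated state from Phase I-a into Phase I-b unless it is reset:

```python
class ToyMomentumSGD:
    """Hypothetical toy optimizer: carries velocity (state) across steps,
    analogous to momentum/moment accumulators in a real Keras optimizer."""

    def __init__(self, lr=0.1, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.velocity = 0.0  # accumulated state across training steps

    def step(self, grad):
        # Standard momentum update: velocity blends past updates with the new gradient.
        self.velocity = self.momentum * self.velocity - self.lr * grad
        return self.velocity

    def reset(self):
        # Analogous to recompiling the model before Phase I-b:
        # stale Phase I-a accumulator state is discarded.
        self.velocity = 0.0


opt = ToyMomentumSGD()
opt.step(1.0)   # Phase I-a training accumulates velocity
opt.reset()     # recompile-equivalent: start Phase I-b with a clean optimizer
```

In Keras specifically, calling `model.compile(...)` again with a fresh optimizer instance discards the old optimizer's slot variables, which is the effect the HPO variant above is testing.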
# Settings to test #267 on server
From #266, merge target: #266

```python
class SampleExpansionGenerator:
    def __init__(self, raw_text_samples, tokenizer,
                 sample_expansion_batch_size=50,
                 model_batch_size=10,
                 prompt_length_0=PROMPT_LENGTH,
                 max_seq_length=MAX_SEQ_LENGTH,
                 vocabulary_size=VOCABULARY_SIZE):
        self.raw_text_samples = raw_text_samples
        self.tokenizer = tokenizer
        self.sample_expansion_batch_size = sample_expansion_batch_size
        self.model_batch_size = model_batch_size
        ...
```
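The snippet above is truncated, but the constructor's batch-size parameters suggest the generator yields raw samples in fixed-size chunks for expansion. A minimal, self-contained sketch of that batching logic (hypothetical, not the actual #266 implementation; the tokenizer and model parameters are omitted):

```python
class BatchedSampleSource:
    """Hypothetical sketch: yields raw text samples in fixed-size batches,
    mirroring the sample_expansion_batch_size parameter above."""

    def __init__(self, raw_text_samples, sample_expansion_batch_size=50):
        self.raw_text_samples = raw_text_samples
        self.sample_expansion_batch_size = sample_expansion_batch_size

    def __iter__(self):
        # Yield successive slices of at most sample_expansion_batch_size samples;
        # the final batch may be smaller.
        step = self.sample_expansion_batch_size
        for i in range(0, len(self.raw_text_samples), step):
            yield self.raw_text_samples[i:i + step]


# Usage: 120 samples with a batch size of 50 yields batches of 50, 50, and 20.
source = BatchedSampleSource([f"sample {i}" for i in range(120)],
                             sample_expansion_batch_size=50)
batches = list(source)
```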
Move LLM components to a package to make model serialization more practical. # From #260