
Train Text from scratch

xaedes opened this issue · 15 comments

I improved the training process (https://github.com/ggerganov/llama.cpp/pull/1360) by several orders of magnitude, replacing naive backward passes with dedicated operations. Training can now also use flash attention to support bigger context sizes.

There is a self-contained example that allows training small llama models from scratch. The vocabulary used is loaded from a source llama model. To be able to resume training from previous runs, a training checkpoint file stores the optimizer context and model weights. After training, the loaded checkpoint can be exported as a llama-compatible model file in F32 format.

List of all new operations:

  • GGML_OP_REPEAT_BACK : Gets rid of 10 operations per repeat backward pass.
  • GGML_OP_OUT_PROD : Similar to MAT_MUL, but the second dimensions must match instead of the first. Internally uses vec_mad instead of the vec_dot used by matmul.
  • GGML_OP_SOFT_MAX_BACK : Avoids big intermediate matrices, reducing runtime from quadratic to linear and eliminating the quadratic memory overhead entirely.
  • GGML_OP_FLASH_ATTN_BACK : Flash attention combines softmax(K*Q)*V into one operation. Using this with the new backward pass saves a lot of operations and memory overhead during training.
  • GGML_OP_CROSS_ENTROPY_LOSS : Combines -sum(log(softmax(a))*b) into one dedicated operation. Since the vocabulary is quite big (32k), the loss function deals with big matrices; as a dedicated operation it saves a lot of memory overhead.
  • GGML_OP_CROSS_ENTROPY_LOSS_BACK : Backward pass of cross entropy loss.
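To illustrate why fusing the loss helps (this is a minimal Python sketch of the math, not ggml's actual implementation): the whole -sum(log(softmax(a))*b) expression can be evaluated straight from the logits, using the numerically stable log-softmax, without ever materializing the softmax vector:

```python
import math

def cross_entropy_loss(logits, target):
    """-sum(log(softmax(logits)) * target), evaluated with a
    numerically stable log-softmax (subtract the max logit) and
    without materializing the softmax vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, target))

# with a one-hot target the loss reduces to -log(softmax(logits)[k])
loss = cross_entropy_loss([2.0, 1.0, 0.1], [1.0, 0.0, 0.0])
```

Applied row-wise over an n_tokens × 32k-vocab logit matrix, doing this in one fused op is roughly where the memory savings come from.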

Changes to optimizers:

AdamW was easy to implement by modifying Adam: when the weight-decay parameter is zero, it mathematically simplifies to the regular Adam optimizer. So instead of implementing a whole new optimizer, AdamW weight decay was implemented directly in Adam.

Adam tracks statistics about the gradients, which are very important for training. It was necessary to persist the optimizer state between ggml_opt calls. For this I added struct ggml_opt_context, ggml_opt_init() and ggml_opt_resume(). The regular ggml_opt() call now internally creates a new opt context, initializes it, and then uses resume to optimize with this fresh context. This moves all the memory allocation done by the optimizers into ggml_opt_init.
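A rough Python sketch of the shape of this API (the names below are hypothetical stand-ins, not ggml's actual signatures): the context owns the Adam statistics, resume steps with saved state, and the convenience call just creates a fresh context and resumes with it. The decoupled weight-decay term also shows how AdamW collapses to Adam when the decay parameter is zero:

```python
class OptContext:
    """Hypothetical stand-in for struct ggml_opt_context: owns the
    optimizer state (Adam moment estimates, step counter) so the
    gradient statistics survive between optimization calls."""
    def __init__(self, n_params):
        self.m = [0.0] * n_params   # first-moment estimates
        self.v = [0.0] * n_params   # second-moment estimates
        self.t = 0                  # step counter

def opt_resume(ctx, params, grads, lr=0.01,
               beta1=0.9, beta2=0.999, eps=1e-8, wd=0.0):
    """One Adam step continuing from the state saved in ctx.
    The decoupled weight-decay term (wd) is the AdamW part:
    with wd == 0 this is exactly the regular Adam update."""
    ctx.t += 1
    for i, g in enumerate(grads):
        ctx.m[i] = beta1 * ctx.m[i] + (1 - beta1) * g
        ctx.v[i] = beta2 * ctx.v[i] + (1 - beta2) * g * g
        m_hat = ctx.m[i] / (1 - beta1 ** ctx.t)
        v_hat = ctx.v[i] / (1 - beta2 ** ctx.t)
        params[i] -= lr * (m_hat / (v_hat ** 0.5 + eps) + wd * params[i])
    return params

def opt(params, grads):
    """Convenience entry point: fresh context, then resume with it."""
    return opt_resume(OptContext(len(params)), params, grads)
```

Separating allocation (the context) from stepping (resume) is what lets training checkpoints carry the optimizer statistics across runs.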

I investigated the use of flash_ff, but it seems to implement a different feedforward than the one used in llama. Having a dedicated operation for llama's feedforward could also save a lot of memory overhead, allowing bigger n_embd and n_ff. Might be worth looking into.

There is still a lot of unnecessary memory overhead. In the llama eval function, similar overhead is avoided using scratch buffers. It would be good to find a way to do something similar for the (automatically generated) backward pass.

Other noteworthy changes:

  • added llama api function to get vocabulary data from llama context
  • added a fix to llama_model_load_internal so models with n_layer<32 are recognized as MODEL_7B, allowing us to load small self-trained models - they are probably smaller than 7B.
  • bugfix in llama_sample_token_mirostat_v2 for when all candidates are filtered out, which can happen with freshly trained models

Also see:

  • https://github.com/ggerganov/ggml/issues/8#issuecomment-1518465663
  • https://github.com/ggerganov/llama.cpp/pull/1360

xaedes avatar May 30 '23 16:05 xaedes

We may need the new Sophia Optimizer for a 2X increase in training speed compared to Adam.

klosax avatar May 30 '23 19:05 klosax

This is amazing!

JFYI, giving it a shot locally, compiling on Linux fails with:

/home/mudler/_git/llama.cpp-train/examples/train-text-from-scratch/train-text-from-scratch.cpp:1655:35: error: use of ‘auto’ in lambda parameter declaration only available with ‘-std=c++14’ or ‘-std=gnu++14’
 1655 |     std::sort(begin, end, [&vals](auto a, auto b){
      |                                   ^~~~
/home/mudler/_git/llama.cpp-train/examples/train-text-from-scratch/train-text-from-scratch.cpp:1655:43: error: use of ‘auto’ in lambda parameter declaration only available with ‘-std=c++14’ or ‘-std=gnu++14’
 1655 |     std::sort(begin, end, [&vals](auto a, auto b){
      |                                           ^~~~

mudler avatar May 30 '23 21:05 mudler

JFYI, giving it a shot locally, compiling on Linux fails with:

Oh, thanks for pointing that out. Pushed a fix for this to the branch, github seems a bit sluggish today, doesn't show up yet.

https://github.com/ggerganov/llama.cpp/commit/7f172c1070d514e450e002e430957773093572ba

xaedes avatar May 30 '23 23:05 xaedes

I needed to add #include <climits> to train-text-from-scratch.cpp due to:

/tmp/mount/train-git/llama.cpp/examples/train-text-from-scratch/train-text-from-scratch.cpp:1512:37: error: ‘INT_MAX’ was not declared in this scope
 1512 |     GGML_ASSERT(size >= 0 && size < INT_MAX);

on my system, but then it compiles and runs:

root@1ccc33b2ee49:/tmp/mount/train-git/llama.cpp/build/bin# ./train-text-from-scratch -h
usage: ./train-text-from-scratch [options]

options:
  -h, --help                 show this help message and exit
  --vocab-model FNAME        model path from which to load vocab (default 'ggml-vic7b-uncensored-q4_0.bin')
  --train-data FNAME         path from which to load training data (default 'shakespeare.txt')

Edit: And trains with (mostly) default values:

# wget https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/shakespeare.txt
# ./train-text-from-scratch --vocab-model Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin --mem-compute 12
...
main: opt->params.adam.sched 0.82000
Example 7, opt iter 98
error_before_opt: 57.542198
error_after_opt:  51.464439
used_mem_before_opt: 1836849504 bytes
used_mem_after_opt:  4991577136 bytes
Generating 1024 tokens.

johnson442 avatar May 31 '23 05:05 johnson442

We may need the new Sophia Optimizer for a 2X increase in training speed compared to Adam.

I am trying to use it.

Sovenok-Hacker avatar May 31 '23 11:05 Sovenok-Hacker

Hi @xaedes, thanks for the example! It builds and runs without errors on my pc. I'm trying to follow the default training example. Can you tell if the following files are the correct inputs:

  • https://huggingface.co/TheBloke/Wizard-Vicuna-7B-Uncensored-GGML/tree/main
  • https://github.com/brunoklein99/deep-learning-notes/blob/master/shakespeare.txt

The following image shows the last 'samples after optimization' at the end of the loop; is this the expected output?

[image: samples after optimization]

The text generation itself at the end doesn't look that bad:

[image: generated text]

KASR avatar Jun 01 '23 07:06 KASR

Can you tell if the following files are the correct inputs:

You can use any llama model as input; only its vocabulary is needed. The one you linked is fine. As text input you can use whatever you like. Shakespeare was just an example (nanoGPT also trains on Shakespeare, which is where it comes from). The shakespeare file you linked is good, coherent text material.

Yes, weird looking "best samples" with repetitive tokens like "old" indeed occur during early training.

As you said, the generated samples afterwards are better. Especially when generating using main with the exported model, as main has better sampling parameter defaults.

The first time I encountered the repetitive "best samples", I was also worried that it had gotten stuck in some local minimum, but eventually it overcomes it. "old" just seems to be a frequent token in the training data it saw, so it learned that first. I also often encounter other repetitive tokens like "the", "and", "." or newlines early in training.

Let it train more, and new words and phrases it has learned will occasionally pop up, until the "best samples" start to look better.

xaedes avatar Jun 01 '23 12:06 xaedes

There is still a lot of unnecessary memory overhead. In the llama eval function, similar overhead is avoided using scratch buffers. It would be good to find a way to do something similar for the (automatically generated) backward pass.

Instead of automatically generating the backward pass, I added a function which directly implements the backward pass. This makes it possible to control memory usage with scratch buffers.

It turns out that all tensors used for computing gradients are only temporarily necessary; their memory can be reused after each layer. The forward activations of all layers, however, must still be kept until the corresponding backward operation.

This greatly reduces memory usage, especially allowing more layers.
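The reuse pattern can be sketched on a toy chain of scalar "layers" (an illustration of the memory pattern, not the actual training code): all forward activations are kept, while a single gradient slot is overwritten layer by layer on the way back:

```python
def train_step(weights, x, target, lr=0.1):
    """One step on a chain of scalar 'layers' h[l+1] = w[l] * h[l],
    with loss = (output - target)^2.  All forward activations h are
    kept for the backward pass, but the gradient value d lives in a
    single slot that is overwritten after each layer - the same
    pattern as recycling gradient scratch memory per layer."""
    # forward: remember every activation
    h = [x]
    for w in weights:
        h.append(w * h[-1])
    loss = (h[-1] - target) ** 2
    # backward: one reusable gradient slot
    d = 2.0 * (h[-1] - target)      # d(loss)/d(output)
    for l in reversed(range(len(weights))):
        grad_w = d * h[l]           # needs the saved forward activation
        d = weights[l] * d          # slot reused for the next layer down
        weights[l] -= lr * grad_w
    return weights, loss
```

In the tensor version the "slot" is a per-layer scratch buffer, so gradient memory stays constant in the number of layers while only the activations grow.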

xaedes avatar Jun 01 '23 19:06 xaedes

Very interesting stuff - will be looking into details soon

Instead of automatically generating the backward pass, I added a function which directly implements the backward pass.

Interested in what is the process for creating this function? Is it just an unroll of the generated one or hand-written?

P.S. Probably disable the CI jobs for generating compile-warning comments, as these are not relevant for now and make the PR review difficult. Just cherry-pick this: https://github.com/ggerganov/llama.cpp/pull/1642/commits/98c267fc77fe811082f672538fc91bcfc9072d63

ggerganov avatar Jun 01 '23 21:06 ggerganov

Interested in what is the process for creating this function? Is it just an unroll of the generated one or hand-written?

It is hand-written, guided by what the automatic backward pass would do. The process is pretty straightforward:

First, write out the forward process with a unique name for each tensor, in topological order, e.g. post-order. For each tensor, decide whether it has a gradient by checking whether any input tensor to it has a gradient. I usually write the forward process in comments, adding a column for the gradient of each tensor. (https://github.com/xaedes/llama.cpp/blob/b58d73ca8c5ea1baf42c24db58746b9e763384af/examples/train-text-from-scratch/train-text-from-scratch.cpp#L1611-L1702)

Then do the backward pass: go over the forward-pass operations, but in reverse. Each tensor/operation adds something to the gradients of its input tensors; note this in the gradients column of the comment. If unsure, one can look up exactly what has to be added in ggml_compute_backward. After all forward-pass operations are processed in this manner, we have all necessary gradients noted down, and they can be implemented directly by calling the corresponding ggml operations.

Writing the forward process in one column and the backward-pass gradients in another column right next to it makes it easy to spot tensors that are used more than once, and to decide which tensors are only temporarily necessary.

Noting the tensor shapes helps to make sure everything is correct.

I used the same process to derive the backward pass implementations for softmax, cross entropy, etc.
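A tiny worked instance of this process (an illustrative example, not code from the PR): the forward ops written out in topological order, then each op, visited in reverse, adding its contribution to the gradients of its inputs. Note how dy accumulates from two ops:

```python
def forward_backward(x, y):
    """loss = ((x * y) + y)^2, with the backward pass written by
    walking the forward ops in reverse; each op *adds* its piece
    to the gradients of its inputs."""
    # forward, topological order
    a = x * y
    b = a + y
    loss = b * b
    # backward, reverse order
    db = 2.0 * b        # from loss = b * b
    da = db             # from b = a + y
    dy = db             # ... y also feeds b
    dx = da * y         # from a = x * y
    dy += da * x        # ... y feeds a too: accumulate, don't overwrite
    return loss, dx, dy
```

A finite-difference check against the hand-derived gradients is the same sanity test that keeps the tensor version honest; noting shapes next to each line (trivial here, all scalars) plays the same role.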

xaedes avatar Jun 02 '23 16:06 xaedes

A few more test replications on Linux / Pop!_OS 22.04, AMD Ryzen 7.

I used the same settings as above, and everything went fine.

# wget https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/shakespeare.txt
# ./train-text-from-scratch --vocab-model Wizard-Vicuna-7B-Uncensored.ggmlv3.q5_1.bin --mem-compute 12

Last output was similar, error decay curve seems promising (as noted, generated text seems better as we go).

main: opt->params.adam.sched 0.84000
Example 7, opt iter 100
error_before_opt: 61.881432
error_after_opt:  57.501781
used_mem_before_opt: 842559472 bytes
used_mem_after_opt:  843328080 bytes
Generating 1024 tokens.
 love, thou my love receivest,
I cannot blame thee, for my love thou usest,
But yet be blamed, if thou thy self deceivest
By wilful taste of what thy self refusest.
I do forgive thy robbery gentle thief
Although thou steal thee all my poverty:
And yet love knows it is a greater grief
To bear greater wrong, than hate's known injury.
Lascivious grace, in whom all ill well---

I will check with diff data and iterations later.

BTW Thanks for providing me the super-hero feeling of training on my laptop! :rocket: :superhero: :100:

augustoqm avatar Jun 04 '23 02:06 augustoqm

I will be looking into merging this PR next

ggerganov avatar Jun 06 '23 20:06 ggerganov

Today I noticed the following usage of padding to store the offset:

https://github.com/ggerganov/llama.cpp/blob/2d7bf110edd8c49209401a16132052cba706ffd0/ggml.c#L5896-L5898

Is it possible to replace these usages with opt tensor, as demonstrated here:

https://github.com/ggerganov/llama.cpp/blob/2d7bf110edd8c49209401a16132052cba706ffd0/ggml.c#L5882-L5895

ggerganov avatar Jun 06 '23 20:06 ggerganov

Today I noticed the following usage of padding to store the offset:

Ohh, this comes from my first implementations of the backward passes. Totally forgot to clean this up. Afaik there is another one in ggml_permute, where I store the axes in the padding for the backward pass.
Will push a change soon.

xaedes avatar Jun 06 '23 22:06 xaedes

Is it possible to replace these usages with opt tensor, as demonstrated here

Done.

xaedes avatar Jun 08 '23 00:06 xaedes

Amazing! Really looking towards the merge :+1:

niansa avatar Jun 11 '23 00:06 niansa

I tried training a model by running the following command 2 consecutive times:

./bin/train-text-from-scratch --vocab-model ../models/ggml-vocab.bin --mem-compute 12

After the first run, the model was generating something that resembled text, but after the second run it started outputting just commas or spaces.

Running with thread sanitizer enabled, there seems to be a data race in ggml_compute_forward_flash_attn_back():

$ ▶ ./bin/train-text-from-scratch --vocab-model ../models/ggml-vocab.bin --mem-compute 12

train-text-from-scratch(2367,0x1f6175e00) malloc: nano zone abandoned due to inability to reserve vm space.
llama.cpp: loading model from ../models/ggml-vocab.bin
llama_model_load_internal: format     = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 1 (mostly F16)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
main: tokenize training data
main: number of training tokens: 27584
print_params: n_vocab: 32000
print_params: n_ctx:   128
print_params: n_embd:  256
print_params: n_mult:  256
print_params: n_head:  8
print_params: n_ff:    768
print_params: n_layer: 16
print_params: n_rot:   32
main: number of unique tokens: 3070
main: init model
load_checkpoint: Training iterations: 0.
load_checkpoint: Training samples:    0.
load_checkpoint: Training tokens:     0.
main: opt iter 0
used_mem model+cache: 244461568 bytes
main: begin training
main: opt->params.adam.sched 0.00000
==================
WARNING: ThreadSanitizer: data race (pid=2367)
  Read of size 8 at 0x000c031d0000 by thread T11:
    #0 ggml_compute_forward_flash_attn_back ggml.c:13819 (train-text-from-scratch:arm64+0x10007d44c) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #1 ggml_compute_forward ggml.c:14419 (train-text-from-scratch:arm64+0x1000509e4) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #2 ggml_graph_compute_thread ggml.c:15519 (train-text-from-scratch:arm64+0x10004fff8) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)

  Previous write of size 8 at 0x000c031d0000 by thread T12:
    #0 ggml_compute_forward_flash_attn_back ggml.c:13819 (train-text-from-scratch:arm64+0x10007d468) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #1 ggml_compute_forward ggml.c:14419 (train-text-from-scratch:arm64+0x1000509e4) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #2 ggml_graph_compute_thread ggml.c:15519 (train-text-from-scratch:arm64+0x10004fff8) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)

  Location is heap block of size 8589934592 at 0x000c00010000 allocated by main thread:
    #0 operator new[](unsigned long) <null>:65021472 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x8ffbc) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 main train-text-from-scratch.cpp:3104 (train-text-from-scratch:arm64+0x100011a50) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)

  Thread T11 (tid=12825379, running) created by main thread at:
    #0 pthread_create <null>:65021472 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2fd88) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 ggml_graph_compute ggml.c:15563 (train-text-from-scratch:arm64+0x10004eeb0) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #2 ggml_opt_resume_g ggml.c:17529 (train-text-from-scratch:arm64+0x1000552d4) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #3 main train-text-from-scratch.cpp:3198 (train-text-from-scratch:arm64+0x1000120f8) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)

  Thread T12 (tid=12825380, running) created by main thread at:
    #0 pthread_create <null>:65021472 (libclang_rt.tsan_osx_dynamic.dylib:arm64e+0x2fd88) (BuildId: 981013a59ee23029b2ed90b76951327532000000200000000100000000000b00)
    #1 ggml_graph_compute ggml.c:15563 (train-text-from-scratch:arm64+0x10004eeb0) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #2 ggml_opt_resume_g ggml.c:17529 (train-text-from-scratch:arm64+0x1000552d4) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)
    #3 main train-text-from-scratch.cpp:3198 (train-text-from-scratch:arm64+0x1000120f8) (BuildId: d0b59247a937336f9b4e57c596e5a3a832000000200000000100000000000d00)

SUMMARY: ThreadSanitizer: data race ggml.c:13819 in ggml_compute_forward_flash_attn_back
==================

Here is full log of the 2 training runs + using main after that:

$ ▶ make -j && ./bin/train-text-from-scratch --vocab-model ../models/ggml-vocab.bin --mem-compute 12
Consolidate compiler generated dependencies of target ggml
[  2%] Generating build details from Git
-- Found Git: /opt/homebrew/bin/git (found version "2.39.0") 
[  5%] Built target ggml
[  5%] Built target BUILD_INFO
Consolidate compiler generated dependencies of target llama
[ 11%] Built target llama
Consolidate compiler generated dependencies of target quantize
Consolidate compiler generated dependencies of target test-tokenizer-0
Consolidate compiler generated dependencies of target quantize-stats
Consolidate compiler generated dependencies of target test-sampling
Consolidate compiler generated dependencies of target common
[ 22%] Built target test-quantize-fns
[ 22%] Built target test-quantize-perf
[ 31%] Built target test-tokenizer-0
[ 34%] Built target quantize
[ 40%] Built target test-sampling
[ 45%] Built target quantize-stats
[ 48%] Built target common
Consolidate compiler generated dependencies of target perplexity
Consolidate compiler generated dependencies of target main
Consolidate compiler generated dependencies of target embedding
Consolidate compiler generated dependencies of target train-text-from-scratch
Consolidate compiler generated dependencies of target save-load-state
[ 54%] Built target benchmark
[ 62%] Built target baby-llama
[ 65%] Built target q8dot
[ 71%] Built target vdot
[ 77%] Built target train-text-from-scratch
[ 82%] Built target perplexity
[ 88%] Built target embedding
[ 94%] Built target main
[100%] Built target save-load-state
llama.cpp: loading model from ../models/ggml-vocab.bin
llama_model_load_internal: format     = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 1 (mostly F16)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
main: tokenize training data
main: number of training tokens: 27584
print_params: n_vocab: 32000
print_params: n_ctx:   128
print_params: n_embd:  256
print_params: n_mult:  256
print_params: n_head:  8
print_params: n_ff:    768
print_params: n_layer: 16
print_params: n_rot:   32
main: number of unique tokens: 3070
main: init model
load_checkpoint: Training iterations: 0.
load_checkpoint: Training samples:    0.
load_checkpoint: Training tokens:     0.
main: opt iter 0
used_mem model+cache: 244461568 bytes
main: begin training
main: opt->params.adam.sched 0.00000
Example 0, opt iter 1
error_before_opt: 62.237373
error_after_opt:  62.237373
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Example:
---
 several plot,
Which my heart knows the wide world's common place?
Or mine eyes seeing this, say this is not
To put fair truth upon so foul a face?
In things right true my heart and eyes have erred,
And to this false plague are they now transferred.



When my love swears that she is made of truth,
I do believe her though I know she lies,
That she might think me some untutored youth,
Unlearned in the world's false subtleties.
Thus vainly thinking that she
--
 keep her treasure!
Her audit (though delayed) answered must be,
And her quietus is to render thee.



In the old age black was not counted fair,
Or if it were it bore not beauty's name:
But now is black beauty's successive heir,
And beauty slandered with a bastard shame,
For since each hand hath put on nature's power,  
Fairing the foul with art's false borrowed face,
Sweet beauty hath no name no holy bower,
But is profaned
--
ise the deep vermilion in the rose,
They were but sweet, but figures of delight:
Drawn after you, you pattern of all those.
Yet seemed it winter still, and you away,
As with your shadow I with these did play.



The forward violet thus did I chide,
Sweet thief, whence didst thou steal thy sweet that smells,
If not from my love's breath? The purple pride
Which on thy soft check for complexion dwells,
In my love's veins thou hast too
--
 memory.
Then the conceit of this inconstant stay,
Sets you most rich in youth before my sight,
Where wasteful time debateth with decay
To change your day of youth to sullied night,
And all in war with Time for love of you,
As he takes from you, I engraft you new.
  
But wherefore do not you a mightier way
Make war upon this bloody tyrant Time?
And fortify your self in your decay
With means more blessed than my barren rhyme?
Now stand you on the top of
--

No it was builded far from accident,
It suffers not in smiling pomp, nor falls
Under the blow of thralled discontent,
Whereto th' inviting time our fashion calls:
It fears not policy that heretic,
Which works on leases of short-numbered hours,
But all alone stands hugely politic,  
That it nor grows with heat, nor drowns with showers.
To this I witness call the fools of time,
Which die for goodness, who have lived for crime.



--
 to time your own dear-purchased right,
That I have hoisted sail to all the winds
Which should transport me farthest from your sight.
Book both my wilfulness and errors down,
And on just proof surmise, accumulate,
Bring me within the level of your frown,
But shoot not at me in your wakened hate:
Since my appeal says I did strive to prove
The constancy and virtue of your love.



Like as to make our appetite more keen
With eager compounds we our pal
--
art-complexioned night,
When sparkling stars twire not thou gild'st the even.
But day doth daily draw my sorrows longer,
And night doth nightly make grief's length seem stronger

When in disgrace with Fortune and men's eyes,
I all alone beweep my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon my self and curse my fate,
Wishing me like to one more rich in hope,
Featured like him, like him with friends possessed,

--
 be, or your affairs suppose,
But like a sad slave stay and think of nought
Save where you are, how happy you make those.
So true a fool is love, that in your will,
(Though you do any thing) he thinks no ill.

That god forbid, that made me first your slave,
I should in thought control your times of pleasure,
Or at your hand th' account of hours to crave,
Being your vassal bound to stay your leisure.
O let me suffer (being at your beck)
Th
--

---
samples after optimization:
---
topic aj aj ajількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиількиitoreількиількиількиitoreількиількиitoreількиількиitoreitoreitoreitoreількиitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitore故itoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitoreitore☺☺itoreitoreitoreitoreitoreitoreitoreitore☺itore☺itore☺itoreitoreitoreitore☺☺itore☺☺itoreitore☺☺itoreitore☺☺☺☺☺☺☺☺☺☺☺故itore☺☺☺itore☺☺☺☺☺☺☺☺
--
topictopictopictopicheet aj &= &= &= &= &= &= &= &= &= &= &= &= &= &=heet &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= &= prz &= prz &= &= software prz prz &= prz prz prz prz prz prz &= prz prz &= &= prz prz prz prz software prz prz prz prz software prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz prz przagy prz prz prz software prz prz prz prz prz prz prz prz prz prz prz
--
topic aj tweede ajheetшогошогоheet Tochter Tochterшого dro Tochterheetforce Lukeschemasforceschemasschemasschemasschemasforceschemasschemasschemasforceschemasforceschemasschemasschemasschemasschemasforceschemasforceforceschemasschemasschemasschemasschemasforceschemasschemasforceschemas struggleyll indicschemas struggleschemasschemasschemasyllyllyllyllyllyll lung indicheet indicyllyllyllyllyllyllyll lung lungyllyllyll lungyll indicyllyllyllyllyllyllyllyllyllyllyllyllyllyllyll lungyll indicyllyllyllyllyllyllyll indicyllyllyllyllyllyllyllyllyllyllyllyllyllyllyllyllyllyllyllyllyll
--
topictopictopictopic ries ries monument monument monument tweedeforce%)%)%)%) április%)%)ű%)Method%)%)%)%)MethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethodMethod
--
topic aj aj coordinate coordinate coordinate exploollarollarollar &=ollar janu janu Kasollar Kasollar janu Kas Kas Kas Kas Kasollarollar Kas Kas Kas Kas Kas Kas Kas Kasollar Kas Kas Kas Kas Kas Kas Kas Kas Kas Kas janu Kasollar Kas dro Kas dro Kas dro Kas znajdu dro dro Sid znajdu dro znajdu znajdu Kas znajdu znajdu znajdu znajdu znajdu znajdu znajdu dro znajduƒ znajdu znajdu znajdu Sid znajdu znajdu Kas znajdu znajdu dro znajdu znajdu znajdu znajdu znajdu znajdu znajdu znajdu znajdu znajdu znajduƒ znajdu znajdu znajduƒ znajdu znajdu znajduƒ znajdu znajdu znajdu znajdu znajdu znajdu znajduƒThere znajdu znajduThere znajdu znajdu znajdu flagƒ znajduThereƒ znajduThereThereThere
--
topictopic aj aj aj aj aj aj aj aj Value ajollarEncoding ries riesollarollarollarollarollarollarollarollarollarollarollarollarollar riesollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarollarEncoding riesollarollarollar riesollarollarollarollar ries riesollarollarollarollar riesollarollarollarollar riesollar ries riesollarollarollar riesollarollarollarollarollar riesollarollar riesollarollarollarollarollarollarollar siguiente riesollarollarollarollar riesollarollarollarollarollarollar siguiente siguiente%) ries ries riesollar riesECTollar%)ollar siguiente%)%) ries siguiente
--
topic aj ajtopictopictopictopictopictopictopictopictopictopictopictopicEncodingEncodingEncoding riesEncoding riesEncoding ries riesEncoding ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries ries dro ries ries ries dro dro ries ries dro rieschrome dro ries ries dro ries dro dro dro ries ries rieschrome dro dro ries ries ries dro dro dro ries ries dro rieschrome dro dro ries dro drochrome dro riesECT dro drochrome dro ries dro ries riesECT ries dro riesECT dro dro dro droECT ries dro dro ries droECT dro dro dro dro
--
topictopictopic &= ries ries ries &= ries &= ries ax ries ries ax ries ries ries ax ries ries ries%) ries ries%) ries ries Eug%)%) ries%) ries ries%)obiernoobierno%)%)%)obierno%)obierno%)obierno%)obierno%)obiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobierno%)%)%)obierno%)obiernoobiernoobiernoobierno%)obiernoриторobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobiernoobierno
--

---
main: opt->params.adam.sched 0.01000
Example 1, opt iter 3
error_before_opt: 62.231613
error_after_opt:  62.230423
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
main: opt->params.adam.sched 0.03000
Example 2, opt iter 19
error_before_opt: 62.245007
error_after_opt:  62.209435
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Example:
---
,
And under thee their poesy disperse.
Thine eyes, that taught the dumb on high to sing,
And heavy ignorance aloft to fly,
Have added feathers to the learned's wing,
And given grace a double majesty.
Yet be most proud of that which I compile,
Whose influence is thine, and born of thee,
In others' works thou dost but mend the style,
And arts with thy sweet graces graced be.
But thou art all my art, and dost advance
As high as learning
--
 his memory:
But thou contracted to thine own bright eyes,
Feed'st thy light's flame with self-substantial fuel,
Making a famine where abundance lies,
Thy self thy foe, to thy sweet self too cruel:
Thou that art now the world's fresh ornament,
And only herald to the gaudy spring,
Within thine own bud buriest thy content,
And tender churl mak'st waste in niggarding:
Pity the world, or else this glutton be,
To
--
,
For thy records, and what we see doth lie,
Made more or less by thy continual haste:
This I do vow and this shall ever be,
I will be true despite thy scythe and thee.



If my dear love were but the child of state,
It might for Fortune's bastard be unfathered,
As subject to time's love or to time's hate,
Weeds among weeds, or flowers with flowers gathered.
No it was builded far from accident,
It suffers not in sm
--
-day by feeding is allayed,
To-morrow sharpened in his former might.
So love be thou, although to-day thou fill
Thy hungry eyes, even till they wink with fulness,
To-morrow see again, and do not kill
The spirit of love, with a perpetual dulness:
Let this sad interim like the ocean be
Which parts the shore, where two contracted new,
Come daily to the banks, that when they see:
Return of love, more blest may be the view.
Or call
--
That to my use it might unused stay
From hands of falsehood, in sure wards of trust!
But thou, to whom my jewels trifles are,
Most worthy comfort, now my greatest grief,
Thou best of dearest, and mine only care,
Art left the prey of every vulgar thief.  
Thee have I not locked up in any chest,
Save where thou art not, though I feel thou art,
Within the gentle closure of my breast,
From whence at pleasure thou mayst come and part,
And
--

Then can I grieve at grievances foregone,
And heavily from woe to woe tell o'er
The sad account of fore-bemoaned moan,
Which I new pay as if not paid before.
But if the while I think on thee (dear friend)
All losses are restored, and sorrows end.
  
Thy bosom is endeared with all hearts,
Which I by lacking have supposed dead,
And there reigns love and all love's loving parts,
And all those friends which I thought buried
--

Each changing place with that which goes before,
In sequent toil all forwards do contend.
Nativity once in the main of light,
Crawls to maturity, wherewith being crowned,
Crooked eclipses 'gainst his glory fight,
And Time that gave, doth now his gift confound.
Time doth transfix the flourish set on youth,
And delves the parallels in beauty's brow,
Feeds on the rarities of nature's truth,
And nothing stands but for his
--
 my heart,
My body is the frame wherein 'tis held,
And perspective it is best painter's art.
For through the painter must you see his skill,
To find where your true image pictured lies,
Which in my bosom's shop is hanging still,
That hath his windows glazed with thine eyes:
Now see what good turns eyes for eyes have done,
Mine eyes have drawn thy shape, and thine for me
Are windows to my breast, where-through the sun
Delights to peep, to gaze therein on
--

---
samples after optimization:
---
Returnssss givenameame hisame hisameameameameameameameameameameameameameameameameameameameameameameameameameameameameameameameame thatameameameameameameameameameameameameameameameameameameameameameameameameameame hisameameameameameameameameameame thatameame thatame thatame thatameameameameameameameameame that thatameameameame thatameameame thatameameame that thatameameameameameame thatameame that thatameameame
--
Return eyes eyes false false false false false false false false false false false false false false false false false false false false false those falseI false riesIIIIIIII riesI riesIIIIIIIIIII riesIIIIII ries riesII riesIII riesIII riesIIII riesI riesII ries riesII riesIII riesI riesIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
--
Returnssameame those those his those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those those thoseIf those those those those thoseIf those those those those those those those those those those those those those those those those those those those those those those those thoseIf those those those thoseIf those thoseIf those those
--
ReturnReturnsssssssssSoSoSoameameameameSoameSoameSoameameameameameameameameameameameameame politameameameameameameameSoameesame politameameame polit polit polit politame politame polit politWh polit polit politWhWhWhWhWh politesWhWhWhWhWhWhWh politWhWhWhWhWhWh politWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWhWh
--
Return'''''''''''''''''' living livingSoSoSoSo livingSoSoSoSo livingSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSoSo
--
Return''' those those those those those those those those thoses those those those those those those thoses those those those those those those those those those those those those those those those those those those those those those those those those those those those those those thoseTime those those those those those those those thoseTimeTimeTime thoseTimeTimeTimeTimeTimeTimeTimeTimeTimeTime thoseTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTimeTime
--
Return'''sIsss hisss hissss his hiss hiss hiss his m ms m his m m his m m m m proud m m m m m m mame m m m m m m m mame m m m m m m m mame m m mame m mame mameameame m mame mame m mameameame mame m m m mameameame mameameameameameameameameameameameameameameameameameameameameameameameameame mameameameameameameameame
--
Return' eyes eyes eyes eyes eyes eyesI eyesIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII thoseI done thoseI done those done doneI done done done those done those done doneI those done done done done done those done those done done done done done done done done done done done those done done done done done done done done done done done done done those done done done done done done done done done done done those done done done done done done done done done done done done
--

---
main: opt->params.adam.sched 0.19000
Example 3, opt iter 34
error_before_opt: 62.213215
error_after_opt:  62.114693
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
main: opt->params.adam.sched 0.34000
Example 4, opt iter 35
error_before_opt: 62.154037
error_after_opt:  62.153812
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Example:
---
Thy youth's proud livery so gazed on now,
Will be a tattered weed of small worth held:  
Then being asked, where all thy beauty lies,
Where all the treasure of thy lusty days;
To say within thine own deep sunken eyes,
Were an all-eating shame, and thriftless praise.
How much more praise deserved thy beauty's use,
If thou couldst answer 'This fair child of mine
Shall sum my count, and make my old excuse'
Proving his beauty
--
 graces and your gifts to tell.
And more, much more than in my verse can sit,
Your own glass shows you, when you look in it.



To me fair friend you never can be old,
For as you were when first your eye I eyed,
Such seems your beauty still: three winters cold,
Have from the forests shook three summers' pride,
Three beauteous springs to yellow autumn turned,
In process of the seasons have I seen,
Three April perfumes in three hot Junes burned,
Since
--
ed accidents
Creep in 'twixt vows, and change decrees of kings,
Tan sacred beauty, blunt the sharp'st intents,
Divert strong minds to the course of alt'ring things:
Alas why fearing of time's tyranny,
Might I not then say 'Now I love you best,'
When I was certain o'er incertainty,
Crowning the present, doubting of the rest?
Love is a babe, then might I not say so
To give full growth to that which
--
inary sight
Presents thy shadow to my sightless view,
Which like a jewel (hung in ghastly night)
Makes black night beauteous, and her old face new.
Lo thus by day my limbs, by night my mind,
For thee, and for my self, no quiet find.

How can I then return in happy plight
That am debarred the benefit of rest?
When day's oppression is not eased by night,
But day by night and night by day oppressed.
And each (though enemies
--
 now nature bankrupt is,
Beggared of blood to blush through lively veins,
For she hath no exchequer now but his,
And proud of many, lives upon his gains?
O him she stores, to show what wealth she had,
In days long since, before these last so bad.

Thus is his cheek the map of days outworn,
When beauty lived and died as flowers do now,
Before these bastard signs of fair were born,
Or durst inhabit on a living brow:
Before the golden tresses of
--
, and I straight will halt:
Against thy reasons making no defence.
Thou canst not (love) disgrace me half so ill,
To set a form upon desired change,
As I'll my self disgrace, knowing thy will,
I will acquaintance strangle and look strange:
Be absent from thy walks and in my tongue,
Thy sweet beloved name no more shall dwell,
Lest I (too much profane) should do it wronk:
And haply of our old acquaintance tell.  
For thee
--
ing new hate after new love bearing:
But why of two oaths' breach do I accuse thee,  
When I break twenty? I am perjured most,
For all my vows are oaths but to misuse thee:
And all my honest faith in thee is lost.
For I have sworn deep oaths of thy deep kindness:
Oaths of thy love, thy truth, thy constancy,
And to enlighten thee gave eyes to blindness,
Or made them swear against the thing they see.
For I have sw
--
er whom thy fingers walk with gentle gait,
Making dead wood more blest than living lips,
Since saucy jacks so happy are in this,
Give them thy fingers, me thy lips to kiss.



Th' expense of spirit in a waste of shame
Is lust in action, and till action, lust
Is perjured, murd'rous, bloody full of blame,
Savage, extreme, rude, cruel, not to trust,
Enjoyed no sooner but despised straight,
Past reason h
--

---
samples after optimization:
---
 learn learn learn learn learn learn learn learn learn learn learn learn learn their learn learn their their their their their their their their their their o o o o o their o their o o o o o o o o o o o o o o o o o o o o o o o o o o oph o o o o o o o o o o o o o o oph o o o o o o o o oph oph o o o oph o o ophphph oph ophphph o ophph oph o ophphphphphphphphphphphph o
--
 learn learn learn learn learn learnMyMyMyMy theeMy thee thee f f f f f f f f f f f f f f never f f f f f f f f f f f f f f f f f dist f f dist dist dist dist dist dist dist dist dist f dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist dist
--
 learn learn learn learn learn learn learn learn learnon learnononon oononononononon oonon oonon oonononononononon disonon oononononon o o oon o o o oon o lim o o obe obebebeonbe obebebebebebebe limbebe o limbe o obebebebe limbebe limbebe limbebe limbe limbe lim o obe lim lim limbebebebe lim limbebebebebebe lim lim limbebe lim lim limbe
--
 learnances learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn o learn learn o o o o never o o o o o never o o o o o o o o o o o never o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o o
--
 learn learn learn learnoniousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousiousious proceedious proceediousious limious lim lim lim limious limph lim lim limilledphphphphphphphph limphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphphph
--
 learn learn learn learn learn learn learn learn learn learn learn learn learn learn never never never never never never never never never never never never never never never never never never never never never never never never never lim never never lim never never never never never never never never never never never never never never never never never never never never lim lim never never never lim lim lim never never never lim lim lim never never lim never lim never lim lim never never dist lim lim lim lim lim lim lim lim never lim lim lim lim lim limlyinglying lim lim lim lim lim lim lim lim lim lim dist lim lim dist lim lim lim limlyinglying lim distlying
--
 learn learn learn learn learn learn learn learn learn learn learn learn learnc inj learn learn inj learn learn never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never neveruthuth never never limuthuthuth neveruth limuthuth never neveruthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuthuth
--
 learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn learn never their never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never never face face never
--

---
main: opt->params.adam.sched 0.35000
Example 5, opt iter 48
error_before_opt: 62.186760
error_after_opt:  61.973248
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
main: opt->params.adam.sched 0.48000
Example 6, opt iter 64
error_before_opt: 62.000847
error_after_opt:  61.660248
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
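Side note on the `opt->params.adam.sched` lines: across the log the printed sched tracks `opt iter / 100` (0.03 at iter 3, 0.19 at iter 19 going into Example 3, 0.48 at iter 48, and so on), which is consistent with a linear learning-rate warmup over roughly 100 iterations. A minimal sketch of that reading, where the 100-iteration warmup constant and the `adam_sched` name are assumptions inferred from the log, not taken from the code:

```python
# Hypothetical reconstruction of the warmup seen in the
# "opt->params.adam.sched" log lines: sched = opt_iter / warmup,
# clamped to 1.0. warmup=100 is read off the log, not the source.
def adam_sched(opt_iter: int, warmup_iters: int = 100) -> float:
    return min(opt_iter / warmup_iters, 1.0)

for it in (3, 19, 34, 48, 64, 80):
    print(f"iter {it:3d} -> sched {adam_sched(it):.5f}")
```

With this reading, the resumed run below (starting at opt iter 80) printing `sched 0.80000` fits the same schedule.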
Example:
---
 thou wouldst use the strength of all thy state!
But do not so, I love thee in such sort,
As thou being mine, mine is thy good report.



How like a winter hath my absence been
From thee, the pleasure of the fleeting year!
What freezings have I felt, what dark days seen!
What old December's bareness everywhere!  
And yet this time removed was summer's time,
The teeming autumn big with rich increase,
Bearing the wanton burden of the prime,
Like widowed w
--

How can I then return in happy plight
That am debarred the benefit of rest?
When day's oppression is not eased by night,
But day by night and night by day oppressed.
And each (though enemies to either's reign)
Do in consent shake hands to torture me,
The one by toil, the other to complain
How far I toil, still farther off from thee.  
I tell the day to please him thou art bright,
And dost him grace when clouds do blot the heaven:
So flatter
--
 something sweet to thee.
Make but my name thy love, and love that still,
And then thou lov'st me for my name is Will.



Thou blind fool Love, what dost thou to mine eyes,
That they behold and see not what they see?
They know what beauty is, see where it lies,
Yet what the best is, take the worst to be.
If eyes corrupt by over-partial looks,  
Be anchored in the bay where all men ride,
Why of eyes' falsehood hast thou forged hooks,
W
--
And every fair with his fair doth rehearse,
Making a couplement of proud compare
With sun and moon, with earth and sea's rich gems:
With April's first-born flowers and all things rare,
That heaven's air in this huge rondure hems.
O let me true in love but truly write,
And then believe me, my love is as fair,
As any mother's child, though not so bright
As those gold candles fixed in heaven's air:
Let them say more that like of hearsay well,

--
 second self that seals up all in rest.  
In me thou seest the glowing of such fire,
That on the ashes of his youth doth lie,
As the death-bed, whereon it must expire,
Consumed with that which it was nourished by.
This thou perceiv'st, which makes thy love more strong,
To love that well, which thou must leave ere long.



But be contented when that fell arrest,
Without all bail shall carry me away,
My life hath in this line some interest,
--
's furrows I behold,
Then look I death my days should expiate.  
For all that beauty that doth cover thee,
Is but the seemly raiment of my heart,
Which in thy breast doth live, as thine in me,
How can I then be elder than thou art?
O therefore love be of thyself so wary,
As I not for my self, but for thee will,
Bearing thy heart which I will keep so chary
As tender nurse her babe from faring ill.
Presume not on
--
ed, delivered from thy brain,
To take a new acquaintance of thy mind.
These offices, so oft as thou wilt look,
Shall profit thee, and much enrich thy book.



So oft have I invoked thee for my muse,
And found such fair assistance in my verse,
As every alien pen hath got my use,
And under thee their poesy disperse.
Thine eyes, that taught the dumb on high to sing,
And heavy ignorance aloft to fly,
Have added feathers to the learned's
--
 cure I am, now reason is past care,
And frantic-mad with evermore unrest,
My thoughts and my discourse as mad men's are,
At random from the truth vainly expressed.
For I have sworn thee fair, and thought thee bright,
Who art as black as hell, as dark as night.

O me! what eyes hath love put in my head,
Which have no correspondence with true sight,
Or if they have, where is my judgment fled,
That censures falsely what they see aright?
If that
--

---
samples after optimization:
---
 c c am am fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fairss fairssssss fairssssssssssssss asss ass as a as as a a as a a a a a a as a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
--
 c c am as as fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fairs fair fair fair fair in in in fairs ins in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 c c am am am as as as as as as as as as as as as as as as as as as as as as as as as as as fair as as as as as as fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair
--
 c c would am fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
--
 c c c would am am am am am am am am am am am am am am am am fair amssssssssssssssssssssssssssssssssssssssssssssssssss  sss  ssss    s  s    s  s                          s                                              
--
 c c would would fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair
--
 c c would as fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair fair a fair fair fair a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
--
 c c am am am am am am am am am am am am as as sweet as as as fairss fairsssssssssssssssssssssssss a as a a as a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
--

---
main: opt->params.adam.sched 0.64000
Example 7, opt iter 80
error_before_opt: 62.058563
error_after_opt:  60.993385
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Generating 1024 tokens.
 is not eased by night,
But day by night and night by day oppressed.
And each (though enemies to either's reign)
Do in consent shake hands to torture me,
The one by toil, the other to complain
How far I toil, still farther off from thee.  
I tell the day to please him thou art bright,
And dost him grace when clouds do blot the heaven:
So flatter I the swart-complexioned night,
When
---
 in what what me what what what what is therethough in w they what eyes sweet what modern I must new what flowers seemMy inTh what   loves when name and where in which bar whatce what what in me or what thanoth youth from sickoth what nevering what worth what what society child what light know what what mo added so well meles what what seede what what blood whatD heaven absenceay I g is is what true far what' whatIfear sight when whately thoughtichBut the what whatakingakingY what anch what tell what thing thatWithAnd this am in what from whatThey look what strong what what pale fool what fingers whatThat what show whatfound what any whatar meAs what hopes whatWhen what thats what should what I sweet what rare what et what opp whatinary what have deb but by what whating what whatbundle purs tongueThe mo what happy what what ocean what new have what come glends star dort what modern altoaking whatrieb what whatологи whatP they what what afford. thought come truly what star practice meO what youthins I that what whatarsening, what rondstaw modern it ' they whatbed what wh wealth what heaven well truly what what flowers quick look the chero pursycigeq what whatesség what take what modern(% but what se підve whatButMTheyInence what whataking learning having couarse beh return re pursThey Love Mira foolarsearsearseOT know what worst whatPres what on hell inAs: isage more whatWh b. 
what what wThey proudThey what rudThen what pot what reasons what fair barメ what bearW what methods methods瀬 glory what whatren Су whatiment hell what what those m art what war what what ste with g what earth things whatLet what astronom whatAg what still modern Jimmy what che frat fair   truly purs Teams whatlow Brand starofdebug star new what speakingfeantom so what writepass what:pass toI but in thyWhy what a' star whatepedsakingarsearse what trulyangle gaz me rond rondarserierearsearse发 modern age what must heaven modern органіThat what what where what content what与 have what what while can though what tears what what lips allThen whatth what hathsWhat what strong whatamen what by bar gra inIn so what what content what happy so is truly say
From what do' whatals there me and whatIs they fair, whenO whatren will thatAsaking what proud whatO what ext modern Schl what truly ros that what richted what lives whatWhy see canMy by areBut what)IothMy is as love in in what elder what two trulyarse dí purs whatrepo what cou eranbed modernusage than rond poi rondaking Ps what prime whatth what dost methods geom modern cheusername your believe what what- heaven deb sunt what modern star quasi what reieron what day sc opp what what what huge what тра far what what return whatAh what fairer mightThat what what aside datab throughSo what when whatWhen fair
 mine do in breastTheyarsearseW rond WrestlingakingThey Value what needsTheyaking crownorn shouldThey Love worstrowser Italysourceforge Отеbedih what gentle what sin what making what halt knowren three me time bar youth   my whatold ofiv meil is m whating darks ever thatoe have,
 though what he black by- where in   whatThe what whaty modern asWith what purs roku modernDel modernren think what what Houlowarsethreads methodsThey Att modern iter ant reform asideच upon quick quickofwehr starfedelferen parlamentWith whatust what expressed do ison is it me? what whataking Fort account as
 heart what to eyes have in me by what that foolBut what is what the what modernThis whatow what год purs what wo what dost? whatbedgers in whatarse iterarter seThey acqu Love child what signs whatainer what lines what dark fair anAnd thys! what whenFor   removed my thee. what what one due that I from and and fars
, with things bank of black is it in whatpt winter what do what modern Connection what purs Außerdem modernakingeder what what description modern us whatfeSpeending grown what methodsThank what quick честь what truly proc whatarse lum what modern Lear modernusly it fe me whatLike the so what variation what che♭mad whatW be whatAnd if thee ever is such ( me what what what what app strange growing methods purs/- modern buried what ext other what thou whatạ supposed what whatren g whatruptTheyarse préc one when what what what rond what cThe what,.  ? that hell
 worth again in accounts though whatWhen it in from my whatbeOr as whatThat ofary what thee what name what be ch no whatB isractionLet must what me whatBeaking modern ekonomhoodarse what pleasure rondse trulychod returnWith what locked vbedThey lo what tender rondBinding rond rondiring first what soizing report lov for, thy

ggerganov ▶ Georgis-MBP ▶ ~/development/github/llama.cpp/build-release ▶ 11:38:43 ▶ ⚓ master-7552ac5-77-g6b7487d ▶ ? ▶ 13⎘ ▶ $ ▶ make -j && ./bin/train-text-from-scratch --vocab-model ../models/ggml-vocab.bin --mem-compute 12
[  2%] Generating build details from Git
-- Found Git: /opt/homebrew/bin/git (found version "2.39.0") 
[  5%] Built target ggml
[  5%] Built target BUILD_INFO
[ 11%] Built target llama
[ 28%] Built target test-quantize-fns
[ 40%] Built target test-quantize-perf
[ 37%] Built target test-tokenizer-0
[ 40%] Built target quantize
[ 40%] Built target quantize-stats
[ 45%] Built target test-sampling
[ 48%] Built target common
[ 54%] Built target vdot
[ 68%] Built target q8dot
[ 77%] Built target train-text-from-scratch
[ 77%] Built target save-load-state
[ 77%] Built target benchmark
[ 88%] Built target embedding
[ 88%] Built target baby-llama
[ 94%] Built target main
[100%] Built target perplexity
llama.cpp: loading model from ../models/ggml-vocab.bin
llama_model_load_internal: format     = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 512
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 1 (mostly F16)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
main: tokenize training data
main: number of training tokens: 27584
print_params: n_vocab: 32000
print_params: n_ctx:   128
print_params: n_embd:  256
print_params: n_mult:  256
print_params: n_head:  8
print_params: n_ff:    768
print_params: n_layer: 16
print_params: n_rot:   32
main: number of unique tokens: 3070
main: init model
load_checkpoint: Loading model from 'checkpoint.bin'.
print_params: n_vocab: 32000
print_params: n_ctx:   128
print_params: n_embd:  256
print_params: n_mult:  256
print_params: n_head:  8
print_params: n_ff:    768
print_params: n_layer: 16
print_params: n_rot:   32
load_checkpoint: Training iterations: 80.
load_checkpoint: Training samples:    64.
load_checkpoint: Training tokens:     8192.
main: opt iter 80
used_mem model+cache: 1085133568 bytes
main: begin training
main: opt->params.adam.sched 0.80000
Example 0, opt iter 96
error_before_opt: 60.891148
error_after_opt:  58.107105
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
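The loss trend is easiest to see when pulled out of the log: after resuming from the checkpoint, `error_before_opt`/`error_after_opt` drop from ~62 toward ~58. A minimal sketch for extracting that curve, assuming only the exact line format shown in these logs (the two sample entries below are copied from the session above):

```python
import re

# Two log blocks copied verbatim from the session above.
LOG = """\
Example 7, opt iter 80
error_before_opt: 62.058563
error_after_opt:  60.993385
Example 0, opt iter 96
error_before_opt: 60.891148
error_after_opt:  58.107105
"""

# Matches "Example N, opt iter M" followed by the two error lines.
pattern = re.compile(
    r"Example (\d+), opt iter (\d+)\n"
    r"error_before_opt: ([\d.]+)\n"
    r"error_after_opt:\s+([\d.]+)"
)

curve = [(int(it), float(before), float(after))
         for _, it, before, after in pattern.findall(LOG)]
for it, before, after in curve:
    print(f"iter {it:3d}: {before:.4f} -> {after:.4f}")
```

Fed the full log, the same pattern yields the whole per-example loss curve for plotting.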
Example:
---
 mourn for me when I am dead,
Than you shall hear the surly sullen bell
Give warning to the world that I am fled
From this vile world with vilest worms to dwell:
Nay if you read this line, remember not,
The hand that writ it, for I love you so,
That I in your sweet thoughts would be forgot,
If thinking on me then should make you woe.
O if (I say) you look upon this verse,
When I (perhaps) compounded am with clay,
Do not so much
--
 taught it thus anew to greet:
'I hate' she altered with an end,
That followed it as gentle day,
Doth follow night who like a fiend
From heaven to hell is flown away.
'I hate', from hate away she threw,
And saved my life saying 'not you'.

Poor soul the centre of my sinful earth,  
My sinful earth these rebel powers array,
Why dost thou pine within and suffer dearth
Painting thy outward walls so costly gay?
Why so large cost having so short
--
 excuse will my poor beast then find,
When swift extremity can seem but slow?
Then should I spur though mounted on the wind,
In winged speed no motion shall I know,
Then can no horse with my desire keep pace,
Therefore desire (of perfect'st love being made)
Shall neigh (no dull flesh) in his fiery race,
But love, for love, thus shall excuse my jade,
Since from thee going, he went wilful-slow,
Towards thee I'll run, and give him leave to go
--
,
As testy sick men when their deaths be near,
No news but health from their physicians know.
For if I should despair I should grow mad,
And in my madness might speak ill of thee,
Now this ill-wresting world is grown so bad,
Mad slanderers by mad ears believed be.
That I may not be so, nor thou belied,
Bear thine eyes straight, though thy proud heart go wide.



In faith I do not love thee with mine eyes,  
For they in thee a thousand errors
--
 sparkling stars twire not thou gild'st the even.
But day doth daily draw my sorrows longer,
And night doth nightly make grief's length seem stronger

When in disgrace with Fortune and men's eyes,
I all alone beweep my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon my self and curse my fate,
Wishing me like to one more rich in hope,
Featured like him, like him with friends possessed,
Desiring this man's art,
--
To him that bears the strong offence's cross.  
Ah but those tears are pearl which thy love sheds,
And they are rich, and ransom all ill deeds.



No more be grieved at that which thou hast done,
Roses have thorns, and silver fountains mud,
Clouds and eclipses stain both moon and sun,
And loathsome canker lives in sweetest bud.
All men make faults, and even I in this,
Authorizing thy trespass with compare,
My
--
 of him I'll live in this poor rhyme,
While he insults o'er dull and speechless tribes.
And thou in this shalt find thy monument,
When tyrants' crests and tombs of brass are spent.



What's in the brain that ink may character,
Which hath not figured to thee my true spirit,
What's new to speak, what now to register,
That may express my love, or thy dear merit?
Nothing sweet boy, but yet like prayers divine,
I must each
--

O let it then as well beseem thy heart
To mourn for me since mourning doth thee grace,
And suit thy pity like in every part.
Then will I swear beauty herself is black,
And all they foul that thy complexion lack.



Beshrew that heart that makes my heart to groan
For that deep wound it gives my friend and me;
Is't not enough to torture me alone,
But slave to slavery my sweet'st friend must be?
Me from my self thy cruel eye hath taken,

--

---
samples after optimization:
---
 thislyly him him him him him him him him this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this him him this this him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him him
--
 thislylylylyly himly him him this this this this this this this this this him this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this him this this this this this this this this this this this this him this this this this him this him him thislyly him thisly this himlyly him him him him himlylyly him thislylylyly him himly himly himly himlyly him himlyly him himlyly himlylylylyly him
--
 thislylyly himlylylylylylylyly himlylylylylylylylylylyly himlyly thisly this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this
--
 this him him this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this
--
 thislylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylyly this thislyly this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this
--
 thislyly him him him this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this this
--
 thislylyly him him him him him him him him him him him him him him him him him him him him him him him him himly himlyly him him him him himly him him him him him him him him him him him himly him himlyly himlylylylylylylyly him him himly himlylylylyly him himlylylylylyly himlylylylylyly himlylyly himlylylylylylylylylylyly himlylylylylylylylylylylylylylylylylylylylyly
--
 this him him him him him this this this this this this this this this this this this this him this this him him him him him him him him him himly him him him him him him himlylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylylyly
--

---
main: opt->params.adam.sched 0.96000
Example 1, opt iter 112
error_before_opt: 57.475227
error_after_opt:  50.372532
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
main: opt->params.adam.sched 0.99964
Example 2, opt iter 128
error_before_opt: 51.942535
error_after_opt:  42.980034
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Example:
---

The boy for trial needs would touch my breast,
I sick withal the help of bath desired,
And thither hied a sad distempered guest.
But found no cure, the bath for my help lies,
Where Cupid got new fire; my mistress' eyes.

The little Love-god lying once asleep,
Laid by his side his heart-inflaming brand,
Whilst many nymphs that vowed chaste life to keep,
Came tripping by, but in her maiden hand,
The fairest votary took
--
,
As the perfumed tincture of the roses,
Hang on such thorns, and play as wantonly,
When summer's breath their masked buds discloses:
But for their virtue only is their show,
They live unwooed, and unrespected fade,
Die to themselves. Sweet roses do not so,
Of their sweet deaths, are sweetest odours made:  
And so of you, beauteous and lovely youth,
When that shall vade, by verse distills your truth.


--
Therefore are feasts so solemn and so rare,
Since seldom coming in that long year set,
Like stones of worth they thinly placed are,
Or captain jewels in the carcanet.
So is the time that keeps you as my chest
Or as the wardrobe which the robe doth hide,
To make some special instant special-blest,
By new unfolding his imprisoned pride.
Blessed are you whose worthiness gives scope,
Being had to triumph, being lacked to hope.

What is your substance
--
Before these bastard signs of fair were born,
Or durst inhabit on a living brow:
Before the golden tresses of the dead,
The right of sepulchres, were shorn away,
To live a second life on second head,
Ere beauty's dead fleece made another gay:  
In him those holy antique hours are seen,
Without all ornament, it self and true,
Making no summer of another's green,
Robbing no old to dress his beauty new,
And him as for a map doth Nature store,
--
 love, and look for recompense,
More than that tongue that more hath more expressed.
O learn to read what silent love hath writ,
To hear with eyes belongs to love's fine wit.

Mine eye hath played the painter and hath stelled,
Thy beauty's form in table of my heart,
My body is the frame wherein 'tis held,
And perspective it is best painter's art.
For through the painter must you see his skill,
To find where your true image pictured lies,
Which in my bosom's shop
--
Which die for goodness, who have lived for crime.



Were't aught to me I bore the canopy,
With my extern the outward honouring,
Or laid great bases for eternity,
Which proves more short than waste or ruining?
Have I not seen dwellers on form and favour
Lose all, and more by paying too much rent
For compound sweet; forgoing simple savour,
Pitiful thrivers in their gazing spent?
No, let me be obsequious in thy heart,
And
--
Be absent from thy walks and in my tongue,
Thy sweet beloved name no more shall dwell,
Lest I (too much profane) should do it wronk:
And haply of our old acquaintance tell.  
For thee, against my self I'll vow debate,
For I must ne'er love him whom thou dost hate.



Then hate me when thou wilt, if ever, now,
Now while the world is bent my deeds to cross,
join with the spite of fortune, make me bow,
And do not
--
 says in him thy fair appearance lies.
To side this title is impanelled
A quest of thoughts, all tenants to the heart,
And by their verdict is determined
The clear eye's moiety, and the dear heart's part.
As thus, mine eye's due is thy outward part,
And my heart's right, thy inward love of heart.

Betwixt mine eye and heart a league is took,
And each doth good turns now unto the other,
When that mine eye is famished for a look,
Or heart
--

---
samples after optimization:
---
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--
 in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in in
--

---
main: opt->params.adam.sched 0.99807
Example 3, opt iter 144
error_before_opt: 53.764343
error_after_opt:  48.985619
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
main: opt->params.adam.sched 0.99523
Example 4, opt iter 160
error_before_opt: 45.549255
error_after_opt:  39.682442
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Example:
---
 slave stay and think of nought
Save where you are, how happy you make those.
So true a fool is love, that in your will,
(Though you do any thing) he thinks no ill.

That god forbid, that made me first your slave,
I should in thought control your times of pleasure,
Or at your hand th' account of hours to crave,
Being your vassal bound to stay your leisure.
O let me suffer (being at your beck)
Th' imprisoned absence of your liberty,
And patience
--
 the object whereupon it gazeth,
A man in hue all hues in his controlling,
Which steals men's eyes and women's souls amazeth.
And for a woman wert thou first created,
Till nature as she wrought thee fell a-doting,
And by addition me of thee defeated,
By adding one thing to my purpose nothing.
But since she pricked thee out for women's pleasure,
Mine be thy love and thy love's use their treasure.
  
So is it not with me as with
--
So shall I live, supposing thou art true,
Like a deceived husband, so love's face,
May still seem love to me, though altered new:
Thy looks with me, thy heart in other place.
For there can live no hatred in thine eye,
Therefore in that I cannot know thy change,
In many's looks, the false heart's history
Is writ in moods and frowns and wrinkles strange.  
But heaven in thy creation did decree,
That in thy face sweet love should ever dwell,

--
 of thy worth gives thee releasing:
My bonds in thee are all determinate.  
For how do I hold thee but by thy granting,
And for that riches where is my deserving?
The cause of this fair gift in me is wanting,
And so my patent back again is swerving.
Thy self thou gav'st, thy own worth then not knowing,
Or me to whom thou gav'st it, else mistaking,
So thy great gift upon misprision growing,
Comes home again, on better judgement making
--
.
Duty so great, which wit so poor as mine
May make seem bare, in wanting words to show it;
But that I hope some good conceit of thine
In thy soul's thought (all naked) will bestow it:
Till whatsoever star that guides my moving,
Points on me graciously with fair aspect,
And puts apparel on my tattered loving,
To show me worthy of thy sweet respect,
Then may I dare to boast how I do love thee,
Till then, not show my head where
--
ine
In thy soul's thought (all naked) will bestow it:
Till whatsoever star that guides my moving,
Points on me graciously with fair aspect,
And puts apparel on my tattered loving,
To show me worthy of thy sweet respect,
Then may I dare to boast how I do love thee,
Till then, not show my head where thou mayst prove me.

Weary with toil, I haste me to my bed,
The dear respose for limbs with travel tired,
But then begins
--
 on the rarities of nature's truth,
And nothing stands but for his scythe to mow.
And yet to times in hope, my verse shall stand
Praising thy worth, despite his cruel hand.
  
Is it thy will, thy image should keep open
My heavy eyelids to the weary night?
Dost thou desire my slumbers should be broken,
While shadows like to thee do mock my sight?
Is it thy spirit that thou send'st from thee
So far from home into my deeds to pry,

--
 to death, oppressed with melancholy.
Until life's composition be recured,
By those swift messengers returned from thee,
Who even but now come back again assured,
Of thy fair health, recounting it to me.
This told, I joy, but then no longer glad,
I send them back again and straight grow sad.
  
Mine eye and heart are at a mortal war,
How to divide the conquest of thy sight,
Mine eye, my heart thy picture's sight would bar,
My heart, mine eye the
--

---
samples after optimization:
---
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
--

---
main: opt->params.adam.sched 0.99114
Example 5, opt iter 176
error_before_opt: 35.567841
error_after_opt:  33.275391
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
main: opt->params.adam.sched 0.98582
Example 6, opt iter 192
error_before_opt: 37.323936
error_after_opt:  36.570965
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Example:
---
 my love excuse the slow offence,
Of my dull bearer, when from thee I speed,
From where thou art, why should I haste me thence?
Till I return of posting is no need.
O what excuse will my poor beast then find,
When swift extremity can seem but slow?
Then should I spur though mounted on the wind,
In winged speed no motion shall I know,
Then can no horse with my desire keep pace,
Therefore desire (of perfect'st love being made)
Shall neigh (no dull
--
 youthful morn
Hath travelled on to age's steepy night,
And all those beauties whereof now he's king
Are vanishing, or vanished out of sight,
Stealing away the treasure of his spring:  
For such a time do I now fortify
Against confounding age's cruel knife,
That he shall never cut from memory
My sweet love's beauty, though my lover's life.
His beauty shall in these black lines be seen,
And they shall live, and he in them still green
--
 salving thy amiss,
Excusing thy sins more than thy sins are:
For to thy sensual fault I bring in sense,
Thy adverse party is thy advocate,
And 'gainst my self a lawful plea commence:
Such civil war is in my love and hate,
That I an accessary needs must be,
To that sweet thief which sourly robs from me.
  
Let me confess that we two must be twain,
Although our undivided loves are one:
So shall those bl
--
 am, and they that level
At my abuses, reckon up their own,
I may be straight though they themselves be bevel;
By their rank thoughts, my deeds must not be shown
Unless this general evil they maintain,
All men are bad and in their badness reign.



Thy gift, thy tables, are within my brain
Full charactered with lasting memory,
Which shall above that idle rank remain  
Beyond all date even to eternity.
Or at the least, so long as brain and heart
Have faculty by
--
 should expiate.  
For all that beauty that doth cover thee,
Is but the seemly raiment of my heart,
Which in thy breast doth live, as thine in me,
How can I then be elder than thou art?
O therefore love be of thyself so wary,
As I not for my self, but for thee will,
Bearing thy heart which I will keep so chary
As tender nurse her babe from faring ill.
Presume not on thy heart when mine is slain,
Thou gav'st
--
zings have I felt, what dark days seen!
What old December's bareness everywhere!  
And yet this time removed was summer's time,
The teeming autumn big with rich increase,
Bearing the wanton burden of the prime,
Like widowed wombs after their lords' decease:
Yet this abundant issue seemed to me
But hope of orphans, and unfathered fruit,
For summer and his pleasures wait on thee,
And thou away, the very birds are mute.
Or if they sing,
--
 my mind being crowned with you
Drink up the monarch's plague this flattery?
Or whether shall I say mine eye saith true,
And that your love taught it this alchemy?
To make of monsters, and things indigest,
Such cherubins as your sweet self resemble,
Creating every bad a perfect best
As fast as objects to his beams assemble:
O 'tis the first, 'tis flattery in my seeing,
And my great mind most kingly drinks it up,
Mine eye well knows what
--
Or else receiv'st with pleasure thine annoy?
If the true concord of well-tuned sounds,
By unions married do offend thine ear,
They do but sweetly chide thee, who confounds
In singleness the parts that thou shouldst bear:  
Mark how one string sweet husband to another,
Strikes each in each by mutual ordering;
Resembling sire, and child, and happy mother,
Who all in one, one pleasing note do sing:
Whose speechless song being many, seeming one
--

---
samples after optimization:
---
[8 samples omitted: each consists entirely of newline tokens]

---
main: opt->params.adam.sched 0.97926
Example 7, opt iter 208
error_before_opt: 38.377094
error_after_opt:  35.985100
used_mem_before_opt: 842581232 bytes
used_mem_after_opt:  843349840 bytes
Generating 1024 tokens.
ise of ladies dead, and lovely knights,
Then in the blazon of sweet beauty's best,
Of hand, of foot, of lip, of eye, of brow,
I see their antique pen would have expressed,
Even such a beauty as you master now.
So all their praises are but prophecies
Of this our time, all you prefiguring,
And for they looked but with divining eyes,
They had not skill enough your worth to sing:
For we---
[1024 generated tokens omitted: almost entirely newline and "," tokens, with occasional fragments such as " I", " my", " thy"]
$ make -j && ./bin/main -m ggml-checkpoint-f32.bin -p "I believe the meaning of life is" -c 2048 --ignore-eos -s 3 -n 64 -ngl 0
[  2%] Generating build details from Git
-- Found Git: /opt/homebrew/bin/git (found version "2.39.0") 
[  5%] Built target ggml
[  5%] Built target BUILD_INFO
[ 11%] Built target llama
[ 28%] Built target test-quantize-fns
[ 31%] Built target test-quantize-perf
[ 31%] Built target quantize
[ 45%] Built target quantize-stats
[ 45%] Built target test-sampling
[ 45%] Built target test-tokenizer-0
[ 48%] Built target common
[ 54%] Built target baby-llama
[ 60%] Built target vdot
[ 65%] Built target train-text-from-scratch
[ 77%] Built target benchmark
[ 77%] Built target q8dot
[ 88%] Built target embedding
[ 88%] Built target save-load-state
[ 94%] Built target main
[100%] Built target perplexity
warning: not compiled with GPU offload support, --n-gpu-layers option will be ignored
warning: see main README.md for information on enabling GPU BLAS support
main: build = 683 (6b7487d)
main: seed  = 3
llama.cpp: loading model from ggml-checkpoint-f32.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 256
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 8
llama_model_load_internal: n_layer    = 16
llama_model_load_internal: n_rot      = 32
llama_model_load_internal: ftype      = 0 (all F32)
llama_model_load_internal: n_ff       = 768
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.04 MB
llama_model_load_internal: mem required  = 1906.57 MB (+ 1026.00 MB per state)
.
llama_init_from_file: kv self size  =   32.00 MB

system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 512, n_predict = 64, n_keep = 0


 I believe the meaning of life is
 shall, in my

,
 that
 my
, and this: th. be,., thy of  , not the

,And, my, that the
,. be me
 and thying love,
 of
st the my, with my a,y I':
llama_print_timings:        load time =    21.47 ms
llama_print_timings:      sample time =    46.00 ms /    64 runs   (    0.72 ms per token)
llama_print_timings: prompt eval time =    11.75 ms /     8 tokens (    1.47 ms per token)
llama_print_timings:        eval time =   194.44 ms /    63 runs   (    3.09 ms per token)
llama_print_timings:       total time =   268.70 ms

ggerganov avatar Jun 11 '23 08:06 ggerganov

Sorry for the noob question, but how do I build this?

Entretoize avatar Jun 11 '23 13:06 Entretoize

Running with thread sanitizer enabled, there seems to be a data race in ggml_compute_forward_flash_attn_back():

Ohhh.. thanks! That sounds interesting, will definitely look into that.

xaedes avatar Jun 11 '23 14:06 xaedes

@ggerganov

Pushed a fix for the threaded index calculation; I think this probably was it.

Errors over time [without scratch, with flash] and [without scratch, without flash] are now identical:

Flash

Example 7, opt iter 8
error_before_opt: 62.218380
error_after_opt:  62.212128

No Flash

Example 7, opt iter 8
error_before_opt: 62.218380
error_after_opt:  62.212128

But with scratch buffers the errors go down more slowly:

Example 7, opt iter 8
error_before_opt: 62.231804
error_after_opt:  62.228424

Will have to take a closer look at that as well.
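For context, the threaded index calculation refers to how ggml-style compute kernels divide rows among threads; if the per-thread ranges overlap, two threads write the same rows and the thread sanitizer reports a race. A minimal sketch of the usual non-overlapping partition (illustrative only, not the actual patch; the nr/ith/nth names follow ggml's convention):

```c
#include <assert.h>

// Split nr rows into nth contiguous, non-overlapping chunks.
// Thread ith processes rows [*ir0, *ir1). If the per-thread index
// math is wrong, ranges overlap and threads write the same rows.
void thread_row_range(int nr, int nth, int ith, int *ir0, int *ir1) {
    int dr = (nr + nth - 1) / nth;    // rows per thread, rounded up
    *ir0 = dr * ith;
    *ir1 = *ir0 + dr;
    if (*ir1 > nr) *ir1 = nr;         // last thread may get fewer rows
    if (*ir0 > nr) *ir0 = nr;         // threads past the end get an empty range
}
```

Each thread derives its own [ir0, ir1) from its index, so no locking is needed as long as the ranges are disjoint and together cover all rows.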

xaedes avatar Jun 11 '23 14:06 xaedes

What is a sequence of commands that I can try using to create a model? Currently, doing the steps that I showed in the previous comment results in non-coherent generation (i.e. lots of "," and spaces), even though the error keeps going down after every train-text-from-scratch run. Do I keep running it until it becomes better?

ggerganov avatar Jun 11 '23 15:06 ggerganov

What is a sequence of commands that I can try using to create a model?

I think the training with scratch buffers might contain some bug; I'm currently searching for it. When training without scratch buffers, it gives reasonable output.

Here is a verbose command specifying most settings (I used the shakespeare.txt from above):

train-text-from-scratch --vocab-model ggml-model-with-vocab.bin --ctx 64 --embd 256 --head 8 --layer 16 --checkpoint-in chk-shakespeare-256x16.bin --checkpoint-out chk-shakespeare-256x16.bin --model-out ggml-shakespeare-256x16-f32.bin --train-data "shakespeare.txt" -t 6 -b 16 -n 32 --seed 1 --adam-iter 16 --print-details-interval 0 --predict 16 --no-scratch --use-flash

ggml-model-with-vocab.bin is some ggml model containing the vocabulary to use.

main -m ggml-shakespeare-256x16-f32.bin

Output of main:

And no face in their w heart:
Nor that I I my self thou heart,
O thy worth is thee not all love the to love,
Thy with my heart with loss from me of such them do shall me, and a them not not.
And precious, that love.
In such trust foring thy looks so, and thy self in thee to thy self will,
And disoth physance,,
Dt face.

If the adj receive thee, that I them for thyain,
Who not my love of thy skill,
In be more still more shall all so,
For,
But the best then so, and of my pen.
And is in my love to thee,
O on I have trust thision to keep in the world
As uncious, and will I me,
Bere,
And will all my self I those herion'ed that those receives sorrow,
And-ust for thy self in himce.
Myill

xaedes avatar Jun 11 '23 16:06 xaedes

Just curious, would it really work out to fine-tune llama with approx. 100 MB of text on an Apple M2 Max in, say, one or two weeks? Will this benefit from the recent Metal additions?

daboe01 avatar Jun 11 '23 19:06 daboe01

Thanks @xaedes - the results look much better

I think the training with scratch buffers might contain some bug, currently searching for it.

Yeah, the scratch buffers are super difficult to use and debug. Need to come up with a better, more automated approach in the future. Anyway, let's see if we can find the issue and then proceed with merging.

ggerganov avatar Jun 11 '23 20:06 ggerganov

Yeah, the scratch buffers are super difficult to use and debug. Need to come up with a better, more automated approach in the future. Anyway, let's see if we can find the issue and then proceed with merging.

Resolved the scratch buffer issues; it now gives the same error values as the other variants.

more automated approach

This would really be helpful. Something like:

  • first create the tensors in a noalloc=true context
  • last usage of each tensor determines its required lifetime
  • do the magic to assign memory; I think it is related to register allocation, "the process of assigning local automatic variables and expression results to a limited number of processor registers."
  • recreate the tensors in the actual context with the final data pointers

From the Wikipedia page on register allocation: Graph-coloring allocation is the predominant approach to solve register allocation. In this approach, nodes in the graph represent live ranges (variables, temporaries, virtual/symbolic registers) that are candidates for register allocation. Edges connect live ranges that interfere, i.e., live ranges that are simultaneously live at at least one program point. Register allocation then reduces to the graph coloring problem in which colors (registers) are assigned to the nodes such that two nodes connected by an edge do not receive the same color.

Another thought: The tensors are in post-order, so they are topologically sorted. But there is actually more than one valid topological order. Calling ggml_visit_parents in a different order produces other topological orders, and the required lifetime of each tensor differs between them. Reordering to shorten these lifetimes would allow more memory reuse.
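The interference idea above can be sketched as a greedy interval allocator: two tensors may share memory exactly when their live ranges do not overlap. This toy planner is purely illustrative (the tensor_life struct and plan_offsets function are made up for this sketch; ggml's scratch buffers do not work this way):

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int first_use;  // index of the op that creates the tensor
    int last_use;   // index of the last op that reads it
    size_t size;    // bytes needed
    size_t offset;  // assigned by the planner
} tensor_life;

// Greedy planner: walk tensors in creation order and place each one at
// the lowest offset that does not overlap any interfering tensor.
// Two tensors interfere iff their [first_use, last_use] ranges overlap.
size_t plan_offsets(tensor_life *t, int n) {
    size_t peak = 0;
    for (int i = 0; i < n; i++) {
        size_t off = 0;
        for (int changed = 1; changed; ) {
            changed = 0;
            for (int j = 0; j < i; j++) {
                int live_together =
                    t[i].first_use <= t[j].last_use &&
                    t[j].first_use <= t[i].last_use;
                int mem_overlap =
                    off < t[j].offset + t[j].size &&
                    t[j].offset < off + t[i].size;
                if (live_together && mem_overlap) {
                    off = t[j].offset + t[j].size; // bump past j and rescan
                    changed = 1;
                }
            }
        }
        t[i].offset = off;
        if (off + t[i].size > peak) peak = off + t[i].size;
    }
    return peak; // total buffer size needed
}
```

A tensor whose last use precedes another tensor's first use gets the same offset, so the peak can be much smaller than the sum of all tensor sizes.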

xaedes avatar Jun 11 '23 22:06 xaedes

Hey, thanks for this amazing PR! I was trying to use it with Japanese training data out of curiosity, and I noticed the examples sometimes contain � characters (which aren't in the training data):

Example 0 during Training

[screenshot]

Output of the trained model

[screenshot]

Igoorx avatar Jun 11 '23 23:06 Igoorx

I noticed the examples sometimes contain � characters (which aren't in the training data)

Interesting. Tokens that are not in the training data can be generated, but they really should not show up in the printed examples. Could be a string encoding issue. I just read the bytes of the training data file and then use the llama tokenizer. Judging by the comment // split string into utf8 chars, it seems to work on utf-8 encoded strings.

Try to save the training file with utf-8 encoding; maybe that already resolves it?

If not, could you give me some training data that results in these characters, so I can reproduce it locally?
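For reference, splitting on lead bytes as that // split string into utf8 chars comment suggests usually looks like the following (a generic sketch, not the actual example code):

```c
#include <assert.h>

// Length in bytes of the UTF-8 character starting at lead byte c.
// Returns 1 for continuation/invalid lead bytes so a caller can skip them.
int utf8_len(unsigned char c) {
    if (c < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((c & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((c & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((c & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 1;                          // continuation or invalid byte
}

// Count how many UTF-8 characters a NUL-terminated string contains.
int utf8_count(const char *s) {
    int n = 0;
    while (*s) {
        s += utf8_len((unsigned char)*s);
        n++;
    }
    return n;
}
```

If the file is in a non-UTF-8 encoding such as Shift-JIS, this lead-byte logic misgroups bytes, which is why re-saving as UTF-8 is the first thing to try.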

xaedes avatar Jun 12 '23 00:06 xaedes

Try to save the training file with utf8 encoding, maybe that already resolves it?

It is already in utf-8; I also tried saving it with a BOM, but that didn't solve the issue.

Here is the training data: dataset.txt

Igoorx avatar Jun 12 '23 00:06 Igoorx

I noticed the examples sometimes contain � characters (which aren't in the training data)

I think the model pieced together an invalid multi-byte (multi-token) UTF-8 character. Just train longer, and it should happen less and less often. :)
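To make this failure mode concrete: each byte token can be valid on its own, yet the concatenation only renders correctly if every lead byte is followed by the continuation bytes it promises; otherwise the terminal shows U+FFFD (�). A small well-formedness check illustrating the rule (an illustration, not llama.cpp code):

```c
#include <assert.h>

// Return 1 if the buffer is well-formed UTF-8, 0 otherwise.
// A model emitting byte tokens independently can easily produce a
// lead byte without the continuation bytes it promises.
int utf8_valid(const unsigned char *s, int n) {
    int i = 0;
    while (i < n) {
        int len;
        if (s[i] < 0x80) len = 1;
        else if ((s[i] & 0xE0) == 0xC0) len = 2;
        else if ((s[i] & 0xF0) == 0xE0) len = 3;
        else if ((s[i] & 0xF8) == 0xF0) len = 4;
        else return 0;                       // stray continuation byte
        if (i + len > n) return 0;           // truncated character
        for (int k = 1; k < len; k++)
            if ((s[i + k] & 0xC0) != 0x80)   // must be 10xxxxxx
                return 0;
        i += len;
    }
    return 1;
}
```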

Green-Sky avatar Jun 12 '23 01:06 Green-Sky

I think the model pieced together an invalid multi-byte (multi-token) UTF-8 character. Just train longer, and it should happen less and less often. :)

That doesn't seem to be the (only?) reason, because the examples aren't generated by the model; they are taken from the training data.

Igoorx avatar Jun 12 '23 01:06 Igoorx