
Regression in 0.1.15 Causes Incorrect Token Slicing in 'Role' Blocks

FoxBuchele opened this issue 9 months ago • 3 comments

The bug

A regression was introduced in version 0.1.15 of the guidance library. Within role blocks, responses stored via capture are missing tokens at the beginning and include extra chat-template tokens at the end; the number of missing leading tokens matches the number of extra trailing tokens, which points to the captured output being sliced at the wrong offset. The issue does not occur in version 0.1.14.
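To spell out the pattern: the capture behaves like a window over the emitted tokens that has been shifted forward by the length of the role-closing tokens. A minimal pure-Python sketch of that hypothesis (the token lists below are illustrative, not guidance internals):

tokens = ["This", "is", "a", "test"]    # the text passed to capture, as tokens
role_close = ["[/INST]"]                # chat-template closer appended by the role block
stream = tokens + role_close            # what the engine actually emits

k = len(role_close)
print(" ".join(stream[:len(tokens)]))   # 0.1.14 behavior: This is a test
print(" ".join(stream[k:]))             # 0.1.15 behavior: is a test [/INST]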

To Reproduce

The full working snippet below can be pasted into a notebook cell or Python file; it includes the LLM load step, so the model in use (mistral-7b-instruct-v0.2.Q4_K_M.gguf via llama.cpp) is clear.

import os

import guidance
from guidance import models, user, capture

relpath = r".\llmdata~\mistral-7b-instruct-v0.2.Q4_K_M.gguf"  # raw string so the backslashes aren't treated as escapes

fullpath = os.path.abspath(relpath)
if not os.path.isfile(fullpath):
    print(f"ERROR! The model file is not at {relpath}; tried the absolute path: {fullpath}")

# 0.1.14
# Works as expected! 
#guidance_lm = models.MistralChat(fullpath, n_gpu_layers=-1, temperature=0.7, max_tokens=8194, n_batch=8194, top_p=0.95, n_ctx=8194, verbose=True, echo=False)

# 0.1.15
# Broken output (includes unwanted stop tokens, missing tokens at the beginning equal to number of extra stop tokens at end)
guidance_lm = models.LlamaCpp(fullpath, n_gpu_layers=-1, temperature=0.7, max_tokens=8194, n_batch=8194, top_p=0.95, n_ctx=8194, verbose=True, echo=False)
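
# Sanity check: confirm which guidance version is actually loaded
# (the regression reproduces on 0.1.15 but not on 0.1.14).
print("guidance version:", guidance.__version__)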


def tokenizer_issue():
    TokenizerTest = guidance_lm + capture("""This is a test of something we're going to move from place to place.""", "remember")
    alt_string = TokenizerTest["remember"]
    with user():
        test_lm = guidance_lm + capture(TokenizerTest["remember"], "output")
        test_two = guidance_lm + capture(alt_string, "output_two")
        # Also occurs when NOT capturing LM state...
        test_three = guidance_lm + capture("This is an example to show that this issue only occurs within the role blocks...", "output_three")

    confirm_lm = guidance_lm + capture(TokenizerTest["remember"], "finale")

    # Prints normally
    print("Works at start:")
    print(TokenizerTest["remember"])
    # Missing beginning, extra output characters that shouldn't be included at the end
    print("First error:")
    print(test_lm["output"])
    print("Second error:")
    print(test_two["output_two"])
    print("Third error:")
    print(test_three["output_three"])
    # Prints normally
    print("Works:")
    print(confirm_lm["finale"])

    # 0.1.14 output:
    # Running LM Test...
    # ---------
    # Works at start:
    # This is a test of something we're going to move from place to place.
    # First error:
    # This is a test of something we're going to move from place to place.
    # Second error:
    # This is a test of something we're going to move from place to place.
    # Third error:
    # This is an example to show that this issue only occurs within the role blocks...
    # Works:
    # This is a test of something we're going to move from place to place.

    # 0.1.15 output:
    # Running LM Test...
    # ---------
    # Works at start:
    # This is a test of something we're going to move from place to place.
    # First error:
    # a test of something we're going to move from place to place. [/INST]
    # Second error:
    # a test of something we're going to move from place to place. [/INST]
    # Third error:
    # an example to show that this issue only occurs within the role blocks... [/INST]
    # Works:
    # This is a test of something we're going to move from place to place.

if __name__ == "__main__":
    print("Running LM Test... ")
    print("---------")
    tokenizer_issue()
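
Additional context: the script above distills to a single assertion. A minimal sketch (the model path is a placeholder for the same Mistral GGUF used above):

import guidance
from guidance import models, user, capture

# Placeholder path; point this at the same mistral-7b-instruct-v0.2.Q4_K_M.gguf file.
lm = models.LlamaCpp("path/to/mistral-7b-instruct-v0.2.Q4_K_M.gguf", n_ctx=2048, verbose=False)

text = "This is a test of something we're going to move from place to place."
with user():
    out = lm + capture(text, "output")

# Passes on 0.1.14; fails on 0.1.15 because the captured value is shifted.
assert out["output"] == text, f"captured {out['output']!r} instead of {text!r}"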

System info:

  • OS: Windows 11
  • Guidance Version (guidance.__version__): 0.1.15
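
For now I'm working around this by pinning guidance==0.1.14, where the captures come back intact (see the commented-out comparison in the snippet above).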

FoxBuchele • May 25, 2024