
Is our input supposed to be duplicated in the API request?

Open · cosmicaug opened this issue 10 months ago · 1 comment

Discussed in https://github.com/danielmiessler/fabric/discussions/1244

Originally posted by cosmicaug on January 3, 2025: I recently switched to the Go fabric implementation. I am on version v1.4.128 (but I also observed the same behavior on v1.4.126).

The question is: is our input supposed to be duplicated in the API request?

For example, I tried the `compare_and_contrast` pattern with the `--dry-run` option. My input was "Ancient Egypt and Ancient Mesopotamia".

The output for the dry run was:

Dry run: Would send the following request:

System:

# IDENTITY and PURPOSE

Please be brief. Compare and contrast the list of items.

# STEPS

Compare and contrast the list of items

# OUTPUT INSTRUCTIONS

Please put it into a markdown table. 
Items along the left and topics along the top.

# INPUT:

INPUT:
Ancient Egypt and Ancient Mesopotamia

User:
Ancient Egypt and Ancient Mesopotamia

Options:
Model: claude-3-5-haiku-latest
Temperature: 0.700000
TopP: 0.900000
PresencePenalty: 0.000000
FrequencyPenalty: 0.000000

The formatting doesn't make it clear, but it looks like the input text is being sent both as the last part of `system.md` (which is what I would have assumed) and as the entire content of `user.md`.

Indeed, I found a tool called HTTP Toolkit, and it verified that this is exactly what is happening. The payload sent to the REST endpoint during a real run (in this case, to the Anthropic API endpoint) is as follows:

{
  "max_tokens": 4096,
  "messages": [
    {
      "content": [
        {
          "text": "Hi",
          "type": "text"
        }
      ],
      "role": "user"
    },
    {
      "content": [
        {
          "text": "# IDENTITY and PURPOSE\n\nPlease be brief. Compare and contrast the list of items.\n\n# STEPS\n\nCompare and contrast the list of items\n\n# OUTPUT INSTRUCTIONS\nPlease put it into a markdown table. \nItems along the left and topics along the top.\n\n# INPUT:\n\nINPUT:\nAncient Egypt and Ancient Mesopotamia",
          "type": "text"
        }
      ],
      "role": "assistant"
    },
    {
      "content": [
        {
          "text": "Ancient Egypt and Ancient Mesopotamia",
          "type": "text"
        }
      ],
      "role": "user"
    }
  ],
  "model": "claude-3-5-haiku-latest",
  "temperature": 0.7,
  "top_p": 0.9,
  "stream": true
}
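The duplication is easy to check mechanically. The sketch below uses an abridged copy of the captured payload above and counts which messages contain the input string; the abridgment is mine, not part of the original capture.

```python
import json

# Abridged copy of the captured Anthropic payload shown above.
payload = json.loads("""
{
  "messages": [
    {"role": "user", "content": [{"text": "Hi", "type": "text"}]},
    {"role": "assistant", "content": [{"text": "# IDENTITY and PURPOSE\\n...\\nINPUT:\\nAncient Egypt and Ancient Mesopotamia", "type": "text"}]},
    {"role": "user", "content": [{"text": "Ancient Egypt and Ancient Mesopotamia", "type": "text"}]}
  ]
}
""")

user_input = "Ancient Egypt and Ancient Mesopotamia"

# Collect the role of every message whose text contains the input.
hits = [m["role"] for m in payload["messages"]
        if any(user_input in part["text"] for part in m["content"])]

print(hits)  # → ['assistant', 'user'] — the input appears in two messages
```

The same check against a live capture (e.g. from HTTP Toolkit) would show whether a given fabric build still duplicates the input.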

Is this what is supposed to be happening here?

With my dumb example it probably doesn't matter much, but with longer input, isn't this going to unnecessarily cut your effective input in half by running into the context-window limit prematurely?

cosmicaug avatar Jan 03 '25 16:01 cosmicaug

I've observed this also, using fabric v1.4.130. With longer content (extracting wisdom from articles, YouTube transcripts, etc.) it becomes highly problematic. On my local models, the doubled input washes out the context window, losing the prompt and therefore the formatted responses; with OpenAI's models it sends twice the tokens, doubling API charges. This makes both local and paid model types problematic. OpenAI's cost/usage report, paired with tiktoken to estimate input tokens, confirms the issue quite clearly: tiktoken's input token count × 2, plus the system prompt tokens, comes very close to the actual input tokens reported, so this re-confirms that the bug is not just a --dry-run issue.
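The cost arithmetic described above can be sketched as follows. This uses a crude ~4-characters-per-token heuristic in place of a real tokenizer like tiktoken, and the prompt and article strings are placeholders rather than real fabric data.

```python
# Crude token estimate (~4 chars/token); a real count would use tiktoken.
def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)

system_prompt = "# IDENTITY and PURPOSE ..."  # stand-in for a pattern's system.md
article = "word " * 2000                      # stand-in for a longer input document

# What should be billed: system prompt + input, once.
expected = rough_tokens(system_prompt) + rough_tokens(article)

# What the bug bills: system prompt + the same input sent twice.
actual = rough_tokens(system_prompt) + 2 * rough_tokens(article)

print(expected, actual)  # the doubled input roughly doubles billed input tokens
```

The gap between the two numbers is exactly one extra copy of the input, which is why the effect is negligible on a short prompt but dominates on long articles.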

Here's a sample:

$ echo "fix me I'm redundant and repeat myself!" | fabric -p ai -m "llama_8b:latest" --dry-run
Dry run: Would send the following request:
System:
# IDENTITY and PURPOSE

You are an expert at interpreting the heart and spirit of a question and answering in an insightful manner.

# STEPS

- Deeply understand what's being asked.

- Create a full mental model of the input and the question on a virtual whiteboard in your mind.

- Answer the question in 3-5 Markdown bullets of 10 words each.

# OUTPUT INSTRUCTIONS

- Only output Markdown bullets.

- Do not output warnings or notes—just the requested sections.

# INPUT

INPUT:
fix me I'm redundant and repeat myself!

User:
fix me I'm redundant and repeat myself!

Options:
Model: llama_8b:latest
Temperature: 0.700000
TopP: 0.900000
PresencePenalty: 0.000000
FrequencyPenalty: 0.000000
empty response

Here's a patch that appears to fix the issue (I can't seem to submit a PR):

diff --git a/core/chatter.go b/core/chatter.go
index d56746f..8459a62 100644
--- a/core/chatter.go
+++ b/core/chatter.go
@@ -161,7 +161,7 @@ func (o *Chatter) BuildSession(request *common.ChatRequest, raw bool) (session *
 		}
 	}
 
-	if request.Message != nil {
+	if request.Message != nil && request.PatternName == "" {
 		session.Append(request.Message)
 	}

adifinem avatar Jan 20 '25 15:01 adifinem

Duplicate of #1203

andjo avatar May 18 '25 18:05 andjo

I cannot reproduce this, so it must have been fixed somewhere along the way. I think we can close this, @eugeis and @mattjoyce.

ksylvan avatar May 19 '25 13:05 ksylvan

Still can reproduce:

$ echo "Ancient Egypt and Ancient Mesopotamia" |  compare_and_contrast --dry-run
Dry run: Would send the following request:

System:
# IDENTITY and PURPOSE

Please be brief. Compare and contrast the list of items.

# STEPS

Compare and contrast the list of items

# OUTPUT INSTRUCTIONS
Please put it into a markdown table. 
Items along the left and topics along the top.

# INPUT:

INPUT:
Ancient Egypt and Ancient Mesopotamia

User:
Ancient Egypt and Ancient Mesopotamia

Options:
Model: gpt-4o
Temperature: 0.700000
TopP: 0.900000
PresencePenalty: 0.000000
FrequencyPenalty: 0.000000

cosmicaug avatar May 19 '25 14:05 cosmicaug

> Duplicate of #1203

Close this, then, on account of its being a duplicate?

By which I mean: not on account of being unreproducible, as it remains reproducible.

cosmicaug avatar May 19 '25 14:05 cosmicaug

Thanks @cosmicaug - I'll take a look again sometime today.

ksylvan avatar May 19 '25 15:05 ksylvan

Try this: https://github.com/danielmiessler/fabric/pull/1474

ksylvan avatar May 19 '25 19:05 ksylvan