Fixed bug for subsequent chunks

Open JOHW85 opened this issue 1 year ago • 1 comment

JOHW85 avatar Mar 29 '23 11:03 JOHW85

Oh yes, clearly a bug. I think the rewrite has to be larger (the if statement shouldn't add ntokens again, and there's no need for +2 for the first chunk of the batch):

    for chunk, ntoken in zip(chunks, ntokens):
        cur_tokens += ntoken + 2  # +2 for the newlines between chunks

        # if adding this chunk would exceed the max length, finalize the current batch and start a new one
        if cur_tokens > max_len:
            batches.append(cur_batch)
            cur_batch = chunk
            cur_tokens = ntoken
        else:
            cur_batch += "\n\n" + chunk
    batches.append(cur_batch)

BorisPower avatar May 19 '23 03:05 BorisPower

Fixed in https://github.com/openai/openai-cookbook/pull/579. Closing PR. Thanks for flagging!

ted-at-openai avatar Jul 12 '23 00:07 ted-at-openai

(Made a couple of other changes, including changing +2 tokens to +1, as \n\n is a single token in the GPT-3 encoding.)

ted-at-openai avatar Jul 12 '23 00:07 ted-at-openai
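
For readers following the thread, here is a minimal, self-contained sketch of the batching logic discussed above. It is not the code merged in the linked PR. The function name `batch_chunks` and the parameters `sep` and `sep_tokens` are hypothetical, introduced only for illustration. The sketch assumes the separator costs a fixed number of tokens (1 by default, per the note above about \n\n) and charges it only between chunks, never before the first chunk of a batch.

```python
from typing import List


def batch_chunks(
    chunks: List[str],
    ntokens: List[int],
    max_len: int,
    sep: str = "\n\n",
    sep_tokens: int = 1,  # assumption: the separator encodes to this many tokens
) -> List[str]:
    """Greedily join chunks into batches whose token count stays within max_len."""
    batches: List[str] = []
    cur_batch = None  # None means no chunk has been added to the current batch yet
    cur_tokens = 0

    for chunk, ntoken in zip(chunks, ntokens):
        if cur_batch is None:
            # first chunk overall: no separator cost
            cur_batch, cur_tokens = chunk, ntoken
        elif cur_tokens + sep_tokens + ntoken > max_len:
            # adding this chunk would exceed the limit: finalize and start a new batch
            batches.append(cur_batch)
            cur_batch, cur_tokens = chunk, ntoken
        else:
            # chunk fits: pay for the separator plus the chunk itself
            cur_batch += sep + chunk
            cur_tokens += sep_tokens + ntoken

    if cur_batch is not None:
        batches.append(cur_batch)
    return batches
```

A single chunk that is longer than `max_len` on its own still becomes its own batch here; splitting oversized chunks is left to the caller.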