preserve indentation for code output
this will matter more once #54 is implemented, but we need to fix the text chunking code in the runner so that it preserves whitespace properly. this matters for e.g. code generation (it will be easier to see in the browser once we add markdown support and the model emits ``` fences)
cases we care about:
- split on words so user gets quick updates in each chunk
- preserve newlines
- preserve whitespace (e.g. paragraphs of prose are separated by \n\n, lines of poetry by \n, and indentation matters in code: four spaces, tabs)
we can probably make the scanner split on the first whitespace character but include it in the chunk, and not skip the next 3 (in the case that the model outputs four space characters in a row). then we stop adding a space back in, because we shouldn't be assuming the whitespace is a space character
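a rough sketch of one way to do this, not the runner's actual scanner: `splitKeepingWhitespace` is a hypothetical name, and attaching the whole trailing whitespace run to the preceding word is an assumption about the behaviour we want. the property that matters is that concatenating the chunks gives back the original text byte for byte.

```go
package main

import "unicode"

// splitKeepingWhitespace (hypothetical helper) breaks text into word-sized
// chunks without dropping any characters. Each chunk is a run of
// non-whitespace followed by the whitespace that trails it, so joining the
// chunks with no separator reproduces the input exactly.
func splitKeepingWhitespace(text string) []string {
	var chunks []string
	start := 0
	inSpace := false
	for i, r := range text {
		isSpace := unicode.IsSpace(r)
		// A chunk ends where a whitespace run gives way to the next word.
		if inSpace && !isSpace {
			chunks = append(chunks, text[start:i])
			start = i
		}
		inSpace = isSpace
	}
	// Emit whatever is left, including trailing whitespace.
	if start < len(text) {
		chunks = append(chunks, text[start:])
	}
	return chunks
}
```

with this shape the consumer only ever concatenates chunks, so newlines, tabs and four-space indents come through untouched. the same logic could probably be written as a `bufio.SplitFunc` if we want to keep using a `bufio.Scanner`.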
related changes:
- https://github.com/helixml/helix/commit/82352b43136564388edcca3390a3e8975dee8f53
- https://github.com/helixml/helix/commit/f8d630a4bd209bf18d1e89927c8cfbead04b60d8
slack thread: https://mlops-community.slack.com/archives/C0675EX9V2Q/p1705056847386509
as said above, the chunks should include all the whitespace as well as being split on the whitespace, and we should stop re-adding a hard-coded space between chunks in the api server
while we're in there, we should also strip the newline that currently appears at the start and end of every response
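for the newline cleanup, something like the following might be enough. `trimResponseNewlines` is a hypothetical name, and this assumes we have the full response text at the point where we trim. it only strips newline characters, so leading indentation on the first line survives.

```go
package main

import "strings"

// trimResponseNewlines (hypothetical helper) removes newline characters from
// the very start and end of a complete response. Spaces, tabs and any
// newlines inside the response are left alone.
func trimResponseNewlines(response string) string {
	return strings.Trim(response, "\r\n")
}
```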