llm-ui
Indented code block is not working properly
Hello, thank you for the great library!
I'm not sure how to set up the library so that it correctly identifies code blocks whose fences are indented, as in this example:
### Sample Code Review
1. **Variable Naming:**
- **Observation:** The variable `total` is used to store the sum of the prices and their tax. While it is functional, the name could be more descriptive.
- **Suggestion:** Consider renaming `total` to `total_price` to make the variable name more explicit and aligned with its purpose.
   **Updated Code:**
   ```python
   total_price = 0
   ```
Is there a particular reason why the regexes in findCompleteCodeBlock and codeBlockLookBack do not allow whitespace before the ``` fence?
For example, would it be useful to change the regex in findCompleteCodeBlock to:
const regex = new RegExp(
  `${startEndGroup}.*\n([\\s\\S]*?)\n\\s*${startEndGroup}`,
);
and, for codeBlockLookBack, to change the parse functions along these lines:
export const parseCompleteMarkdownCodeBlock: ParseFunction = (
  codeBlock,
  userOptions,
) => {
  const options = getOptions(userOptions);
  const startEndGroup = getStartEndGroup(options.startEndChars);
  return parseMarkdownCodeBlock(codeBlock, startEndGroup, `\n\\s*${startEndGroup}`);
};

export const parsePartialMarkdownCodeBlock: ParseFunction = (
  codeBlock,
  userOptions,
) => {
  const options = getOptions(userOptions);
  const startGroup = getStartEndGroup(options.startEndChars);
  const endGroup = `(\n\\s*${options.startEndChars
    .map((char) => `${char}{0,2}$`)
    .join("|")}|$)`;
  return parseMarkdownCodeBlock(codeBlock, startGroup, endGroup);
};
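To make the effect of the extra `\\s*` concrete, here is a minimal sketch that exercises the two regex shapes outside the library. The `startEndChars` default and the `startEndGroup` string are assumptions standing in for llm-ui's `getOptions` and `getStartEndGroup`, whose real output may differ, and the `strict` pattern is only an assumed shape for a regex without the tolerance, not a copy of the library's current one.

```typescript
// Build the fence at runtime so this snippet contains no literal triple backticks.
const fence = "`".repeat(3);

// Assumptions: stand-ins for llm-ui's options and getStartEndGroup helper.
const startEndChars = ["`", "~"];
const startEndGroup = `(?:${startEndChars.map((c) => `${c}{3}`).join("|")})`;

// Assumed strict shape: the closing fence must directly follow the newline.
const strict = new RegExp(`${startEndGroup}.*\n([\\s\\S]*?)\n${startEndGroup}`);
// Proposed shape: whitespace is allowed before the closing fence.
const tolerant = new RegExp(
  `${startEndGroup}.*\n([\\s\\S]*?)\n\\s*${startEndGroup}`,
);

// A code block whose fences are indented by three spaces.
const indented = [
  `   ${fence}python`,
  "   total_price = 0",
  `   ${fence}`,
].join("\n");

const strictBody = strict.exec(indented)?.[1] ?? null; // null: closing fence not found
const tolerantBody = tolerant.exec(indented)?.[1] ?? null; // "   total_price = 0"

// The partial end group from the proposal, expanded for the assumed chars.
const endGroup = `(\n\\s*${startEndChars
  .map((char) => `${char}{0,2}$`)
  .join("|")}|$)`;

// An in-progress block whose indented closing fence is only partly streamed.
const partialTail =
  new RegExp(endGroup).exec("total_price = 0\n   ``")?.[0] ?? null; // "\n   ``"
```

One thing the expansion shows: with these assumed characters, `endGroup` becomes three alternatives, and the `\n\\s*` prefix belongs only to the first (the backtick one). Whether that matters depends on how `parseMarkdownCodeBlock` consumes the group, which this sketch does not cover.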
Or am I missing something, and would this change break something else? GitHub appears to have the same issue with indented code blocks, so I assume there is a reason for it. Thank you!
According to the CommonMark spec, fenced code blocks can be indented up to three spaces.
Currently, llm-ui’s regex requires code fences to start at column 0, which means valid cases like this aren’t recognized:
print("hello")
Three spaces before the backticks are valid CommonMark.
A possible spec-compliant fix would be to adjust the regex in findCompleteCodeBlock and codeBlockLookBack from requiring the fence at the start of the line (^```) to allowing up to three leading spaces ( {0,3}) before the fence. For example:
const regex = new RegExp(
  ` {0,3}${startEndGroup}.*\n([\\s\\S]*?)\n {0,3}${startEndGroup}`,
);
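As a quick check of that pattern, the sketch below runs it against fences at different indentation levels. The `startEndGroup` value is an assumed stand-in for llm-ui's helper, and the `anchored` variant (a leading `^` with the `m` flag) is a hypothetical addition for comparison, not part of the proposal. Without a line anchor, the opening ` {0,3}` can start matching partway through a longer run of spaces, so a fence opened at four spaces still matches, even though CommonMark treats four spaces as an indented code block.

```typescript
// Build the fence at runtime so this snippet contains no literal triple backticks.
const fence = "`".repeat(3);

// Assumption: stand-in for the group llm-ui's getStartEndGroup would produce.
const startEndGroup = "(?:`{3}|~{3})";

// The proposed pattern, as written in the comment above.
const unanchored = new RegExp(
  ` {0,3}${startEndGroup}.*\n([\\s\\S]*?)\n {0,3}${startEndGroup}`,
);

// Hypothetical variant: anchor the opening fence to the start of a line.
const anchored = new RegExp(
  `^ {0,3}${startEndGroup}.*\n([\\s\\S]*?)\n {0,3}${startEndGroup}`,
  "m",
);

// Build a fenced block with the given indentation on each fence line.
const block = (openIndent: number, closeIndent: number): string =>
  [
    `${" ".repeat(openIndent)}${fence}python`,
    'print("hello")',
    `${" ".repeat(closeIndent)}${fence}`,
  ].join("\n");

const cases: Array<[number, number]> = [
  [0, 0], // fences at column 0
  [3, 3], // three spaces: a valid CommonMark fence
  [4, 0], // four spaces on the opener: not a fence in CommonMark
];

const results = cases.map(([open, close]) => ({
  open,
  close,
  unanchored: unanchored.test(block(open, close)),
  anchored: anchored.test(block(open, close)),
}));
```

Both patterns accept the first two cases. Only the anchored variant rejects the third, which may or may not matter for streamed LLM output in practice.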
Would you be open to a PR with this change?