code2prompt
Add firstLines handlebars helper
This PR adds a custom Handlebars helper called firstLines, which works like the Unix head command. It is intended for directories that contain .csv or .jsonl files: the LLM only needs the first few lines to understand the shape of the data, and the remaining lines fill up the context for no benefit.
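For reference, the head-like truncation can be sketched as a plain Rust function. This is only an illustration of the behavior, not the code in this PR: the name `first_lines` is hypothetical, and registering it as a Handlebars helper is not shown.

```rust
/// Return at most the first `n` lines of `text`, similar to `head -n`.
/// Hypothetical sketch: the helper added by this PR may differ in details.
fn first_lines(text: &str, n: usize) -> String {
    // `lines()` splits on "\n" and also strips a trailing "\r",
    // so Windows line endings are normalized to "\n" in the result.
    text.lines().take(n).collect::<Vec<_>>().join("\n")
}
```

Inputs shorter than `n` lines are returned whole, and an empty input yields an empty string.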
This is best explained by example:
Example 1: Base Case
This shows the folder structure that I'm using for testing with a basic handlebars template.
template.hbs
Project Path: {{ absolute_code_path }}
Source Tree:
{{ source_tree }}
{{#each files}}
{{#if code}}
{{path}}:
{{code}}
{{/if}}
{{/each}}
output
Project Path: test_dir
Source Tree:
test_dir
├── code.js
└── data.csv
test_dir/code.js:
function main() {
console.log('hello world');
}
main();
test_dir/data.csv:
id,color
1,red
2,green
3,blue
1,red
2,green
3,blue
1,red
2,green
3,blue
Example 2: Using firstLines
This example uses the firstLines helper by modifying the template in the following way:
- {{code}}
+ {{firstLines code 5}}
template_custom.hbs
Project Path: {{ absolute_code_path }}
Source Tree:
{{ source_tree }}
{{#each files}}
{{#if code}}
{{path}}:
{{firstLines code 5}}
{{/if}}
{{/each}}
output
Project Path: test_dir
Source Tree:
test_dir
├── code.js
└── data.csv
test_dir/code.js:
```js
function main() {
console.log('hello world');
}
test_dir/data.csv:
id,color
1,red
2,green
3,blue
Next Steps / Questions
I would like to find a way to apply the firstLines helper selectively to data files such as .csv and .jsonl. Ideally, you could also exclude specific files whose full contents you do want in the context. Any thoughts on how to approach this?
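As a starting point for discussion, one possible direction is to decide by file extension, with an explicit list of paths that are always kept whole. The sketch below is an assumption about how this could look, not something this PR implements; `DATA_EXTENSIONS`, `is_data_file`, `body_for_prompt`, and `keep_full` are all hypothetical names.

```rust
use std::path::Path;

/// Extensions treated as "data" files (assumed list, for illustration only).
const DATA_EXTENSIONS: [&str; 2] = ["csv", "jsonl"];

/// True if the path's extension is one of the data extensions (case-insensitive).
fn is_data_file(path: &str) -> bool {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| DATA_EXTENSIONS.iter().any(|d| ext.eq_ignore_ascii_case(d)))
        .unwrap_or(false)
}

/// Truncate data files to their first `n` lines unless the path is listed
/// in `keep_full`; every other file is returned unchanged.
fn body_for_prompt(path: &str, code: &str, n: usize, keep_full: &[&str]) -> String {
    let excluded = keep_full.iter().any(|p| *p == path);
    if is_data_file(path) && !excluded {
        code.lines().take(n).collect::<Vec<_>>().join("\n")
    } else {
        code.to_string()
    }
}
```

With this shape, test_dir/data.csv would be truncated by default, while passing its path in `keep_full` would keep the whole file in the context.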
I think it would make sense to update website/src/content/docs/docs/tutorials/learn_templates.mdx as part of this PR. Does that sound OK? I was thinking of adding a template example geared toward data analysis to demonstrate how this is useful.
Disclaimer: this is my first experience with "writing" Rust (I had Claude write the code and verified functionality) so please let me know if there are any issues.
@ODAncona for review
Hi @kym6464,
Thank you for contributing to code2prompt, and I apologize for the delay in answering. This is a really interesting feature proposal!
Each file type could use a dedicated parser:
.json => only the first item, to show the structure without irrelevant data
.csv => header and first line
.ipynb => only code and Markdown content, not cell output
etc.
I liked the idea when I first read about it. However, this code is more of a workaround. We should implement a modular system in which anyone can develop a parser for their own file extension.
I feel it would greatly improve code2prompt's effectiveness and usefulness. I recently improved the `code2prompt_core` crate for the TUI. The new per-extension parser logic should be implemented in the core.
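To make the idea more concrete, here is a rough sketch of what such a modular design could look like. Nothing here exists in `code2prompt_core` today: the `ContentReducer` trait, `CsvReducer`, and `ReducerRegistry` are hypothetical names, and the CSV rule (header plus first row) is just the example from the list above.

```rust
use std::collections::HashMap;

/// Hypothetical extension point: one implementation per file type.
trait ContentReducer {
    /// Return a shortened view of `content` that still shows its structure.
    fn reduce(&self, content: &str) -> String;
}

/// CSV: keep only the header and the first data row.
struct CsvReducer;

impl ContentReducer for CsvReducer {
    fn reduce(&self, content: &str) -> String {
        content.lines().take(2).collect::<Vec<_>>().join("\n")
    }
}

/// Maps a lowercase file extension to the reducer registered for it.
struct ReducerRegistry {
    reducers: HashMap<String, Box<dyn ContentReducer>>,
}

impl ReducerRegistry {
    fn new() -> Self {
        Self { reducers: HashMap::new() }
    }

    /// Register (or replace) the reducer used for `extension`.
    fn register(&mut self, extension: &str, reducer: Box<dyn ContentReducer>) {
        self.reducers.insert(extension.to_ascii_lowercase(), reducer);
    }

    /// Reduce `content` if a reducer is registered for `extension`;
    /// otherwise return the content unchanged.
    fn apply(&self, extension: &str, content: &str) -> String {
        match self.reducers.get(&extension.to_ascii_lowercase()) {
            Some(reducer) => reducer.reduce(content),
            None => content.to_string(),
        }
    }
}
```

A contributor would then only need to implement the trait for a new file type and register it, without touching the rest of the pipeline.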
At this point, I'm focusing on the TUI and code2promptrc (potential merge with code2prompt.py). This feature would be an easy and nice contribution for newcomers. It's completely OK to use Claude to get started. However, it has a tendency to make a mess of the codebase. Keep your goal in mind and iterate in small, safe steps, ideally with unit tests.
Let me know on Discord if you would like to keep working on the feature; I will be able to assist you 💪
Have a good day,
Olivier