aicommits
Quick improvement suggestion: Reduce diff size with --ignore-all-space
First, thanks a lot for building this tool!
The current limitation of 200 lines could be improved by adding the --ignore-all-space flag, which produces a more concise diff, for example when you wrap some lines in an extra if or <div>, or anything else that just shifts the indentation one way or another.
I guess the code changes will be minimal. From this:
export const getStagedDiff = async () => {
const diffCached = ['diff', '--cached'];
const { stdout: files } = await execa(
'git',
To this:
export const getStagedDiff = async () => {
const diffCached = ['diff', '--cached', '--ignore-all-space'];
const { stdout: files } = await execa(
'git',
I doubt it makes sense to parameterize this change. Am I wrong? Happy to create a PR if needed.
I thought about this, but some commits are just white-space changes (e.g. stylistic/linting).
If the diff doesn't show, GPT wouldn't be able to produce an accurate description of the change.
I wonder if we can be smart about this and detect how much of the diff is white-space changes.
If the diff is mainly white-space change, we can pass in the diff with white-space.
If not (or if the diff is close to the OpenAI size limit), we can pass in the white-space ignored diff.
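The selection rule described above could be sketched roughly as follows. This is only an illustration, not aicommits code: `chooseDiff`, `maxLength`, and the 0.5 threshold are hypothetical, and the two inputs are assumed to be the outputs of `git diff --cached` and `git diff --cached --ignore-all-space`.

```typescript
// Hypothetical helper: the name, parameters, and default threshold are assumptions.
// fullDiff:    output of `git diff --cached`
// compactDiff: output of `git diff --cached --ignore-all-space`
const chooseDiff = (
  fullDiff: string,
  compactDiff: string,
  maxLength: number,
  whitespaceRatioThreshold = 0.5,
): string => {
  // Nothing staged, nothing to choose.
  if (fullDiff.length === 0) return fullDiff;

  // Share of the full diff that disappears once white-space is ignored.
  const whitespaceRatio = 1 - compactDiff.length / fullDiff.length;

  const mostlyWhitespace = whitespaceRatio >= whitespaceRatioThreshold;
  const fitsLimit = fullDiff.length <= maxLength;

  // Keep white-space changes visible only when they dominate the commit
  // and the full diff still fits; otherwise fall back to the compact diff.
  return mostlyWhitespace && fitsLimit ? fullDiff : compactDiff;
};
```

Comparing lengths is a crude measure of "how much is white-space"; it also means a pure white-space commit that exceeds the limit would fall back to an empty compact diff, which a real implementation would need to handle.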
This produces smaller diff text:
const diffCached = ['diff', '--cached', '--ignore-all-space', '--diff-algorithm=minimal'];
It could be a config option, but in many cases detecting these changes is important, particularly for languages where indentation matters and, as above, for formatting and linting changes. How about counting the tokens with a tokenizer and capping at the model's maximum token count instead of a line count? People could also be allowed to set a limit themselves.
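A token-based limit might look something like the sketch below. It does not use a real tokenizer: the 4-characters-per-token figure is only a rough rule of thumb for English text, and `estimateTokens`, `fitsTokenBudget`, `modelMaxTokens`, and `userLimit` are hypothetical names.

```typescript
// Rough approximation only: about 4 characters per token for English text.
// A real tokenizer for the target model would give exact counts.
const CHARS_PER_TOKEN = 4;

const estimateTokens = (text: string): number =>
  Math.ceil(text.length / CHARS_PER_TOKEN);

// modelMaxTokens and userLimit are hypothetical settings, not existing options.
const fitsTokenBudget = (
  diff: string,
  modelMaxTokens: number,
  userLimit?: number,
): boolean => {
  // A user-defined limit can only tighten the model's own maximum.
  const budget = Math.min(modelMaxTokens, userLimit ?? modelMaxTokens);
  return estimateTokens(diff) <= budget;
};
```

In practice the budget would also have to leave room for the prompt text and the generated commit message, not just the diff.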
Or even better: now that the new chat API is available and works in turns, the diff could be consumed file by file when it is larger than some threshold. A turn-by-turn chat could generate a comment for each file and then be asked to summarise all the comments at the end. This would make it possible to work with quite large diffs as well.
https://github.com/openai/openai-python/blob/main/chatml.md https://platform.openai.com/docs/guides/chat/chat-vs-completions
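As a rough sketch of the file-by-file idea, the diff can be split on the `diff --git ` header that git's default output writes before each file, and the per-file comments can then be combined into one final summarising request. All names here (`splitDiffByFile`, `planChunks`, `buildSummaryMessages`, the prompt wording, the character threshold) are assumptions for illustration, and the actual API calls are left out.

```typescript
// Shape of one chat turn (role plus content), as used by chat-style APIs.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Split `git diff` output into one chunk per file.
// Each file section in git's default output starts with a "diff --git " line.
const splitDiffByFile = (diff: string): string[] =>
  diff
    .split(/^(?=diff --git )/m)
    .filter((chunk) => chunk.trim().length > 0);

// Only split when the diff is larger than the (hypothetical) threshold.
const planChunks = (diff: string, maxLength: number): string[] =>
  diff.length <= maxLength ? [diff] : splitDiffByFile(diff);

// Build the final turn that asks for one message from the per-file comments.
const buildSummaryMessages = (fileComments: string[]): ChatMessage[] => [
  {
    role: 'system',
    content: 'Summarise these per-file change notes into one commit message.',
  },
  {
    role: 'user',
    content: fileComments.map((comment) => `- ${comment}`).join('\n'),
  },
];
```

Each chunk would be sent as its own request, so the number of API calls grows with the number of changed files.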
I don't think we can ignore white space for the reason provided above, and we've added --diff-algorithm=minimal, so I think this is closable.
@salomartin That idea sounds interesting but I worry that it may (unexpectedly) incur large expenses for generating a commit.