Erik Bjäreholt

Results 333 issues of Erik Bjäreholt

Implemented with gptme, given moatless-tools and aider as reference implementations.

- [x] Set up harness
- [ ] Get a single eval instance passing
- [ ] Gets stuck at...

Progress on https://github.com/ErikBjare/gptme/issues/59

Sometimes the models just can't help themselves, and insist on writing unified diff patches. Figured I might as well try implementing support for it. I got Claude 3.5 to write...

Relevant to https://github.com/ErikBjare/gptme/issues/23 and https://github.com/ErikBjare/gptme/issues/32

Has several issues:

- can't display the gptme/logmanager.py file (or the gptme/log2html.py file after my refactor), causes the entire message to be empty
- log...

Had this idea for a better terminal UI using ncurses. gptme wrote all of it, I just kept asking for improvements. It did really well with the tmux tool when...

Right now we just count the token length of the chat log, but we should capture the actual spend. Not sure if it's very high priority, and it's not obvious how to do it...

evals

The models sometimes get confused about timestamps being "in the future". Giving them the current date would fix that. The current dates would get outdated for old conversations, as the...
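One way to sidestep stale dates might be to add the date when the request is built, rather than storing it in the conversation log. A minimal sketch, assuming a plain-string system prompt; `with_current_date` is a hypothetical helper, not part of gptme:

```python
from datetime import datetime, timezone
from typing import Optional


def with_current_date(system_prompt: str, now: Optional[datetime] = None) -> str:
    """Append today's date to a system prompt (hypothetical helper).

    Calling this at request time, instead of saving the result in the log,
    means a resumed conversation always sees a fresh date.
    """
    # Allow the caller to pass a fixed time, which keeps the function testable.
    now = now or datetime.now(timezone.utc)
    return f"{system_prompt}\n\nCurrent date: {now.strftime('%Y-%m-%d')}"
```

Since the stored log never contains the date, there is nothing to go out of date when an old conversation is reopened.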

Right now gptme can't write codeblocks which contain other codeblocks, due to early stopping. I worked on a thing that tried to count start tags and end tags so that...
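The tag-counting idea could look roughly like the sketch below. It is not the approach from the issue, just an illustration under one assumption: a fence line with a language tag opens a block, and a bare fence closes one. `outer_block_end` is a hypothetical name.

```python
from typing import List, Optional

FENCE = "`" * 3  # the three-backtick markdown fence marker


def outer_block_end(lines: List[str]) -> Optional[int]:
    """Return the index of the line closing the outermost fenced block.

    Heuristic (an assumption, not a markdown rule): a fence followed by a
    language tag opens a block, a bare fence closes the innermost open one.
    Returns None if the outermost block is never closed.
    """
    depth = 0
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if stripped != FENCE:
            depth += 1  # fence with a language tag: a block opens
        elif depth > 0:
            depth -= 1  # bare fence: the innermost block closes
            if depth == 0:
                return i
    return None


# A markdown block that itself contains a python block.
example = [
    FENCE + "markdown",
    "Some text",
    FENCE + "python",
    "print('hi')",
    FENCE,
    "more text",
    FENCE,
]
print(outer_block_end(example))
```

The heuristic breaks down when a nested block opens with a bare fence, since an opening and a closing fence then look identical. That ambiguity is the kind of limitation that makes nesting hard in plain markdown.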

bug

Looks really nice: https://docs.ell.so/core_concepts/ell_complex.html Not sure if it's a good fit for gptme, but it looks very similar in approach.

Markdown has some pretty rough limitations that become especially apparent once you need to nest it. A large class of bugs and workarounds could be eliminated if we switched to...

enhancement