[CON-258] Improve test coverage
Validations
- [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
- [X] I'm not able to find an open issue that requests the same enhancement
Problem
Continue could use more tests. It's a piece of software that is front-and-center in the developer experience, so every bit of reliability we can add is a big deal. The UI changes often, but core functionality should ideally have high test coverage.
A (not necessarily comprehensive) list of some functionality that should be tested:
- [ ] Codebase indexing
  - [X] `walkDir`
  - [X] `getComputeAddDeleteRemove`: this is where we decide which files need to be re-computed, re-labelled, removed, or deleted. It's how we decide whether to reuse progress from other branches or to re-compute. There should be a suite of tests in which 1) a codebase is indexed, 2) the branch is switched with some differing files, and 3) we check that only the changed files are actually marked as needing to be re-indexed
  - [X] Each of the `CodebaseIndex` classes should have a high-level test that calls its `update` function with a series of small changes and checks between each that its databases are in the expected state
  - [ ] Chunking tests for each of the supported code languages (started for a couple of languages), to validate that we get top-level classes and functions, that we correctly collapse large classes, and any other sensible situations that we should expect to capture. In every case, we should make sure that it's impossible to generate a chunk with size > `MAX_CHUNK_SIZE` (see the chunk-size sketch after this list)
  - [ ] markdown chunker
- [ ] Docs crawling: we should select a representative site for each of the 5+ most common types of docs sites (e.g. Docusaurus, a plain README, etc.). These tests should be defined declaratively, with a `SiteIndexingConfig` mapped to a list of expected pages that should all be found by the crawler function (see the declarative crawling sketch after this list)
- [ ] All of the embeddings providers, rerankers, and LLMs should be tested to return expected responses. This work is already beginning in the openai-adapters package, so the real work here is to incrementally migrate all of our LLM providers over to openai-adapters. As we do this, it would probably still be prudent to test the LLM classes themselves
- [ ] Test the [renderPrompt](https://github.com/continuedev/continue/blob/dev/core/promptFiles/slashCommandFromPromptFile.ts#L49) function to validate that various .prompt files are rendered as expected
- [X] High-level tests for the `HistoryManager` class
- [ ] Quick validation that logging tokens and then querying the database here returns the expected change
- [ ] While not every context provider might have a test, we should at least try to set up a system that makes it easy for the authors of context providers to write tests. This would begin with a function `(ContextProviderDescription, query, fullInput) -> Promise<ContextItem[]>`, so that it's possible to get the output of `getContextItems` without needing to repeat all of the setup work of injecting `ContextProviderExtras` (see the helper sketch after this list)
- [ ] Many context providers aren't doing much more than requesting from one of the `IDE` methods, like `getDiff`. We should test these fairly thoroughly in each IDE. For VS Code, those tests are started here
- [ ] Autocomplete
  - [X] Stream modifiers: each of the functions here and here is straightforward to test. The tests can be declared as (input, output) string pairs, with the pairs converted to/from async generators (see the stream-modifier sketch after this list)
  - [ ] LLMs often like to repeat themselves, and this function is our last line of defense for filtering out these bad completions. The test should just be a collection of input strings found in real usage that we want the implementation to filter out
  - [ ] Tests to validate the construction of prompt templates
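
A few hedged sketches of the patterns described above follow. First, the chunk-size invariant: a minimal sketch assuming a Jest-style runner, where `naiveChunk`, the `Chunk` shape, and the 512 limit are placeholders standing in for the real per-language chunker and config in core; only the assertion pattern is the point.

```ts
import { describe, expect, test } from "@jest/globals";

const MAX_CHUNK_SIZE = 512; // assumption: the real limit lives in core's indexing config

// Simplified chunk shape, standing in for the real Chunk type in core.
interface Chunk {
  content: string;
  startLine: number;
  endLine: number;
}

// Placeholder chunker: packs whole lines greedily up to the size limit.
// The real tests would call the actual code chunker for each supported language instead.
function naiveChunk(contents: string, maxChunkSize: number): Chunk[] {
  const chunks: Chunk[] = [];
  const lines = contents.split("\n");
  let start = 0;
  let current = "";
  lines.forEach((line, i) => {
    if (current.length > 0 && current.length + line.length + 1 > maxChunkSize) {
      chunks.push({ content: current, startLine: start, endLine: i - 1 });
      current = "";
      start = i;
    }
    current += (current.length > 0 ? "\n" : "") + line;
  });
  if (current.length > 0) {
    chunks.push({ content: current, startLine: start, endLine: lines.length - 1 });
  }
  return chunks;
}

describe("chunker invariants", () => {
  test("no chunk ever exceeds MAX_CHUNK_SIZE", () => {
    const contents = "class Foo {\n" + "  method() { return 42; }\n".repeat(200) + "}\n";
    for (const chunk of naiveChunk(contents, MAX_CHUNK_SIZE)) {
      expect(chunk.content.length).toBeLessThanOrEqual(MAX_CHUNK_SIZE);
    }
  });
});
```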
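
Next, the declarative crawling tests: each case pairs a config with the pages the crawler must find. The `SiteIndexingConfig` shape shown here, the example URL, and the stubbed `crawlSite` are assumptions so the sketch runs standalone; the real tests would import the actual crawler and config type from core and point them at fixture sites.

```ts
import { describe, expect, test } from "@jest/globals";

// Trimmed-down stand-in for the real SiteIndexingConfig.
interface SiteIndexingConfig {
  title: string;
  startUrl: string;
}

interface CrawlCase {
  config: SiteIndexingConfig;
  expectedPages: string[];
}

// Stub crawler so this sketch is self-contained; the real tests would swap in the
// actual crawler function from core.
async function crawlSite(config: SiteIndexingConfig): Promise<string[]> {
  return [config.startUrl];
}

const cases: CrawlCase[] = [
  {
    config: { title: "Docusaurus example", startUrl: "https://example.org/docs/" },
    expectedPages: ["https://example.org/docs/"],
  },
  // ...one entry per representative docs-site type (plain README, GitBook, Sphinx, etc.)
];

describe("docs crawler", () => {
  test.each(cases)("finds every expected page for $config.title", async ({ config, expectedPages }) => {
    const found = await crawlSite(config);
    for (const page of expectedPages) {
      expect(found).toContain(page);
    }
  });
});
```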
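
Next, the context-provider test helper. Every interface below is a trimmed-down stand-in for the real types in core (the real `ContextProviderExtras` carries much more, e.g. the IDE, LLM, and config), and for simplicity the hypothetical `getContextItemsForTest` takes a provider instance directly rather than a `ContextProviderDescription`; only the shape of the API is the point.

```ts
import { describe, expect, test } from "@jest/globals";

// Trimmed-down stand-ins for the real core types.
interface ContextItem {
  name: string;
  description: string;
  content: string;
}

interface ContextProviderExtras {
  fullInput: string;
  fetch: typeof fetch;
}

interface ContextProvider {
  getContextItems(query: string, extras: ContextProviderExtras): Promise<ContextItem[]>;
}

// The proposed helper: hides the setup work of building extras behind sensible test defaults.
async function getContextItemsForTest(
  provider: ContextProvider,
  query: string,
  fullInput: string,
  extrasOverrides: Partial<ContextProviderExtras> = {},
): Promise<ContextItem[]> {
  const extras: ContextProviderExtras = { fullInput, fetch, ...extrasOverrides };
  return provider.getContextItems(query, extras);
}

// Toy provider so the sketch runs on its own.
const echoProvider: ContextProvider = {
  async getContextItems(query, extras) {
    return [{ name: "echo", description: "echoes the query", content: `${query} / ${extras.fullInput}` }];
  },
};

describe("context provider test helper", () => {
  test("returns the provider's items without manual extras setup", async () => {
    const items = await getContextItemsForTest(echoProvider, "diff", "explain the diff");
    expect(items).toHaveLength(1);
    expect(items[0].content).toContain("diff");
  });
});
```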
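
Finally, the (input, output) string-pair pattern for the stream-modifier tests. The helpers that convert strings to and from async generators are the reusable part; `stopAtStopToken` is a toy modifier included only so the sketch is self-contained, and the real tests would import the actual filter functions from core's autocomplete code.

```ts
import { describe, expect, test } from "@jest/globals";

// Turn a string into the kind of async generator the stream modifiers consume.
async function* streamFromString(s: string, chunkSize = 4): AsyncGenerator<string> {
  for (let i = 0; i < s.length; i += chunkSize) {
    yield s.slice(i, i + chunkSize);
  }
}

// Collect a (possibly modified) stream back into a string for comparison.
async function stringFromStream(stream: AsyncGenerator<string>): Promise<string> {
  let out = "";
  for await (const chunk of stream) {
    out += chunk;
  }
  return out;
}

// Toy stream modifier: truncate the completion at a stop token.
async function* stopAtStopToken(
  stream: AsyncGenerator<string>,
  stopToken: string,
): AsyncGenerator<string> {
  let buffer = "";
  for await (const chunk of stream) {
    buffer += chunk;
    const index = buffer.indexOf(stopToken);
    if (index !== -1) {
      yield buffer.slice(0, index);
      return;
    }
  }
  yield buffer;
}

// Each test case is nothing more than an (input, expected output) pair.
const cases: Array<[input: string, expected: string]> = [
  ["const x = 1;<|endoftext|>const y = 2;", "const x = 1;"],
  ["no stop token here", "no stop token here"],
];

describe("stream modifiers", () => {
  test.each(cases)("transforms %j into %j", async (input, expected) => {
    const modified = stopAtStopToken(streamFromString(input), "<|endoftext|>");
    expect(await stringFromStream(modified)).toBe(expected);
  });
});
```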
Solution
No response
From SyncLinear.com | CON-258