Define commitlint scopes
We currently use commitlint to ensure our commit messages adhere to the conventional commit standard. The current config is here https://github.com/i-am-bee/bee-agent-framework/blob/main/commitlint.config.ts
Unfortunately, we don't have a strict set of scopes that one can use, and therefore, scopes vary from commit to commit - this can be seen in our CHANGELOG.
Example:
tool: add google custom search tool (https://github.com/i-am-bee/bee-agent-framework/issues/34) ([ef839da](https://github.com/i-am-bee/bee-agent-framework/commit/ef839da933276c3deff21552e16627dfe9c3ef7d))
vs
tools: update Wikipedia tool, remove links, extend interface ([ee651c3](https://github.com/i-am-bee/bee-agent-framework/commit/ee651c38ae21711161dfd755987be8640519c9fc))
We should define a set of allowed scopes and put them inside the commitlint config via the scope-enum rule (https://commitlint.js.org/reference/rules.html#scope-enum)
Example of allowed scopes:
- code-interpreter
- tools
- llms
- adapters
- serialization
- ...
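A minimal sketch of how such a list could be wired into commitlint.config.ts. It assumes the config extends @commitlint/config-conventional, which may not match the actual file, and it uses the example scopes above rather than a final list. The scope-enum rule takes a severity (2 means error), an applicability ("always"), and the array of allowed values.

```typescript
// Sketch only: the example scopes from the list above, not a final enum.
const allowedScopes = ["code-interpreter", "tools", "llms", "adapters", "serialization"];

const config = {
  // Assumption: the project builds on the conventional-commits preset.
  extends: ["@commitlint/config-conventional"],
  rules: {
    // [severity, applicability, allowed values]; 2 means "error".
    "scope-enum": [2, "always", allowedScopes],
  },
};

export default config;
```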
I'd like to try this. Would you assign it to me?
Assigned @akihikokuroda. Feel free to ask if you have any questions.
Here is the list of scope enum values that I'm thinking of:
- code-interpreter
- tools
- llms
- adapters
- serializer
- tests
- docs
- ci
Are these OK?
Thanks!
Not really; you are mixing scopes with types. See https://www.conventionalcommits.org/en/v1.0.0/#summary
I see. How about these? Should "memory" and "cache" be in the list, too?
- code-interpreter
- tools
- llms
- adapters
- serializer
Thanks!
All modules should be in there. Your proposal captures very little scope.
What scope would you use if you changed something in the following locations?
- src/internals/helpers/map.ts
- src/cache/fileCache.ts
- src/adapters/langchain/tools.ts
- docker-compose.yaml
and so on... think of all possible variations and then come up with a proper enum.
Thanks! I thought that one commit may include changes in multiple locations. I believe that some of the files under src/internals, tests, docs and examples are often included in a commit. So it may be better to use general functional areas as the scope values. Do you want the scope to indicate the changed files?
Yes, one commit may include changes in multiple locations. The final enum should cover all cases so far (see CHANGELOG, where you can see all commits and how they are formatted), and the author should provide a guide on how to use scopes with some examples, such as adding a new adapter or creating a new tool.
I see. Thanks! I'll work on it.
The scope values used so far are: agent, llm(s), tool(s), ibm-vllm, code-interpreter, serializer, cache, groq, watsonx, observability, bee-agent, utils, memory, execution, custom-tool, example, template, react, bam, python
I looked through the files updated in each commit with these scopes. The scopes "agent", "llm" and "tools" are used for a lot of commits, and the numbers of updated files in these scopes are agent: 25, llm: 22, and tools: 34. It is probably better to split these scopes into sub-scopes. The commits with other scopes include files specific to the commit and some common files.
Here is the list of scopes that I propose:
- bee-agent (agent/bee)
- parser-agent (agent/parser)
- agent (agent)
- bam-adapter (adapters/bam)
- groq-adapter (adapters/groq)
- ibm-vllm-adapter (adapters/ibm-vllm)
- langchain-adapter (adapters/langchain)
- ollama-adapter (adapters/ollama)
- openai-adapter (adapters/openai)
- shared-adapter (adapters/shared)
- watsonx-adapter (adapters/watsonx)
- llm (llms)
- database-tools (tools/database)
- python-tools (tools/python)
- search-tools (tools/search)
- weather-tools (tools/weather)
- web-tools (tools/web)
- tools (tools)
- cache (cache)
- emitter
- utils (internals)
- logger (logger)
- memory (memory)
- serializer (serializer)
- code-interpreter-infra (infra/code-interpreter)
- tests (tests)
- docs (docs)
- examples (examples)
- scripts (scripts)
The order in the list is the priority of the scope. Each scope has an associated directory in the project. The file in the highest-priority scope in the commit determines the scope of the commit. When a new adapter, tool, or major new sub-component is created, it must be added as a new scope in the commitlint.config.ts file.
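The priority rule could be sketched roughly like this. The pairs below are only a subset of the proposed list, the prefixes use the list's own shorthand for directories, and both pickScope and the file names in the usage note are hypothetical.

```typescript
// [scope, directory prefix] pairs in priority order (highest first).
// A subset of the proposed list; prefixes use the list's own shorthand.
const scopePriority: ReadonlyArray<readonly [string, string]> = [
  ["bee-agent", "agent/bee/"],
  ["agent", "agent/"],
  ["groq-adapter", "adapters/groq/"],
  ["python-tools", "tools/python/"],
  ["tools", "tools/"],
  ["cache", "cache/"],
  ["tests", "tests/"],
  ["docs", "docs/"],
];

// Returns the scope of the highest-priority entry that matches any changed
// file, or undefined when no file falls under a known directory.
function pickScope(changedFiles: string[]): string | undefined {
  for (const [scope, prefix] of scopePriority) {
    if (changedFiles.some((file) => file.startsWith(prefix))) {
      return scope;
    }
  }
  return undefined;
}
```

With this ordering, a commit touching both tools/python/python.ts and tests/tools/python.test.ts would get the python-tools scope, because that entry comes before tests in the list.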
I think these are a bit confusing and unnecessary (mixing types with scopes as @Tomas2D said):
- tests
- docs
Probably the changes to the instrumentation of tests or docs will happen way less often than actually documenting or testing something.
If you add a new test, I'd use a commit message
test(tools): add wikipedia tool test
The following would make sense, but in my opinion it's really unnecessary. When you extend test utils, for example, you do it because you need it for a specific test:
fix(tests): skipping e2e tests due to wrong condition
"utils" are too general
Also, I'm not sure whether we should distinguish individual adapters and tools. I'd rather we didn't need to change the commitlint configuration every time someone adds a new tool or adapter. I think it's unnecessarily strict for new contributors, and you still have the commit message to express the actual subtype of the tool. Also, the following will probably only ever contain one tool; we're not planning to have multiple python tools. I'd flip it to tools-python if we want to go with the specific ones:
database-tools (tools/database)
python-tools (tools/python)
search-tools (tools/search)
weather-tools (tools/weather)
web-tools (tools/web)
tools (tools)
I think a good rule of thumb would be to follow the directory structure of src/.
(except index and version)
If something does not fit, for example if a change in src/agents/parsers/field.ts extends parsers into a generalized concept, we should move parsers to the top level.
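As a rough illustration of the directory rule of thumb, a scope could be derived from a changed path like this. scopeFromPath is a hypothetical helper, not something in the repo; files directly inside src/ (such as the index and version files) and files outside src/ yield no scope.

```typescript
// Returns the first directory under src/ as the scope, or undefined for
// files outside src/ or files that sit directly inside src/.
function scopeFromPath(filePath: string): string | undefined {
  const parts = filePath.split("/");
  if (parts[0] !== "src" || parts.length < 3) {
    return undefined;
  }
  return parts[1];
}
```

Applied to the paths asked about earlier in the thread, this gives internals for src/internals/helpers/map.ts, cache for src/cache/fileCache.ts, adapters for src/adapters/langchain/tools.ts, and no scope for docker-compose.yaml.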
"examples" feels like a type to me more than a scope. Also, you should add the example as part of the feature: if you add a new adapter, the example for it should be in the same commit.
Then we need a couple of others for the housekeeping stuff:
- infra, e.g. changing docker-compose.yml
- deps for bumping dependencies
If the commit does not obviously fall into any of the categories, we should allow the scope to be left empty
(e.g. chore: update tsconfig ...)
Following the directory structure is good. When the changes are in multiple directories, the committer can choose the scope based on the directory that got the most significant or important changes.
I listed the sub directories of the adapters because some of the scopes used so far are for specific adapters (ibm-vllm, groq, watsonx).
The name infra may not be a good one because there is an infra directory already, so it may be confusing. The utils scope has been used for updates to files in the internals directory.
I don't have a strong opinion about the list of scopes, so I just need to get consensus in this community. I think the scope is used to
- assign the right reviewer of the commit
- find commits that might cause some regression issue
in addition to the type. So as long as the scope value helps with these, I think it's good.
For now, the proposed list is:
- adapters
- agents
- llms
- tools
- cache
- emitter
- internals
- logger
- memory
- serializer
- repo-infra
- deps
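To see how this list would treat commit headers like the ones quoted earlier in the thread, here is a rough sketch. The regular expression is a simplified stand-in for commitlint's real parser, isScopeAllowed is a hypothetical helper, and an empty scope is accepted as suggested above.

```typescript
// The currently proposed scopes.
const scopes = [
  "adapters", "agents", "llms", "tools", "cache", "emitter",
  "internals", "logger", "memory", "serializer", "repo-infra", "deps",
];

// Simplified pattern for "type(scope): subject" or "type: subject".
// This is not commitlint's parser, only an approximation for illustration.
const headerPattern = /^(\w+)(?:\(([^)]+)\))?!?: (.+)$/;

// True when the header has no scope or a scope from the proposed list.
function isScopeAllowed(header: string): boolean {
  const match = headerPattern.exec(header);
  if (match === null) {
    return false; // not a conventional commit header at all
  }
  const scope: string | undefined = match[2];
  return scope === undefined || scopes.includes(scope);
}
```

Under these assumptions, test(tools): add wikipedia tool test and chore: update tsconfig both pass, while a header using the singular scope tool is rejected.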