bee-agent-framework Define commitlint scopes

We currently use commitlint to ensure our commit messages adhere to the conventional commit standard. The current config is here https://github.com/i-am-bee/bee-agent-framework/blob/main/commitlint.config.ts

Unfortunately, we don't have a strict set of scopes that one can use, and therefore, scopes vary from commit to commit - this can be seen in our CHANGELOG.

Example:

tool: add google custom search tool (https://github.com/i-am-bee/bee-agent-framework/issues/34) ([ef839da](https://github.com/i-am-bee/bee-agent-framework/commit/ef839da933276c3deff21552e16627dfe9c3ef7d))

vs

tools: update Wikipedia tool, remove links, extend interface ([ee651c3](https://github.com/i-am-bee/bee-agent-framework/commit/ee651c38ae21711161dfd755987be8640519c9fc))

We should define a set of allowed scopes and put them inside the commitlint config via scopeEnum property (https://commitlint.js.org/reference/rules.html#scope-enum)

Example of allowed scopes:

code-interpreter
tools
llms
adapters
serialization
...

Oct 02 '24 12:10 Tomas2D

I'd like to try this. Would you assign to me?

Oct 02 '24 13:10 akihikokuroda

Assigned @akihikokuroda . Feel free to ask if you have any questions.

Oct 02 '24 15:10 Tomas2D

Here is the list of the scope enum that I'm thinking

code-interpreter
tools
llms
adapters
serializer
tests
docs
ci"

Are these OK?
Thanks!

Oct 02 '24 15:10 akihikokuroda

Not really; you are mixing scopes with types. See https://www.conventionalcommits.org/en/v1.0.0/#summary

Oct 02 '24 15:10 Tomas2D

I see. How about these? Should "memory" and "cache" be in the list, too?

code-interpreter
tools
llms
adapters
serializer

Thanks!

Oct 02 '24 15:10 akihikokuroda

All modules should be in there. Your proposal captures very little scope.

What scope would you use if you changed something in the following locations?

src/internals/helpers/map.ts
src/cache/fileCache.ts
src/adapters/langchain/tools.ts
docker-compose.yaml

and so on... think of all possible variations and then come up with a proper enum.

Oct 03 '24 14:10 Tomas2D

Thanks! I thought that one commit may include changes in multiple locations. I believe that some of files under src/internals, tests, docs and examples are often included in a commit. So it may be better to have general functions as the scope value. Do you want the scope to indicate changed files?

Oct 03 '24 14:10 akihikokuroda

Yes, one commit may include changes in multiple locations. The final enum should cover all cases so far (see CHANGELOG, where you can see all commits and how they are formatted), and the author should provide a guide on how to use scopes with some examples, such as adding a new adapter or creating a new tool.

Oct 03 '24 15:10 Tomas2D

I see. Thanks! I'll work on it.

Oct 03 '24 15:10 akihikokuroda

The scope values used so far are: agent, llm(s), tool(s), ibm-vllm, code-interpreter, serializer, cache, groq, watsonx, observability, bee-agent, utils, memory, execution, custom-tool, example, template, react, bam, python

I looked through the updated files in each commit with these scopes. The scope "agent", "llm" and "tools" are used for a lot of commits and updated files in these scopes are agent: 25, llm: 22, and tools: 34. It probably better to splits these scope to sub-scopes. The commits with other scopes include specific files for the commit and some common files.

Here are the list of scopes what I propose:

bee-agent (agent/bee)
parser-agent (agent/parser)
agent (agent)
bam-adapter (adapters/bam)
groq-adapter (adapters/groq)
ibm-vllm-adapter (adapters/ibm-vllm)
langchain-adapter (adapters/langchain)
ollama-adapter (adapters/ollama)
openai-adapter (adapters/optnapi)
shared-adapter (adapters/shared)
watsonx-adapter (adapters/watsonx)
llm (llms)
database-tools (tools/database)
python-tools (tools/python)
search-tools (tools/search)
weather-tools (tools/weather)
web-tools (tools/web)
tools (tools)
cache (cache)
emitter
utils (internals)
logger (logger)
memory (memory)
serializer (serializer)
code-interpreter-infra (infra/code-interprter)
tests (tests)
docs (docs)
examples (examples)
scripts (scripts)

The order in the list is the priority of the scope. Each scope has an associated directory in the project. The file in the highest priority scope in the commit determines the scope of the commit. When a new adapter, tool or a major new sub component created, it must be added as a new scope in commitlint.ts file.

Oct 04 '24 14:10 akihikokuroda

I think these are a bit confusing and unnecessary (mixing types with scopes as @Tomas2D said):

tests
docs

Probably the changes to the instrumentation of tests or docs will happen way less often than actually documenting or testing something.

If you add a new test, I'd use a commit message

test(tools): add wikipedia tool test

The following would make sense, but in my opinion it's really unnecessary, when you extend test utils for example, you do because you need it for a specific test

fix(tests): skipping e2e tests due to wrong condition

"utils" are too general

Also I'm not sure whether we should distinguish individual adapters and tools. I'd rather if don't need to change the commitlint configuration everytime you add a new tool or adapter, I think it's unnecessary strict for new contributors and you still have the commit message to express the actual subtype of the tool. Also, the following will probably only ever contain one tool, we're not planning to have multiple python tools. I'd flip it to tools-python if we want to go with the specific ones:

database-tools (tools/database)
python-tools (tools/python)
search-tools (tools/search)
weather-tools (tools/weather)
web-tools (tools/web)
tools (tools)

I think a good rule of thumb would be to follow the directory structure of src/. (except index and version)

If there is something that does not fit, for example a change in src/agents/parsers/field.ts will extend parsers to be a generalized concept, we should move parsers to the top level.

"examples" feels like a type to me more than a scope, also you should add example as part of the feature, if you add new adapter, the example for it should be in the same commit.

Then we need couple other for the houskeeping stuff:

infra - "e.g. changing docker-compose.yml
deps for bumping dependencies

If the commit does not obviously fall into any of the categories, we should allow to leave the scope empty (e.g. chore: update tsconfig ...)

Oct 08 '24 09:10 jezekra1

Following the directory structure is good. When the changes are in multiple directories, the committer can put the scope based on the directory that got the most significant or important changes.
I put the sub directories of the adapter because there are some scopes used so far for specific adapter (ibm-vllm, groq, watsonx, groq, watsonx). The infra may not be a good name because there is a infra directory already. It may be confusing. The utils has been used to update the files in internals directory.

I don't have strong opinion about the list of scopes so I just need to get consensus in this community. I think the scope is used to

assign the right reviewer of the commit
find commits that might cause some regression issue

in addition to the type. So as far as the scope value helps these, I think it's good.

For now, the proposed list is:

adapters
agents
llms
tools
cache
emitter
internals
logger
memory
serializer
repo-infra
deps

Oct 08 '24 17:10 akihikokuroda