Efferent Coupling at Module Level for C++

Open mcserep opened this issue 2 years ago • 3 comments

Coupling is a metric that can be computed for structural units at different levels (e.g. classes, namespaces, modules). It measures how many other entities an entity depends on, and how many dependents it has.

The Efferent Coupling for a particular module is the number of types inside this module that depend on types outside this module.

High efferent coupling indicates that the module in question is highly dependent on other modules.

mcserep avatar Nov 08 '23 13:11 mcserep

What do we consider a module in this case? An entire directory? So, for example, in CodeCompass, the plugins, parser and model directories are 3 different modules?

Another option is to consider C++20 modules, but I guess they are not that widespread.

intjftw avatar Nov 14 '23 13:11 intjftw

What do we consider a module in this case? An entire directory? So, for example, in CodeCompass, the plugins, parser and model directories are 3 different modules?

Another option is to consider C++20 modules, but I guess they are not that widespread.

A module is an abstract concept here. In the case of CodeCompass, parser is a module, but plugins is not; instead, plugins/cpp is a module, as that is a cohesive logical unit of the program. plugins/cpp/parser could also be considered a submodule in my view.

I see the following options:

  1. A module is a C++20 module. Since we would like this feature to work with pre-C++20 codebases, we should not do this.
  2. A module is defined by the user in some format. This would require significant work on specifying and implementing how it is defined, so I suggest not going this way.
  3. A module is a directory. Any directory qualifies, so plugins, plugins/cpp, plugins/cpp/parser and plugins/cpp/parser/src could all be interpreted as modules. We would evaluate the module-level metric for every directory, even though it won't be meaningful for all of them; it is up to the user to query the metric for the directories where it is useful. I see this as a possible option.
  4. A module is a namespace, and we expect the namespace hierarchy of the analyzed project to be properly maintained. This is an alternative solution.

I will further discuss this question with @zporky today.

mcserep avatar Nov 15 '23 06:11 mcserep

@intjftw Update: last week we discussed this with @zporky at the weekly meeting and agreed that each directory should be considered a module. Later it may be wise to require some input (a whitelist) specifying which directories are of interest, so that we do not calculate these metrics for directories where they make no sense, but for now we will go with this approach.

mcserep avatar Nov 29 '23 12:11 mcserep

last week we discussed this with @zporky at the weekly meeting and agreed that each directory should be considered a module.

Is this still the case? In PR https://github.com/Ericsson/CodeCompass/pull/747 a new approach was suggested: a flag (such as -m) that points to a file listing the directories to treat as modules. Which method should we implement for this metric?

The Efferent Coupling for a particular module is the number of types inside this module that depend on types outside this module.

I have a few questions below about this definition to make sure the right calculation is implemented:

"outside this module"

Does the outside directory have to be specified as a module (e.g. listed in the file specified by the -m flag) or not?

Let's say Module A uses types T1, T2 which are not part of any module. Would that increase efferent coupling for Module A by 2?

"is the number of types inside this module"

What exactly is considered a type in this case? Let's say Module A has a function foo which is not part of any type (e.g. class) and this function uses type T defined outside of this module. Would that count?

How about functions? Let's say Module A uses 3 different functions (such as foo, bar, test) defined outside of this module. Would that increase efferent coupling by 3 for Module A?

My initial idea was to check all usages in a module (directory) and, if the used entity is defined outside of this module, simply increase efferent coupling. This would be the fastest one to calculate; however, I'm not sure if this would be the intended behavior.

barnabasdomozi avatar Feb 27 '25 14:02 barnabasdomozi

Hi @wbqpk3,

Is this still the case? In PR https://github.com/Ericsson/CodeCompass/pull/747 a new approach was suggested: a flag (such as -m) that points to a file listing the directories to treat as modules. Which method should we implement for this metric?

When the -m flag is given, only the folders listed there (it should load a text file listing the paths of the modules) should be considered modules. If -m is omitted, all folders should be considered modules (a fallback for small and medium-sized projects).
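The rule above could be sketched roughly as follows. This is only an illustration, not the CodeCompass implementation: the function name `resolve_modules`, the one-path-per-line file format, and the path normalization are all assumptions.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional, Set


def resolve_modules(all_dirs: Iterable[str],
                    module_list_lines: Optional[Iterable[str]] = None) -> Set[str]:
    """Return the set of directories to treat as modules.

    module_list_lines holds the lines of the file passed via -m
    (assumed format: one directory path per line), or None if -m was omitted.
    """
    if module_list_lines is None:
        # Fallback: -m omitted, so every directory counts as a module.
        return {str(PurePosixPath(d)) for d in all_dirs}
    # -m given: only the listed paths are modules; blank lines are skipped.
    return {str(PurePosixPath(line.strip()))
            for line in module_list_lines if line.strip()}
```

With the CodeCompass directories mentioned earlier, omitting the list yields every directory, while a list containing only `plugins/cpp` and `model` restricts the metric to those two.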

I have a few questions below about this definition to make sure the right calculation is implemented:

First of all, I think there was a wording error in this issue's description, so I updated the definition of efferent coupling for modules to be more coherent with the other coupling metrics:

The Efferent Coupling for a particular module is the number of types outside this module on which types inside this module directly depend.

Does the outside directory have to be specified as a module (e.g. listed in the file specified by the -m flag) or not?

Let's say Module A uses types T1, T2 which are not part of any module. Would that increase efferent coupling for Module A by 2?

I would say yes, they should be counted as well. We could also discuss this one with @zporky.

What exactly is considered a type in this case? Let's say Module A has a function foo which is not part of any type (e.g. class) and this function uses type T defined outside of this module. Would that count?

No, focus on types.

How about functions? Let's say Module A uses 3 different functions (such as foo, bar, test) defined outside of this module. Would that increase efferent coupling by 3 for Module A?

If they are not inside a class, I would say omit them.

My initial idea was to check all usages in a module (directory) and, if the used entity is defined outside of this module, simply increase efferent coupling. This would be the fastest one to calculate; however, I'm not sure if this would be the intended behavior.

I don't think that would be the intended behaviour, but please check the updated definition of efferent coupling for modules; maybe it makes things clearer.
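To make the updated definition concrete, here is a minimal sketch under a few assumptions: each type is identified by name, we know the directory in which it is defined, a type in a subdirectory counts as inside the parent module, and each outside type is counted once no matter how many inside types use it. The names `is_inside` and `efferent_coupling` are hypothetical. Only type-to-type dependencies are fed in, so usages coming from free functions are left out, and outside types count whether or not their directory is itself a module.

```python
from typing import Dict, Iterable, Tuple


def is_inside(type_dir: str, module: str) -> bool:
    """True if a type defined in type_dir lies in module or one of its subdirectories."""
    return type_dir == module or type_dir.startswith(module + "/")


def efferent_coupling(module: str,
                      type_dirs: Dict[str, str],
                      depends_on: Iterable[Tuple[str, str]]) -> int:
    """Count distinct types outside `module` that types inside it directly depend on.

    type_dirs:  type name -> directory where the type is defined
    depends_on: (user, used) pairs of type names; only type-to-type edges,
                so usages from free functions are never recorded here.
    """
    outside = set()
    for user, used in depends_on:
        if is_inside(type_dirs[user], module) and not is_inside(type_dirs[used], module):
            outside.add(used)  # a set, so each outside type is counted once
    return len(outside)
```

For example, if class `C1` in directory `a` depends on `T1` and `T2`, both defined elsewhere, then `efferent_coupling("a", ...)` gives 2 under this reading, even if `T2` lives in a directory that is not listed as a module.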

mcserep avatar Mar 04 '25 09:03 mcserep

When the -m flag is given, only the folders listed there (it should load a text file listing the paths of the modules) should be considered modules. If -m is omitted, all folders should be considered modules (a fallback for small and medium-sized projects).

Thank you for clarifying, this makes sense.

The Efferent Coupling for a particular module is the number of types outside this module on which types inside this module directly depend.

This updated definition is now a bit more clear to me. If I understand this correctly, we need to go through all user-defined types (e.g. classes) in a module and if those classes use types defined outside of this module we count them.

Let's say Module A has a class C1. C1 uses types T1 and T2 defined outside of this module. In this case, efferent coupling for Module A is 2 (counting T1 and T2)?

Edit: I was thinking about a more efficient implementation for this metric. How about we create a new database table where we store how types depend on each other? Using this new table we can calculate module metrics a lot more efficiently.

barnabasdomozi avatar Mar 04 '25 17:03 barnabasdomozi

@mcserep Update: Implemented modules flag in a separate PR: https://github.com/Ericsson/CodeCompass/pull/782

barnabasdomozi avatar Mar 10 '25 16:03 barnabasdomozi

@wbqpk3 Okay, let's try it with a new table for type dependency relations.
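One way such a table might look, purely as a sketch: the table and column names (`type_def`, `type_dependency`, `user_type`, `used_type`) are hypothetical, and plain SQLite through Python's `sqlite3` module is used only for illustration; it is not the actual CodeCompass schema. The idea is that once direct type-to-type dependencies are stored, the module metric reduces to a single aggregate query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE type_def (          -- hypothetical: where each type is defined
        name TEXT PRIMARY KEY,
        dir  TEXT NOT NULL
    );
    CREATE TABLE type_dependency (   -- hypothetical: one row per direct type-to-type dependency
        user_type TEXT NOT NULL REFERENCES type_def(name),
        used_type TEXT NOT NULL REFERENCES type_def(name),
        PRIMARY KEY (user_type, used_type)
    );
""")
conn.executemany("INSERT INTO type_def VALUES (?, ?)",
                 [("C1", "a"), ("C2", "a/sub"), ("T1", "b"), ("T2", "third_party")])
conn.executemany("INSERT INTO type_dependency VALUES (?, ?)",
                 [("C1", "T1"), ("C1", "T2"), ("C2", "T1"), ("C1", "C2")])


def efferent_coupling_sql(conn: sqlite3.Connection, module: str) -> int:
    """Count distinct outside types used by types in `module` or its subdirectories."""
    row = conn.execute("""
        SELECT COUNT(DISTINCT dep.used_type)
        FROM type_dependency AS dep
        JOIN type_def AS u ON u.name = dep.user_type
        JOIN type_def AS t ON t.name = dep.used_type
        WHERE (u.dir = :m OR substr(u.dir, 1, length(:m) + 1) = (:m || '/'))
          AND NOT (t.dir = :m OR substr(t.dir, 1, length(:m) + 1) = (:m || '/'))
    """, {"m": module}).fetchone()
    return row[0]
```

The prefix test uses `substr` rather than `LIKE` so that characters such as `_` in directory names are not treated as wildcards.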

mcserep avatar Mar 19 '25 15:03 mcserep