Long term plan to unify public zephyr APIs using ONE prefix
Introduction
Our APIs are all over the place with various prefixes and legacy conventions that do not have any consistency or real meaning. We do mix internal API prefixes with public APIs in many places and some have decided to use z_ as the prefix for public APIs in some places, adding more fuel to the fire.
Proposal is to pick some prefix and enforce it for all new public C identifiers, over time, (scale of years) migrate all APIs to this prefix.
Ideally we want to release LTS4 with a completely overhauled and consistent API using one prefix for all APIs.
Problem description
see
- https://github.com/zephyrproject-rtos/zephyr/issues/61588
- https://github.com/zephyrproject-rtos/zephyr/issues/58007
Proposed change
Come up with a prefix and enforce it for all new APIs and APIs that need to be changed because of conflicts and misuse of private prefixes. Possible prefixes:
- zphr_
- zph_
Alternatives
None
For completeness, adding some to the list:
- zphr_
- zph_
- zp_
- zr_
From the TSC face to face meeting:
- zy_ is proposed
- zephyr_ is proposed
another option:
- zrt_
Note the emphasis on long-term in the proposal. Goal is to agree on the prefix and start migrating to it slowly, starting with new APIs being introduced in the tree, some APIs with wrong and misleading prefixes, i.e. https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/sys/fdtable.h which is a public API using z_ and so on.
Goal is to have most APIs migrated for LTS4.
Can we also take a look at the use of __ prefixed keywords, which are reserved for the implementation? E.g. __packed and __ramfunc straight up collide with IAR keywords. They are easy enough to rename, even though they are used in many places, but rename to what? And how much breakage in modules could be expected?
Proposing:
- Short option: z_
- Long option: zep_
Architecture WG:
- @jhedberg talks about the need of having a prefix hierarchy for subsystems, aside from a global prefix. E.g. zy_util_count_bits()
- @nashif highlights the necessity for a global prefix in order to be able to resolve the conflicts that arise (i.e. count_bits); once we have chosen the prefix we will be able to make progress
- @carlescufi is worried about partial use of the new prefix because it may lead to some parts transitioning to it while others may never transition to it
- @aescolar agrees with @nashif
- @carlescufi states that using z_ for private functions was a mistake, and we should not exclude the option of using it and choosing another private prefix should be on the table
- A discussion on the options for the private prefix ensues, with _ being put on the table again, as well as _z_
- There is a wide discussion about the benefits of using an underscore due to the clash with C libraries
Prefix options
- zphr_
- zph_
- zp_
- zr_
- z4r_
- zy_
- zephyr_
- z_
- zep_
another idea : keep z_ for private and use Z_ for public
Also worth to keep in mind that C reserves (some) identifiers starting with underscore: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
This is how ThreadX and others use something like tx or txe: although ThreadX functions start with an underscore, they do not follow the exact pattern restricted by MISRA Rule 21.2. _tx_thread_create() does not start with _ followed by an uppercase letter, nor does it contain double underscores (__), which avoids violating the strictest interpretations of the rule.
In any case, grepping the tree for z_ yields more than 5k occurrences; this is the heavy lifting that needs to be done, and it might be very disruptive. It is doable, but we will not be able to use z_ for public any time soon. Also, if we are going to use underscore (_) anyways, why not drop the z completely, i.e. now we have z_tls_data_size, i think _tls_data_size works and is way nicer than _z_tls_data_size.
from a practical point of view and to be able to move in parallel on both fronts, private and public we should try and pick something other than z_ for public and start using it, work on cleaning up naming of private namespace and remove z_ and use a leading underscore.
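To make the reserved-identifier point above concrete, here is a minimal sketch (not from the original discussion) that classifies a name by the leading-underscore rules in C11 section 7.1.3. The enum and function names are hypothetical, and other reserved families (for example library names beginning with str or is) are deliberately not checked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of an identifier under C11 7.1.3,
 * looking only at the leading-underscore rules.
 */
enum ident_class {
	IDENT_UNRESERVED,          /* no leading underscore */
	IDENT_RESERVED_FILE_SCOPE, /* "_x...": reserved at file scope */
	IDENT_RESERVED_ALWAYS,     /* "__..." or "_X...": reserved for any use */
};

static enum ident_class classify_identifier(const char *name)
{
	assert(name != NULL);

	if (name[0] != '_') {
		return IDENT_UNRESERVED;
	}

	/* Underscore followed by another underscore or an uppercase letter. */
	if (name[1] == '_' || (name[1] >= 'A' && name[1] <= 'Z')) {
		return IDENT_RESERVED_ALWAYS;
	}

	/* Any other leading underscore is reserved for file-scope identifiers. */
	return IDENT_RESERVED_FILE_SCOPE;
}
```

Under these rules, _tx_thread_create and _z_tls_data_size land in the file-scope class, while __packed lands in the always-reserved class.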
Also, if we are going to use underscore (_) anyways,
Are we going to use _ for private? :-p
why not drop the z completely, i.e. now we have z_tls_data_size, i think _tls_data_size works
Just adding 1 _ before functions is very likely to hit names used by others, including the C library. So I would very strongly advise against that.
from a practical point of view and to be able to move in parallel on both fronts, private and public we should try and pick something other than z_ for public and start using it
I agree. There are way too many internal things using z_
and pick something other than z_ for public and start using it
The second most popular option during the Arch. meeting vote is zep_
The second most popular option during the Arch. meeting vote is zep_
I voted zp_, but would also be in favor of zep_.
why not drop the z completely, i.e. now we have z_tls_data_size, i think _tls_data_size works
Just adding 1 _ before functions is very likely to hit names used by others, including the C library. So I would very strongly advise against that.
how? This is going to be internal to Zephyr, and whoever else is using it will probably have it internal to them, if we start thinking this way then we will never be able to integrate with anyone.
Btw, and for fun, someone (supporting zephyr) already uses z_ and _z_
https://github.com/eclipse-zenoh/zenoh-pico/blob/main/include/zenoh-pico/net/subscribe.h
how? This is going to be internal to Zephyr, and whoever else is using it will probably have it internal to them
We have a single namespace while linking. There is no such a thing as "internal" to Zephyr if it is linkable from other Zephyr areas.
A function like count_bits is very likely to hit somebody else's, but _count_bits is also, because adding just a _ as prefix is way too typical. We need something much less likely to collide. The longer the better; a 3 letter combination like zep_ would be quite good.
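As an illustration of the collision argument, the sketch below shows a prefixed library function coexisting with an application helper that uses the bare name. The name zep_count_bits and its signature are assumptions made for this example, not an agreed API; both functions have external linkage, so they share the single link-time namespace described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical prefixed public API: count set bits in a byte buffer. */
size_t zep_count_bits(const void *data, size_t len)
{
	const uint8_t *bytes = data;
	size_t count = 0;

	assert(data != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		uint8_t b = bytes[i];

		while (b != 0) {
			b &= (uint8_t)(b - 1); /* clear the lowest set bit */
			count++;
		}
	}

	return count;
}

/* Application code that already owns the unprefixed name.
 * Without the prefix above, these two definitions would clash.
 */
unsigned int count_bits(uint32_t value)
{
	unsigned int n = 0;

	while (value != 0) {
		value &= value - 1; /* clear the lowest set bit */
		n++;
	}

	return n;
}
```

A bare _count_bits in the library would avoid this particular clash, but only until another component picks the same one-character prefix.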
We have a single address space while linking. There is no such a thing as "internal" to Zephyr if it is linkable from other Zephyr areas. A function like count_bits is very likely to hit somebody else's, but _count_bits is also, because adding just a _ as prefix is way too typical. We need something much less likely to collide. The longer the better; a 3 letter combination like zep_ would be quite good.
ok, having a distinct prefix for private/internal stuff that starts with _ might actually help with deviation from MISRA, so we could do this:
option 1:
- leave internal/private namespace as is
- use a new prefix for new public interfaces (zep_) and migrate existing code to this prefix over time.
pros:
- we can start immediately
cons:
- We are left with two prefixes that might look like being public (z_ and zep_)
option 2:
- use _ for internal/private, i.e. use _z_. This will make it clear this API is internal; there will not be confusion about this z_ being public when used side by side with what we have below
- use a new prefix for new public interfaces (zep_ or zeph_ or zephyr_ or ...) and migrate existing code to this prefix over time.
pros:
- less chance of mistakes and confusion between public and private
- we can start using this model immediately
cons:
- requires mass rename of internal/private identifiers
option 3:
- use _ for internal/private, i.e. use _z_. This will make it clear this API is internal; there will not be confusion about this z_ being public when operating side by side with what we have below
- use a z_ prefix for new public interfaces and migrate existing code to this prefix over time.
cons:
- use of z_ for public will be delayed until the mass rename is complete
- _z_ vs z_ can be confusing; z_ is also short enough to conflict with others, see https://github.com/eclipse-zenoh/zenoh-pico/blob/main/include/zenoh-pico/net/subscribe.h for example.
pros: can't think of any
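For the "migrate existing code to this prefix over time" part shared by the options above, one possible shape is a thin forwarding wrapper that keeps the old name alive while callers move. This is only a sketch under assumptions: zep_ring_next, z_ring_next, ZEP_LEGACY and the ZEP_WARN_LEGACY_NAMES switch are all made up for illustration and do not exist in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opt-in switch: only warn about old names when asked to,
 * so existing builds keep compiling cleanly during the transition.
 */
#if defined(ZEP_WARN_LEGACY_NAMES) && (defined(__GNUC__) || defined(__clang__))
#define ZEP_LEGACY __attribute__((deprecated))
#else
#define ZEP_LEGACY
#endif

/* New public name (hypothetical): advance a ring index with wraparound. */
static inline uint32_t zep_ring_next(uint32_t idx, uint32_t size)
{
	assert(size != 0U);
	return (idx + 1U) % size;
}

/* Old name kept as a forwarding wrapper until callers have migrated. */
ZEP_LEGACY static inline uint32_t z_ring_next(uint32_t idx, uint32_t size)
{
	return zep_ring_next(idx, size);
}
```

With this shape both names behave identically, and the old one can be removed once the warning has been enabled for long enough.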
zep_ does not directly relate to zephyr, phonetically speaking, where the 'p' has no meaning without the 'h'.
zeph_ or zphr_, in that regard, are much better. (4 letters vs 3 is imho not an issue at all)
I would go option 2, if mass-renaming can be obtained easily via coccinelle for instance. Or else, we are going to live with 2 systems for a long long time.
I am still confused as to what is going on here.
Where did the idea or requirement come from that the internal and user APIs should be distinguished by prefix?
Why is this better than putting user headers in specific folders like uapi and not prefixing internal API at all?
What else can we do besides mass renaming to really challenge users?
(4 letters vs 3 is imho not an issue at all)
IMHO: 3 letters is more practical. Just try to pronounce 4 letters zphr ;)
I am still confused as to what is going on here. Where did the idea or requirement come from that the internal and user APIs should be distinguished by prefix? Why is this better than putting user headers in specific folders like uapi and not prefixing internal API at all? What else can we do besides mass renaming to really challenge users?
Having a common prefix is a good practice:
- act as a namespace mechanism to prevent or minimize name collisions.
- indicate which module or subsystem a function/variable belongs to.
- provide immediate context about what you're working with.
- contribute to a uniform coding style.
I am still confused as to what is going on here.
look at the introduction in this issue and linked issues.
zep_ does not directly relate to zephyr, phonetically speaking, where the 'p' has no meaning without the 'h'.
I agree with this. Both phonetically and on writing, "zep" does not immediately read "zephyr-related" to me.
zeph_ or zphr_, in that regard, are much better. (4 letters vs 3 is imho not an issue at all)
I like "zphr" in writing, but as pointed out by @butok it is very difficult to pronounce when talking about code.
I would go option 2, if mass-renaming can be obtained easily via coccinelle for instance. Or else, we are going to live with 2 systems for a long long time.
I prefer option 2 as well. We choose one prefix for public APIs and use the same prepended with a single underscore for private APIs.
I like "zphr" in writing, but as pointed out by @butok it is very difficult to pronounce when talking about code
Your mistake is thinking logic can be applied to phonetics of english words, when actually the language is all based on the spiritual vibes the combinations of the letters makes you feel. Therefore my vote on this bikeshed would be for zep_ over anything with an h. Because the abbreviations with h is making it seem like the zephyr wind is howling through around haunted mansion, while the zep is feeling like a cyberpunk airship gliding around the sky (potentially due to being the start of zeppelin, which we can rename the project to Zeppelin RTOS while we're at all this to be phonetically consistent too). This is a technical opinion based on objective facts. Source: am a monolingual English speaker.
Having a common prefix is a good practice:
- act as a namespace mechanism to prevent or minimize name collisions.
- indicate which module or subsystem a function/variable belongs to.
- provide immediate context about what you're working with.
- contribute to a uniform coding style.
I did not ask for best practice advice. Which subsystem in the project does not already have its own namespace prefix? This proposal will rename all C identifiers that already have their own namespaces, e.g. to zphr_sys_le16_to_cpu, for no good reason, and will end up being a huge pain for users.
C identifiers that already have their own namespaces, e.g. to zphr_sys_le16_to_cpu
Unrelated to the specific point that's being debated here, I think we should consider skipping the intermediate namespace for the existing "general purpose" namespaces like sys_* and k_* and just put those directly under the new common namespace, in order to keep the names short & simple (e.g. zep_le16_to_cpu() or zep_thread_create())
Your mistake is thinking logic can be applied to phonetics of english words, when actually the language is all based on the spiritual vibes the combinations of the letters makes you feel. Therefore my vote on this bikeshed would be for zep_ over anything with an h. Because the abbreviations with h is making it seem like the zephyr wind is howling through around haunted mansion, while the zep is feeling like a cyberpunk airship gliding around the sky (potentially due to being the start of zeppelin, which we can rename the project to Zeppelin RTOS while we're at all this to be phonetically consistent too). This is a technical opinion based on objective facts. Source: am a monolingual English speaker.
:D Not sure magic fumes would apply on naming things though, as a general rule I mean. (Just to continue on that tone, in French zep has a meaning - it's an anagram - and does not have a great connotation. Should we check every language actually?)
(way) More seriously, zeph is a classic short version of zephyr. The prefix would be kind of obvious. I am surprised it did not land in the list of possibilities.
@jfischer-no
Which subsystem in the project does not already have its own namespace prefix?
The majority of sys_util.h is probably the worst. The main problem I think we need to fix is that we need to stop adding new symbols with way too common names that then force users to change their current code.
This proposal will rename all C identifiers that already have their own namespaces, e.g. to zphr_sys_le16_to_cpu, for no good reason, and will end up being a huge pain for users.
I fret the idea of such a wholesale rename, and I would only consider it acceptable if we are sure we will stop this problem once and for all.
But honestly given the pain involved, I'd be happier to grandfather all the old APIs, and say that whatever we agree would apply only to new APIs, or even only for new APIs in areas which do not have any prefix.
Especially for areas we have already prefixed reasonably, I think it would be reasonable to keep using those prefixes (k_ and sys_).
I like zephyr_ best. It is by far the clearest prefix and is only 6 letters. Most editors should be able to autocomplete or tab complete it anyway. It is also what you type every time you actually talk about the project.
This proposal will rename all C identifiers that already have their own namespaces, e.g. to zphr_sys_le16_to_cpu, for no good reason, and will end up being a huge pain for users.
This is not being exactly proposed here. The proposal is: "Proposal is to pick some prefix and enforce it for all new public C identifiers, over time, (scale of years) migrate all APIs to this prefix."
As discussed yesterday, the migration part after picking a prefix and using it for new identifiers is TBD and will be done on a case-by-case basis; there will be no mass prefixing of APIs.
This proposal will rename all C identifiers that already have their own namespaces, e.g. to zphr_sys_le16_to_cpu, for no good reason, and will end up being a huge pain for users.
I fret the idea of such a wholesale rename, and I would only consider it acceptable if we are sure we will stop this problem once and for all. But honestly given the pain involved, I'd be happier to grandfather all the old APIs, and say that whatever we agree would apply only to new APIs, or even only for new APIs in areas which do not have any prefix. Especially for areas we have already prefixed reasonably, I think it would be reasonable to keep using those prefixes (k_ and sys_).
Yes, agree. If at some point we decide we need to overhaul k_ or whatever, moving to the new prefix would be considered as an option, so the move would be part of a larger overhaul and not the only goal in the case of established subsystems with existing prefixes.
This proposal is not about mass migration and renames, it is a corrective measure to ensure we do not keep digging ourselves deeper in the mud.
I like zephyr_ best. It is by far the clearest prefix and is only 6 letters. Most editors should be able to autocomplete or tab complete it anyway. It is also what you type every time you actually talk about the project.
we all like Zephyr :-)
I am fine with that as well, and linguistically it is probably the least controversial; it's not like there are many things called Zephyr out there that might collide ;-)
Just FYI, I've reopened a PR for count_bits using the zephyr_ prefix: https://github.com/zephyrproject-rtos/zephyr/pull/87943
The PR can be used to help determine the final choice, and probably shouldn't get merged until there's a conclusion here (although it can fairly easily be renamed until the next release if merged)